WO2021237475A1 - Image encoding and decoding method and device - Google Patents

Image encoding and decoding method and device

Info

Publication number
WO2021237475A1
WO2021237475A1 (PCT/CN2020/092408)
Authority
WO
WIPO (PCT)
Prior art keywords
image
layer
frame
code stream
enhancement layer
Prior art date
Application number
PCT/CN2020/092408
Other languages
English (en)
French (fr)
Inventor
张怡轩
陈绍林
孟琳
冯俊凯
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2020/092408 (WO2021237475A1)
Priority to CN202080101374.6A (CN115699745A)
Priority to EP20937790.2A (EP4156686A4)
Publication of WO2021237475A1
Priority to US17/993,533 (US20230103928A1)


Classifications

    • All classifications fall under H04N 19/00 — methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
    • H04N 19/164 — adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding: feedback from the receiver or from the transmission channel
    • H04N 19/172 — adaptive coding characterised by the coding unit, the unit being a picture, frame or field
    • H04N 19/105 — selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/117 — filters, e.g. for pre-processing or post-processing
    • H04N 19/187 — adaptive coding characterised by the coding unit, the unit being a scalable video layer
    • H04N 19/30 — hierarchical techniques, e.g. scalability
    • H04N 19/463 — embedding additional information in the video signal by compressing encoding parameters before transmission
    • H04N 19/52 — processing of motion vectors by predictive encoding

Definitions

  • This application relates to image coding and decoding technologies, and in particular, to an image coding and decoding method and device.
  • Wireless projection technology refers to encoding and compressing video data generated by a device with strong processing capability (for example, a game image rendered by a graphics processing unit (GPU)) and then wirelessly sending it to a device with weaker processing capability but good display capability (for example, a television or a virtual reality (VR) headset).
  • Applications that use wireless projection technology, such as game projection and VR glasses, are interactive and therefore require extremely low transmission delay.
  • Anti-interference capability is also an important requirement for such applications.
  • In addition, the larger the amount of data, the greater the transmission power consumption, so improving video compression efficiency to reduce transmission power consumption is also important.
  • The scalable video coding (SVC) protocol encodes each image frame in a source video into multiple image layers; the layers correspond to different qualities or resolutions and have reference relationships among them.
  • The data is transmitted in sequence starting from the base layer, from the lower-quality/smaller-resolution image layers to the higher-quality/larger-resolution image layers.
  • The more image-layer data of a frame the decoder receives, the better the quality of the reconstructed image.
  • This technology makes it easier to match the transmitted bit rate to changing bandwidth without switching code streams, avoiding the delay caused by stream switching.
  • This application provides an image encoding and decoding method and device to improve the quality or resolution of the current image frame.
  • This application provides an image encoding method, including: acquiring an image to be encoded, where the image to be encoded is divided into a base layer and at least one enhancement layer; when feedback information sent by a decoding end is received, determining the reconstructed image corresponding to the frame sequence number and layer sequence number indicated in the feedback information as a first reference frame, and performing inter-frame encoding on the base layer according to the first reference frame to obtain a code stream of the base layer; encoding the at least one enhancement layer separately to obtain a code stream of the at least one enhancement layer; and sending the code stream of the base layer and the code stream of the at least one enhancement layer to the decoding end, where the code stream of the base layer carries encoding reference information including the frame sequence number and the layer sequence number of the first reference frame.
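The reference-selection step above can be sketched as follows. This is an illustrative sketch, not the patent's implementation; all class and field names (`Feedback`, `recon_buffer`, etc.) are assumptions, and the "images" are placeholder strings.

```python
# Sketch: an encoder keeps reconstructed images keyed by (frame, layer) and,
# when decoder feedback names a (frame_seq, layer_seq), uses that reconstruction
# as the base layer's first reference frame.
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Feedback:
    frame_seq: int   # frame sequence number fed back by the decoder
    layer_seq: int   # highest layer the decoder decoded/received for that frame


class Encoder:
    def __init__(self) -> None:
        # (frame_seq, layer_seq) -> reconstructed image (placeholder object)
        self.recon_buffer: Dict[Tuple[int, int], object] = {}

    def cache_reconstruction(self, frame_seq: int, layer_seq: int, image: object) -> None:
        self.recon_buffer[(frame_seq, layer_seq)] = image

    def select_base_reference(self, feedback: Optional[Feedback]) -> Optional[object]:
        """Return the reconstruction named by the feedback, or None when no
        (valid) feedback is available and a fallback path must be taken."""
        if feedback is None:
            return None
        return self.recon_buffer.get((feedback.frame_seq, feedback.layer_seq))


enc = Encoder()
enc.cache_reconstruction(frame_seq=7, layer_seq=0, image="recon_7_base")
enc.cache_reconstruction(frame_seq=7, layer_seq=2, image="recon_7_enh2")
ref = enc.select_base_reference(Feedback(frame_seq=7, layer_seq=2))
print(ref)  # the enhancement-layer reconstruction, not only the base layer's
```

Note the key point of the scheme: the base layer of the current frame may reference an enhancement-layer reconstruction of an earlier frame, which related-art SVC does not allow.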
  • In the related art, the base layer can only refer to the reconstructed image corresponding to the base layer of the previous n-th frame, where n is a positive integer greater than or equal to 1; here, the previous n-th frame refers to some frame before the image to be encoded.
  • For any enhancement layer, the quality or resolution of its corresponding reconstructed image is higher than that of the reconstructed image corresponding to the base layer.
  • However, in the related art the reconstructed image corresponding to any enhancement layer cannot be used as a reference frame for the base layer. This lowers the quality of the code stream produced by base-layer encoding, the quality or resolution of images reconstructed from that code stream, and even the quality or resolution of the reconstructed image obtained by the decoding end.
  • In this application, based on the feedback information from the decoding end, the encoding end determines the image layer with the highest quality or resolution that the decoding end can obtain, and uses the reconstructed image corresponding to that image layer as the reference frame of the base layer.
  • In other words, the image referenced for inter-frame encoding is the reconstructed image corresponding to the highest-quality or highest-resolution image layer of the previous n-th frame that the decoding end has successfully decoded, has received, or will decode.
  • This image layer is also the highest-level image layer, fed back by the decoding end, that meets the network transmission status and bit-rate requirements.
  • The encoding end uses the reconstructed image corresponding to such an image layer as a reference frame to perform inter-frame encoding on the base layer, which improves the quality of the base-layer code stream, the quality or resolution of images reconstructed from it, and even the quality or resolution of the reconstructed image obtained by the decoding end from the base-layer code stream, thereby improving the quality or resolution of the current image frame as a whole.
  • In addition, the decoding end can send feedback for each frame or each sub-image, which avoids error propagation and improves image quality; it also avoids periodically inserting intra-coded frames, thereby reducing the bit rate.
  • In a possible implementation, the image to be encoded is an entire frame or one of the sub-images of an entire frame.
  • In a possible implementation, when the image to be encoded is a sub-image of an entire frame, the feedback information further includes position information indicating the position of the sub-image in the entire frame.
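A minimal sketch of the feedback message just described. The field names and the `(x, y)` encoding of the position are assumptions for illustration; the patent does not specify a wire format.

```python
# Sketch: a feedback record carrying the frame and layer sequence numbers,
# plus optional position info that is present only for sub-images.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FeedbackInfo:
    frame_seq: int                                # which previous frame is referenced
    layer_seq: int                                # highest usable layer of that frame
    position: Optional[Tuple[int, int]] = None    # sub-image position (assumed (x, y))


def make_feedback(frame_seq: int, layer_seq: int,
                  sub_image_pos: Optional[Tuple[int, int]] = None) -> FeedbackInfo:
    # Position info is carried only when the coded unit is a sub-image.
    return FeedbackInfo(frame_seq, layer_seq, sub_image_pos)


whole_frame = make_feedback(10, 1)
sub_image = make_feedback(10, 1, sub_image_pos=(128, 0))
print(whole_frame.position is None, sub_image.position)
```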
  • In a possible implementation, the frame sequence number indicates the previous n-th frame of the image to be encoded, where n is a positive integer; the layer sequence number corresponds to an image layer, fed back by the decoding end, of that previous n-th frame.
  • In addition, the feedback from the decoding end usually also reflects the network transmission status, that is, which image layers' transmission demand and bit rate the current network can satisfy.
  • The encoding end uses the reconstructed image corresponding to such an image layer as a reference frame to perform inter-frame encoding on the base layer, which provides a good reference for the relevant areas of the image to be encoded (such as static areas), improves the quality of the base-layer code stream and the quality or resolution of images reconstructed from it, and even the quality or resolution of the reconstructed image obtained by the decoding end from the base-layer code stream, thereby improving the quality or resolution of the current image frame as a whole.
  • In a possible implementation, the method further includes: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, performing inter-frame encoding on the base layer according to a third reference frame, where the third reference frame is the reference frame of the base layer of the frame preceding the image to be encoded.
  • In a possible implementation, the method further includes: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, performing intra-frame encoding on the base layer.
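The two fallback rules above can be sketched as a single decision function. This is an illustrative sketch of the decision order under stated assumptions; the function and parameter names are invented, and which fallback an encoder actually uses is an implementation choice.

```python
# Sketch: pick the base-layer coding mode and reference.
#   1. valid feedback          -> inter-code against the fed-back reconstruction
#   2. no/failed feedback      -> inter-code against the previous frame's
#                                 base-layer reference ("third reference frame")
#   3. no usable reference     -> intra-frame coding
from typing import Optional, Tuple


def choose_base_layer_mode(feedback_ok: bool,
                           fed_back_ref: Optional[str],
                           prev_frame_base_ref: Optional[str],
                           allow_inter_fallback: bool = True) -> Tuple[str, Optional[str]]:
    """Return (mode, reference) for encoding the base layer."""
    if feedback_ok and fed_back_ref is not None:
        return ("inter", fed_back_ref)            # first reference frame
    if allow_inter_fallback and prev_frame_base_ref is not None:
        return ("inter", prev_frame_base_ref)     # third reference frame
    return ("intra", None)                        # last resort: intra coding


print(choose_base_layer_mode(False, None, "prev_base_ref"))
```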
  • In a possible implementation, encoding the at least one enhancement layer separately to obtain its code stream includes: performing inter-frame encoding on a first enhancement layer according to a second reference frame to obtain the code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.
  • In the related art, an enhancement layer must refer simultaneously to the reconstructed image corresponding to the same-level image layer of the previous n-th frame and to the reconstructed image corresponding to the lower image layer of the same frame.
  • The former provides a reference for the relevant areas to be encoded, such as static areas.
  • However, processing two reference frames increases the amount of computation.
  • Furthermore, because the reference frames of the enhancement layer can only be these two reconstructed images, the quality or resolution of the enhancement layer is limited.
  • In this application, the image referenced by base-layer encoding is the reconstructed image corresponding to the highest-quality or highest-resolution image layer of the previous n-th frame that the decoding end has successfully decoded, has received, or will decode.
  • This improves the quality or resolution of the base layer, and in turn the quality of the code stream produced by any enhancement layer that references the base layer, the quality or resolution of images reconstructed from that code stream, and even the quality or resolution of the reconstructed image obtained by the decoding end.
  • Since each enhancement layer itself directly or indirectly references the base layer, the quality of the enhancement-layer code streams, and of the images reconstructed from them, is likewise improved.
  • In addition, because base-layer encoding already provides a good reference for the relevant areas to be encoded (such as static areas), a higher image layer of the same frame can use a lower image layer as its reference frame, which further provides a reference for occluded areas and ultimately improves the quality or resolution of the higher layer.
  • Moreover, the enhancement layer refers only to the reconstructed image corresponding to the lower image layer of the same frame, which reduces the amount of computation.
  • In a possible implementation, the first image layer is the image layer one level lower than the first enhancement layer; or, the first image layer is the base layer.
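The same-frame reference rule above reduces to a one-line mapping. A minimal sketch, assuming layers are indexed from 0 (base layer) upward; the function name and the boolean switch are illustrative.

```python
# Sketch: enhancement layer k of the current frame references a lower layer of
# the SAME frame -- either layer k-1 (one level lower) or layer 0 (the base layer).


def enhancement_reference_layer(layer: int, reference_base_only: bool = False) -> int:
    """Return the index of the same-frame layer used as the second reference frame."""
    if layer <= 0:
        raise ValueError("layer 0 is the base layer, not an enhancement layer")
    return 0 if reference_base_only else layer - 1


print(enhancement_reference_layer(3))                            # 2 (one level lower)
print(enhancement_reference_layer(3, reference_base_only=True))  # 0 (the base layer)
```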
  • In a possible implementation, the base layer and low-level enhancement layers use a low-rate modulation and coding scheme (MCS) so that user equipment with poor channel conditions can obtain basic video service, while high-level enhancement layers use a high-rate MCS so that user equipment with good channel conditions obtains higher-quality, higher-resolution video service.
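The per-layer MCS split can be sketched as a threshold mapping. This is an assumption-laden illustration: the threshold value and the MCS class labels are invented, and real systems choose MCS indices from standardized tables rather than two classes.

```python
# Sketch: lower layers get a robust low-rate MCS so weak-channel receivers can
# still decode basic video; higher enhancement layers get a high-rate MCS.


def assign_mcs(layer: int, low_rate_max_layer: int = 1) -> str:
    """Map an image-layer index to an (illustrative) MCS class."""
    return "low-rate-MCS" if layer <= low_rate_max_layer else "high-rate-MCS"


print([assign_mcs(k) for k in range(4)])
# ['low-rate-MCS', 'low-rate-MCS', 'high-rate-MCS', 'high-rate-MCS']
```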
  • In a possible implementation, the method further includes: buffering the reconstructed images respectively corresponding to the base layer and the at least one enhancement layer.
  • In a possible implementation, before determining the reconstructed image corresponding to the frame sequence number and layer sequence number indicated in the feedback information as the first reference frame when the feedback information sent by the decoding end is received, the method further includes: monitoring for the feedback information within a set time period; and if the feedback information is received within the set time period, determining that the feedback information is received.
  • If the encoding end does not receive the feedback information within the set time period, the feedback information is considered not received, and the encoding end does not continue to monitor. On the one hand, this avoids unnecessary waiting and reduces resource consumption; on the other hand, it avoids treating invalid feedback information received later as useful information, which would cause the encoding end to choose the wrong reference frame.
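The monitoring window can be sketched as a simple predicate. An illustrative sketch only: the millisecond timing values and the function name are assumptions, and a real encoder would tie the window to its frame pacing.

```python
# Sketch: feedback counts as received only if it arrives inside the set time
# period; late feedback is discarded so it is never mistaken for valid
# reference information.
from typing import Optional


def feedback_received(arrival_ms: Optional[float], window_ms: float) -> bool:
    """True only when feedback arrived and did so within the monitoring window."""
    return arrival_ms is not None and arrival_ms <= window_ms


print(feedback_received(8.0, 16.0))   # within the window -> use the feedback
print(feedback_received(20.0, 16.0))  # late -> treated as not received
print(feedback_received(None, 16.0))  # never arrived -> fall back
```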
  • The present application further provides an image decoding method, including: receiving, from an encoding end, a code stream of a base layer of an image to be decoded and a code stream of at least one enhancement layer, where the code stream of the base layer carries encoding reference information including a first frame sequence number and a first layer sequence number; determining a first reference frame according to the first frame sequence number and the first layer sequence number, and performing inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain the reconstructed image corresponding to the base layer; decoding the code stream of the at least one enhancement layer to obtain the reconstructed images respectively corresponding to the at least one enhancement layer; and sending feedback information to the encoding end, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded.
  • In a possible implementation, the image to be decoded is an entire frame or one of the sub-images of an entire frame.
  • In a possible implementation, when the image to be decoded is a sub-image of an entire frame, the feedback information further includes position information indicating the position of the image to be decoded in the entire frame.
  • In a possible implementation, that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code stream of the at least one enhancement layer; or, the second layer sequence number corresponds to the image layer with the highest quality or resolution whose code stream, among the code stream of the base layer and the code stream of the at least one enhancement layer, was successfully received; or, the second layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded from the code stream of the base layer and the code stream of the at least one enhancement layer.
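The three variants above all reduce to "take the highest layer whose status is acceptable". A minimal sketch, assuming per-layer status strings; the status vocabulary (`decoded`, `received`, `pending`, `failed`) is invented for illustration.

```python
# Sketch: the decoding end picks the second layer sequence number as the
# highest layer whose code stream was successfully decoded (or received, or
# scheduled for decoding, depending on the variant in use).
from typing import Dict, Optional, Tuple


def highest_layer(status: Dict[int, str],
                  accept: Tuple[str, ...] = ("decoded",)) -> Optional[int]:
    """status maps layer index -> 'decoded' | 'received' | 'pending' | 'failed'."""
    ok = [layer for layer, s in status.items() if s in accept]
    return max(ok) if ok else None


frame_status = {0: "decoded", 1: "decoded", 2: "received", 3: "failed"}
print(highest_layer(frame_status))                                  # 1
print(highest_layer(frame_status, accept=("decoded", "received")))  # 2
```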
  • In a possible implementation, when reception of both the code stream of the base layer and the code stream of the at least one enhancement layer fails, the feedback information includes identification information indicating the reception failure; or, when decoding of the code stream of the base layer and/or the code stream of the at least one enhancement layer fails, the feedback information includes identification information indicating the decoding failure.
  • In a possible implementation, the method further includes: obtaining the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.
  • In a possible implementation, decoding the code stream of the at least one enhancement layer to obtain the corresponding reconstructed images includes: performing inter-frame decoding on the code stream of a first enhancement layer according to a second reference frame to obtain the reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.
  • In a possible implementation, the first image layer is the image layer one level lower than the first enhancement layer; or, the first image layer is the base layer.
  • In a possible implementation, when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that have been successfully decoded, are about to be decoded, or have been successfully received, the reconstructed images corresponding to all of those image layers are buffered; or, when the feedback information includes only the frame sequence number and layer sequence number of the image layer with the highest quality or resolution among those successfully decoded, about to be decoded, or successfully received, only the reconstructed image corresponding to that image layer is buffered.
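The two buffering policies above can be sketched in a few lines. An illustrative sketch only; the function name and list representation are assumptions, and a real decoder would hold actual picture buffers rather than layer indices.

```python
# Sketch: decide which layers' reconstructed images to keep.
#   keep_all=True  -> cache the reconstruction of every fed-back layer
#   keep_all=False -> cache only the single highest-quality layer
from typing import List


def layers_to_buffer(fed_back_layers: List[int], keep_all: bool) -> List[int]:
    """Return which layers' reconstructed images to keep in the cache."""
    if not fed_back_layers:
        return []
    return sorted(fed_back_layers) if keep_all else [max(fed_back_layers)]


print(layers_to_buffer([0, 1, 2], keep_all=True))   # [0, 1, 2]
print(layers_to_buffer([0, 1, 2], keep_all=False))  # [2]
```

The trade-off sketched here is memory versus flexibility: buffering every layer lets the encoder pick any fed-back reconstruction as a reference, while buffering only the best layer minimizes decoder memory.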
  • In a possible implementation, after receiving the code stream of the base layer of the image to be decoded and the code stream of the at least one enhancement layer from the encoding end, the method further includes: when the code stream of the base layer and/or the code stream of the at least one enhancement layer includes coding-mode indication information, decoding the corresponding image layer in the manner indicated by the coding-mode indication information, where the indicated manner includes intra-frame decoding or inter-frame decoding.
  • The present application further provides an encoding device, including: a receiving module, configured to acquire an image to be encoded, where the image to be encoded is divided into a base layer and at least one enhancement layer; an encoding module, configured to, when feedback information sent by a decoding end is received, determine the reconstructed image corresponding to the frame sequence number and layer sequence number indicated in the feedback information as a first reference frame, perform inter-frame encoding on the base layer according to the first reference frame to obtain a code stream of the base layer, and encode the at least one enhancement layer separately to obtain a code stream of the at least one enhancement layer; and a sending module, configured to send the code stream of the base layer and the code stream of the at least one enhancement layer to the decoding end, where the code stream of the base layer carries encoding reference information including the frame sequence number and the layer sequence number of the first reference frame.
  • In a possible implementation, the image to be encoded is an entire frame or one of the sub-images of an entire frame.
  • In a possible implementation, when the image to be encoded is a sub-image of an entire frame, the feedback information further includes position information indicating the position of the sub-image in the entire frame.
  • In a possible implementation, the frame sequence number indicates the previous n-th frame of the image to be encoded, where n is a positive integer; the layer sequence number corresponds to an image layer, fed back by the decoding end, of that previous n-th frame.
  • In a possible implementation, the processing module is further configured to: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, perform inter-frame encoding on the base layer according to a third reference frame, where the third reference frame is the reference frame of the base layer of the frame preceding the image to be encoded.
  • In a possible implementation, the processing module is further configured to perform intra-frame encoding on the base layer when the feedback information is not received or the feedback information includes identification information indicating reception failure or decoding failure.
  • In a possible implementation, the encoding module is specifically configured to perform inter-frame encoding on a first enhancement layer according to a second reference frame to obtain the code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.
  • In a possible implementation, the first image layer is the image layer one level lower than the first enhancement layer; or, the first image layer is the base layer.
  • In a possible implementation, the device further includes: a processing module, configured to buffer the reconstructed images respectively corresponding to the base layer and the at least one enhancement layer.
  • In a possible implementation, the processing module is further configured to: monitor for the feedback information within a set time period; and if the feedback information is received within the set time period, determine that the feedback information is received.
  • The present application further provides a decoding device, including: a receiving module, configured to receive, from an encoding end, a code stream of a base layer of an image to be decoded and a code stream of at least one enhancement layer, where the code stream of the base layer carries encoding reference information including a first frame sequence number and a first layer sequence number; a decoding module, configured to determine a first reference frame according to the first frame sequence number and the first layer sequence number, perform inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain the reconstructed image corresponding to the base layer, and decode the code stream of the at least one enhancement layer to obtain the reconstructed images respectively corresponding to the at least one enhancement layer; and a sending module, configured to send feedback information to the encoding end, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded.
  • In a possible implementation, the image to be decoded is an entire frame or one of the sub-images of an entire frame.
  • In a possible implementation, when the image to be decoded is a sub-image of an entire frame, the feedback information further includes position information indicating the position of the image to be decoded in the entire frame.
  • In a possible implementation, that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code stream of the at least one enhancement layer; or, the second layer sequence number corresponds to the image layer with the highest quality or resolution whose code stream, among the code stream of the base layer and the code stream of the at least one enhancement layer, was successfully received; or, the second layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded from the code stream of the base layer and the code stream of the at least one enhancement layer.
  • when reception of both the code stream of the base layer and the code stream of the at least one enhancement layer fails, the feedback information includes identification information indicating the reception failure; or, when decoding of the code stream of the base layer and/or the code stream of the at least one enhancement layer fails, the feedback information includes identification information indicating the decoding failure.
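The feedback fields described above can be sketched as a small data structure; the class, field, and function names here are illustrative assumptions, not the patent's actual message format:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FeedbackInfo:
    frame_seq: int                  # second frame sequence number (image to be decoded)
    layer_seq: int                  # second layer sequence number (highest-quality layer)
    position: Optional[Tuple[int, int]] = None  # set when the decoded image is a sub-image
    rx_failed: bool = False         # reception of all code streams failed
    decode_failed: bool = False     # decoding of a code stream failed

def make_feedback(frame_seq, decoded_layers, position=None):
    """Build feedback reporting the highest successfully decoded layer,
    or a reception-failure flag when nothing was decoded."""
    if not decoded_layers:
        return FeedbackInfo(frame_seq, -1, position, rx_failed=True)
    return FeedbackInfo(frame_seq, max(decoded_layers), position)
```

For example, a decoder that decoded layers 0, 1, and 2 of frame 7 would report layer 2 as its second layer sequence number.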
  • the decoding module is further configured to obtain the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed image corresponding to the at least one enhancement layer.
  • the decoding module is specifically configured to perform inter-frame decoding on the code stream of a first enhancement layer according to a second reference frame to obtain a reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.
  • the first image layer is an image layer one layer lower than the first enhancement layer; or, the first image layer is the base layer.
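The reference-layer rule above (an enhancement layer referencing either the layer one below it or the base layer) can be illustrated as follows; the function name and zero-based layer indexing are assumptions for illustration:

```python
def reference_layer(enh_layer: int, use_base: bool = False) -> int:
    """Index of the layer whose reconstruction serves as the second
    reference frame for enhancement layer `enh_layer` (layer 0 = base layer)."""
    if enh_layer < 1:
        raise ValueError("layer 0 is the base layer, not an enhancement layer")
    return 0 if use_base else enh_layer - 1
```

So enhancement layer 3 references the reconstruction of layer 2 by default, or layer 0 when the base layer is chosen as the first image layer.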
  • it further includes a processing module, configured to: when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that have been successfully decoded, are about to be decoded, or have been successfully received, buffer the reconstructed images corresponding to all of those image layers; or, when the feedback information includes the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that has been successfully decoded, is about to be decoded, or has been successfully received, buffer the reconstructed image corresponding to that image layer.
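The two caching policies just described can be sketched as follows, assuming reconstructions are kept in a dictionary keyed by layer sequence number (an illustrative choice, not the patent's data structure):

```python
def cache_reconstructions(recons, feedback_all_layers):
    """recons maps layer_seq -> reconstructed image (any object).
    Keep every layer's reconstruction, or only the highest layer's,
    depending on which feedback mode is in use."""
    if feedback_all_layers:
        return dict(recons)            # buffer reconstructions of all layers
    top = max(recons)                  # image layer with highest quality/resolution
    return {top: recons[top]}          # buffer only that layer's reconstruction
```

The second mode trades reference-frame flexibility for a smaller buffer: only the single highest-quality reconstruction is retained.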
  • the decoding module is further configured to: when the code stream of the base layer and/or the code stream of the at least one enhancement layer includes coding mode indication information, decode the corresponding image layer in the manner indicated by that information, where the indicated manner includes intra-frame decoding or inter-frame decoding.
  • the present application provides an encoder, including: a processor and a transmission interface;
  • the processor is configured to call program instructions stored in the memory to implement the method according to any one of the above-mentioned first aspects.
  • the present application provides a decoder, including: a processor and a transmission interface;
  • the processor is configured to call program instructions stored in the memory to implement the method according to any one of the above-mentioned second aspects.
  • the present application provides a computer-readable storage medium comprising a computer program that, when executed on a computer or a processor, causes the computer or the processor to execute the method according to any one of the first aspect or the second aspect.
  • this application also provides a computer program product.
  • the computer program product includes computer program code.
  • when the computer program code runs on a computer or a processor, the computer or the processor executes the method according to any one of the first aspect or the second aspect.
  • FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 used to implement an embodiment of the present application;
  • FIG. 1B is a block diagram of an example of a video decoding system 40 used to implement an embodiment of the present application
  • FIG. 2 is a flowchart of an embodiment of an image coding method according to this application.
  • FIG. 3 is a flowchart of an embodiment of an image decoding method according to this application.
  • Figure 4 shows an exemplary schematic diagram of an image encoding and decoding process
  • Fig. 5 shows an exemplary schematic diagram of image layered coding and decoding
  • Fig. 6 shows an exemplary schematic diagram of the encoding process of the encoding end
  • Fig. 7 shows an exemplary schematic diagram of the decoding process of the decoding end
  • Fig. 8 shows an exemplary schematic diagram of the image coding method of the present application
  • FIG. 9 is a schematic structural diagram of an embodiment of an encoding device of this application.
  • FIG. 10 is a schematic structural diagram of an embodiment of a decoding device of this application.
  • At least one (item) refers to one or more, and “multiple” refers to two or more.
  • “And/or” describes an association relationship between associated objects and indicates that three relationships are possible. For example, “A and/or B” can mean: only A, only B, or both A and B, where A and B can be singular or plural. The character “/” generally indicates that the associated objects before and after it are in an “or” relationship. “At least one of the following items” or similar expressions refers to any combination of these items, including any combination of a single item or a plurality of items.
  • At least one of a, b, or c can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can each be single or multiple.
  • the technical solutions involved in the embodiments of this application may be applied not only to existing video coding standards (such as H.264/advanced video coding (AVC) and H.265/high efficiency video coding (HEVC)), but also to future video coding standards (such as the H.266/versatile video coding (VVC) standard).
  • AVC H.264/advanced video coding
  • HEVC high efficiency video coding
  • VVC versatile video coding
  • Video encoding is performed on the source side, and usually includes processing (for example, by compressing) the original video picture to reduce the amount of data required to represent the video picture, so as to store and/or transmit more efficiently.
  • Video decoding is performed on the destination side, and usually includes inverse processing relative to the encoder to reconstruct the video picture.
  • the “encoding” of video pictures involved in the embodiments should be understood as involving the “encoding” or “decoding” of the video sequence.
  • the combination of the encoding part and the decoding part is also called codec.
  • FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 used to implement an embodiment of the present application.
  • the video encoding and decoding system 10 may include a source device 12 and a destination device 14.
  • the source device 12 generates encoded video data. Therefore, the source device 12 may be referred to as a video encoding device.
  • the destination device 14 can decode the encoded video data generated by the source device 12, and therefore, the destination device 14 can be referred to as a video decoding device.
  • Various implementations of source device 12 or destination device 14 may include one or more processors and memory coupled to the one or more processors.
  • the memory may include, but is not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EPROM, EEPROM), flash memory
  • RAM random access memory
  • ROM read-only memory
  • EPROM electrically erasable programmable read-only memory
  • flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
  • the source device 12 and the destination device 14 may include various devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, wireless communication equipment, or the like.
  • although FIG. 1A shows the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both the source device 12 and the destination device 14, or the functionality of both, that is, the source device 12 or the corresponding functionality and the destination device 14 or the corresponding functionality.
  • the same hardware and/or software may be used, or separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or the corresponding functionality and the destination device 14 or the corresponding functionality.
  • the source device 12 and the destination device 14 can communicate with each other via a link 13, and the destination device 14 can receive encoded video data from the source device 12 via the link 13.
  • the link 13 may include one or more media or devices capable of moving encoded video data from the source device 12 to the destination device 14.
  • link 13 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
  • the source device 12 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol), and may transmit the modulated video data to the destination device 14.
  • the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • RF radio frequency
  • the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the Internet).
  • the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.
  • the source device 12 includes an encoder 20, and optionally, the source device 12 may also include a picture source 16, a picture preprocessor 18, and a communication interface 22.
  • the encoder 20, the picture source 16, the picture preprocessor 18, and the communication interface 22 may be hardware components in the source device 12, or may be software programs in the source device 12. They are described as follows:
  • the picture source 16, which may include or be any type of picture capture device used to capture a real-world picture, and/or any type of device for generating pictures or comments (for screen content encoding, some text on the screen is also considered part of the picture or image to be encoded), for example, a computer graphics processor for generating computer animation pictures, or any type of device for obtaining and/or providing real-world pictures or computer animation pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (for example, augmented reality (AR) pictures).
  • the picture source 16 may be a camera for capturing pictures or a memory for storing pictures.
  • the picture source 16 may also include any type (internal or external) interface for storing previously captured or generated pictures and/or acquiring or receiving pictures.
  • when the picture source 16 is a camera, the picture source 16 may be, for example, a local camera or a camera integrated in the source device; when the picture source 16 is a memory, the picture source 16 may be, for example, a local memory or a memory integrated in the source device.
  • the interface When the picture source 16 includes an interface, the interface may be, for example, an external interface for receiving pictures from an external video source.
  • the external video source is, for example, an external picture capture device such as a camera, an external memory, or an external picture generating device such as an external computer graphics processor, computer, or server.
  • the interface can be any type of interface based on any proprietary or standardized interface protocol, such as a wired or wireless interface, and an optical interface.
  • the picture preprocessor 18 is configured to receive the original picture data 17 and perform preprocessing on the original picture data 17 to obtain the preprocessed picture 19 or the preprocessed picture data 19.
  • the pre-processing performed by the picture pre-processor 18 may include trimming, color format conversion, toning, or denoising. It should be noted that performing preprocessing on the image data 17 is not a required processing procedure of this application, and this application does not specifically limit this.
  • the encoder 20 (or video encoder 20) is configured to receive the pre-processed picture data 19 and process it using a relevant prediction mode (such as the prediction modes in the various embodiments herein), thereby providing the encoded picture data 21.
  • the encoder 20 may be used to implement the various embodiments described below to realize the application of the image coding method described in this application on the coding side.
  • the communication interface 22 can be used to receive the encoded picture data 21, and can transmit the encoded picture data 21 to the destination device 14 or any other device (such as a memory) through the link 13 for storage or direct reconstruction.
  • the other device can be any device used for decoding or storage.
  • the communication interface 22 can be used, for example, to encapsulate the encoded picture data 21 into a suitable format, such as a data packet, for transmission on the link 13.
  • the destination device 14 includes a decoder 30, and optionally, the destination device 14 may also include a communication interface 28, a picture post processor 32, and a display device 34. They are described as follows:
  • the communication interface 28 can be used to receive the encoded picture data 21 from the source device 12 or any other source, for example, a storage device, and the storage device is, for example, an encoded picture data storage device.
  • the communication interface 28 can be used to transmit or receive the encoded picture data 21 via the link 13 between the source device 12 and the destination device 14 or via any type of network.
  • the link 13 is, for example, a direct wired or wireless connection, and the network is, for example, a wired or wireless network or any combination thereof, or any type of private or public network or any combination thereof.
  • the communication interface 28 may be used, for example, to decapsulate the data packet transmitted by the communication interface 22 to obtain the encoded picture data 21.
  • Both the communication interface 28 and the communication interface 22 can be configured as one-way or two-way communication interfaces, and can be used, for example, to send and receive messages to establish a connection, and to confirm and exchange any other information related to the communication link and/or to the transmission of data such as the encoded picture data.
  • the decoder 30 (or referred to as the decoder 30) is configured to receive the encoded picture data 21 and provide the decoded picture data 31 or the decoded picture 31.
  • the decoder 30 may be used to implement the various embodiments described below to realize the application of the image decoding method described in this application on the decoding side.
  • the picture post processor 32 is configured to perform post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33.
  • the post-processing performed by the picture post-processor 32 may include: color format conversion, toning, trimming or resampling, or any other processing, and may also be used to transmit the post-processed picture data 33 to the display device 34. It should be noted that performing post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) is not a required processing procedure of this application, and this application does not specifically limit this.
  • the display device 34 is configured to receive the post-processed picture data 33 to display the picture to, for example, a user or a viewer.
  • the display device 34 may be or may include any type of display for presenting reconstructed pictures, for example, an integrated or external display or monitor.
  • the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, liquid crystal on silicon (LCoS), a digital light processor (DLP), or any other type of display.
  • the source device 12 and the destination device 14 may include any of a variety of devices, including any types of handheld or stationary devices, such as notebook or laptop computers, mobile phones, smart phones, tablets or tablet computers, cameras, desktop computers , Set-top boxes, televisions, cameras, in-vehicle devices, display devices, digital media players, video game consoles, video streaming devices (such as content service servers or content distribution servers), broadcast receiver devices, broadcast transmitter devices, etc. , And can not use or use any type of operating system.
  • Both the encoder 20 and the decoder 30 can be implemented as any of various suitable circuits, for example, one or more microprocessors, digital signal processors (digital signal processors, DSP), and application-specific integrated circuits (application-specific integrated circuits). circuit, ASIC), field-programmable gate array (FPGA), discrete logic, hardware, or any combination thereof.
  • the device can store the software instructions in a suitable non-transitory computer-readable storage medium, and can use one or more processors to execute the instructions in hardware to perform the techniques of the present disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be regarded as one or more processors.
  • the video encoding and decoding system 10 shown in FIG. 1A is only an example, and the technology of the present application can be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices.
  • the data can be retrieved from local storage, streamed on the network, etc.
  • the video encoding device can encode data and store the data to the memory, and/or the video decoding device can retrieve the data from the memory and decode the data.
  • encoding and decoding are performed by devices that do not communicate with each other but only encode data to the memory and/or retrieve data from the memory and decode the data.
  • FIG. 1B is a block diagram of an example of a video decoding system 40 used to implement an embodiment of the present application.
  • the video decoding system 40 can implement a combination of various technologies in the embodiments of the present application.
  • the video coding system 40 may include an imaging device 41, an encoder 20, a decoder 30 (and/or a video encoder/decoder implemented by the logic circuit 47 of the processing unit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or display devices 45.
  • the imaging device 41, the antenna 42, the processing unit 46, the logic circuit 47, the encoder 20, the decoder 30, the processor 43, the memory 44, and/or the display device 45 can communicate with each other.
  • the encoder 20 and the decoder 30 are used to illustrate the video coding system 40, in different examples, the video coding system 40 may include only the encoder 20 or only the decoder 30.
  • antenna 42 may be used to transmit or receive an encoded bitstream of video data.
  • the display device 45 may be used to present video data.
  • the processing unit 46 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and so on.
  • the video decoding system 40 may also include an optional processor 43, and the optional processor 43 may similarly include ASIC logic, a graphics processor, a general-purpose processor, and the like.
  • the processing unit 46 may be implemented by hardware, such as dedicated hardware for video encoding, and the processor 43 may be implemented by general software, an operating system, and the like.
  • the memory 44 may be any type of memory, such as volatile memory (for example, static random access memory (SRAM) or dynamic random access memory (DRAM)) or non-volatile memory (for example, flash memory).
  • the memory 44 may be implemented by cache memory.
  • the logic circuit 47 may access the memory 44 (e.g., to implement an image buffer).
  • the logic circuit 47 and/or the processing unit 46 may include a memory (for example, a cache, etc.) for implementing an image buffer and the like.
  • the encoder 20 implemented by logic circuits may include an image buffer (e.g., implemented by the processing unit 46 or the memory 44) and a graphics processing unit (e.g., implemented by the processing unit 46).
  • the graphics processing unit may be communicatively coupled to the image buffer.
  • the graphics processing unit may include an encoder 20 implemented by a logic circuit 47 to implement various modules discussed in any other encoder system or subsystem described herein. Logic circuits can be used to perform the various operations discussed herein.
  • decoder 30 may be implemented by logic circuit 47 in a similar manner to implement the various modules discussed in any other decoder system or subsystem described herein.
  • the decoder 30 implemented by logic circuits may include an image buffer (implemented, for example, by the processing unit 46 or the memory 44) and a graphics processing unit (implemented, for example, by the processing unit 46).
  • the graphics processing unit may be communicatively coupled to the image buffer.
  • the graphics processing unit may include a decoder 30 implemented by a logic circuit 47 to implement the various modules discussed in any other decoder system or subsystem described herein.
  • antenna 42 may be used to receive an encoded bitstream of video data.
  • the encoded bitstream may include data related to the encoded video frame discussed herein, such as data related to coded partitions (e.g., transform coefficients or quantized transform coefficients), (as discussed) optional indicators, and/or data defining the coded partitions, as well as indicators, index values, mode selection data, and so on.
  • the video coding system 40 may also include a decoder 30 coupled to the antenna 42 and used to decode the encoded bitstream.
  • the display device 45 is used to present video frames.
  • the decoder 30 may be used to perform the reverse process.
  • the decoder 30 can be used to receive and parse such syntax elements, and decode related video data accordingly.
  • the encoder 20 may entropy encode the syntax elements into an encoded video bitstream. In such instances, decoder 30 can parse such syntax elements and decode related video data accordingly.
  • the encoder 20 and decoder 30 in the embodiments of the present application may be, for example, an encoder/decoder corresponding to video standard protocols such as H.263, H.264, HEVC, moving picture experts group (MPEG)-2, MPEG-4, VP8, and VP9, or next-generation video standard protocols (such as H.266).
  • MPEG moving picture experts group
  • Fig. 2 is a flowchart of an embodiment of an image coding method according to this application.
  • the process 200 may be performed by the encoder of the source device.
  • the process 200 is described as a series of steps or operations. It should be understood that the process 200 may be executed in various orders and/or occur simultaneously, and is not limited to the execution order shown in FIG. 2.
  • the method of this embodiment may include:
  • Step 201 Obtain an image to be encoded.
  • the image to be encoded is the entire frame image or one of the sub-images of the entire frame image. For details, please refer to the above-mentioned related description of the image frame, which will not be repeated here.
  • the image to be coded in this application is divided into a base layer and at least one enhancement layer, and the at least one enhancement layer is arranged in order of quality or resolution from low to high.
  • SVC Scalable Video Coding
  • the SVC protocol divides the image frame in the video into a base layer and multiple enhancement layers according to requirements.
  • the base layer provides users with the most basic image quality, frame rate, and resolution, while the enhancement layers refine the image quality and provide additional information such as image resolution, grayscale, and pixel values.
  • MCS modulation and coding schemes
  • using low-rate MCS for the base layer and low-level enhancement layers enables user equipment with poor channel conditions to obtain basic video services, while using high-rate MCS for the advanced enhancement layers enables user equipment with good channel conditions to obtain higher-quality, higher-resolution video services.
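As a hedged sketch of this per-layer MCS idea, the following assigns a low-rate (robust) MCS to low layers and a high-rate MCS to high enhancement layers; the concrete modulation/coding entries in the table are assumptions for illustration only:

```python
# Low-rate to high-rate entries; the concrete modulation/coding pairs
# are assumed for illustration, not taken from the patent.
MCS_TABLE = ["QPSK 1/2", "16QAM 1/2", "64QAM 3/4"]

def pick_mcs(layer_seq: int) -> str:
    """Base layer and low enhancement layers get robust low-rate MCS;
    higher enhancement layers get high-rate MCS (clamped to the table)."""
    return MCS_TABLE[min(layer_seq, len(MCS_TABLE) - 1)]
```

A receiver with a poor channel can then still demodulate the base layer's robust MCS even when the high-rate enhancement layers are lost.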
  • Step 202: When feedback information sent by the decoding end is received, determine the reconstructed image corresponding to the frame sequence number and layer sequence number indicated in the feedback information as the first reference frame, and perform inter-frame coding on the base layer according to the first reference frame to obtain the code stream of the base layer.
  • the feedback information is fed back to the encoding end based on the reception status of the code stream or the decoding status of the code stream when the decoding end receives the code stream from the encoding end.
  • when the encoding end is processing the current image (frame sequence number assumed to be m), the decoding end may be processing the nth previous frame (frame sequence number m-n): n being 1 means the decoding end may be processing the previous frame of the current image (frame sequence number m-1), n being 2 means the decoding end may be processing the frame two before the current image (frame sequence number m-2), and so on.
  • the feedback information sent to the encoding end by the decoding end may carry the information of the previous nth frame image, including its frame number (m-n) and layer number.
  • the encoding end and the decoding end may determine, through a pre-agreed or pre-set method, that the layer sequence number carried in the feedback information takes successful decoding as its criterion.
  • the layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded by the decoding end from the bitstream of the nth previous frame image (frame sequence number m-n).
  • alternatively, the encoding end and the decoding end may determine, through a pre-agreed or pre-set method, that the layer sequence number carried in the feedback information takes successful reception as its criterion.
  • the layer sequence number corresponds to the image layer with the highest quality or resolution successfully received by the decoding end in the bitstream of the nth previous frame image (frame sequence number m-n).
  • the decoding end can judge, according to the size of the received bitstream, how much decoding it can complete within a predetermined time, and from this judgment determine the image layers it will soon be able to decode. The layer sequence number then corresponds to the image layer with the highest quality or resolution, among the code streams of the nth previous frame image (frame sequence number m-n), that the decoding end has determined it is about to decode. That is, an about-to-be-decoded image layer is one whose code stream the decoding end can successfully decode within the predetermined time but has not yet decoded; after receiving the code stream, the decoding end makes this judgment by combining the size of the code stream with its own decoding capability.
  • in this case, the feedback information can be sent to the encoding end immediately; there is no need to wait until decoding succeeds before sending it.
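The "about to be decoded" judgment described above can be sketched as a budget check: given assumed per-layer bitstream sizes and an assumed decoding throughput (both illustrative numbers, not from the patent), find the highest layer whose cumulative decoding cost fits within the predetermined time:

```python
def highest_decodable_layer(sizes_bytes, throughput_bps, budget_s):
    """Highest layer whose cumulative bitstream (base layer first) can be
    decoded within budget_s at the assumed throughput; -1 if none fits."""
    budget_bytes = throughput_bps * budget_s / 8
    total, best = 0, -1
    for layer, size in enumerate(sizes_bytes):
        total += size
        if total > budget_bytes:
            break
        best = layer
    return best
```

For instance, with layer sizes of 1000, 2000, and 4000 bytes, a 64 kbit/s decoding throughput, and a 0.5 s budget (4000 bytes), only the base layer and the first enhancement layer fit, so layer 1 would be reported as about to be decoded.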
  • when the image to be encoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information indicates the position of the image to be encoded within the entire frame image.
  • for example, the entire frame image is 64×64 pixels and is divided into four non-overlapping 32×32 sub-images, located respectively at the upper left, upper right, lower left, and lower right of the entire frame image; the position information indicates which of these four sub-images the image to be encoded is.
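The 64×64 example above can be sketched as follows; the index-to-position mapping and function name are illustrative assumptions:

```python
# Order matches the example: upper-left, upper-right, lower-left, lower-right.
POSITIONS = ["upper-left", "upper-right", "lower-left", "lower-right"]

def subimage_origin(index: int, sub_w: int = 32, sub_h: int = 32):
    """(x, y) pixel origin of sub-image `index` in the 2x2 split of the frame."""
    col, row = index % 2, index // 2
    return (col * sub_w, row * sub_h)
```

Carrying such an index (or the origin it implies) as position information lets the decoding end place each sub-image's reconstruction correctly within the full frame.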
  • when the image to be encoded is one of the sub-images of the entire frame image, the feedback information also includes information reflecting the position, within the entire frame image, of the image layer fed back by the decoding end, for example, a slice (when the sub-image is a slice), the sequence number of the sub-image (the sub-image size having been agreed in advance), or the width or height of the sub-image.
  • the encoding end can monitor for the feedback information within a set time period, and if the feedback information is received within that period, it determines that the feedback information has been received. That is, the encoding end can set a time length starting from sending out the code stream of a frame of image: if the feedback information arrives within this time length, it is considered received; if not, it is considered not received.
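The timeout rule can be sketched as a small monitor; the class and method names are assumptions for illustration:

```python
import time

class FeedbackMonitor:
    """Count feedback as received only if it arrives within `window_s`
    of the frame's code stream being sent."""
    def __init__(self, window_s: float):
        self.window_s = window_s
        self.sent_at = None

    def on_stream_sent(self, now=None):
        # Record when the frame's code stream went out.
        self.sent_at = time.monotonic() if now is None else now

    def accept(self, arrival: float) -> bool:
        # Feedback counts only inside the configured window.
        return self.sent_at is not None and (arrival - self.sent_at) <= self.window_s
```

When `accept` returns False, the encoder would proceed as if no feedback arrived, e.g. by falling back to a default reference.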
  • After the encoding end separately encodes the base layer and the at least one enhancement layer of the image to be encoded, it decodes each layer by the method corresponding to that layer's encoding to obtain the reconstructed image for each layer. These reconstructed images are buffered as reference frames for subsequent images.
  • If the base layer can only reference the reconstructed image corresponding to the base layer of the previous n-th frame of image, where n is a positive integer greater than or equal to 1 (it should be understood that the previous n-th frame denotes a certain frame before the image to be encoded), a limitation arises.
  • For any enhancement layer, the quality or resolution of its reconstructed image is higher than that of the reconstructed image corresponding to the base layer.
  • However, since the reconstructed image corresponding to any enhancement layer cannot be used as the reference frame of the base layer, the quality of the code stream obtained by base-layer encoding is lower, the quality or resolution of the image reconstructed from it is lower, and even the reconstructed image obtained by the decoding end is correspondingly lower in quality or resolution.
  • In this application, the encoding end uses the feedback information from the decoding end to determine the image layer of an image frame with the highest quality or resolution that the decoding end can obtain, and uses the reconstructed image corresponding to that image layer as the reference frame of the base layer. That is, the image referenced during inter-frame encoding is the reconstructed image corresponding to the image layer of the previous n-th frame with the highest quality or resolution that has been successfully decoded, successfully received, or is about to be decoded by the decoding end.
  • The feedback from the decoding end usually also reflects the network transmission status, that is, which image layer's transmission demand and bit rate the current network conditions can satisfy.
  • Using the reconstructed image corresponding to such an image layer as the reference frame for inter-frame encoding of the base layer provides a good reference basis for relevant areas of the image to be encoded (such as static areas), and can improve the quality of the code stream obtained by base-layer encoding.
  • This in turn improves the quality or resolution of the image reconstructed from that code stream, including the reconstructed image obtained when the decoding end decodes the base-layer code stream, thereby improving the quality or resolution of the current image frame as a whole.
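The reference-frame selection just described can be sketched as a lookup into a buffer of reconstructed images keyed by frame and layer number. The data layout and names are illustrative assumptions, not the patent's implementation.

```python
# Sketch: pick the base-layer reference frame from buffered reconstructions
# using the (frame number, layer number) reported in the decoder's feedback.

def select_base_layer_reference(recon_buffer, feedback):
    """recon_buffer: {(frame_no, layer_no): reconstructed_image}
    feedback: (frame_no, layer_no) of the highest-quality layer the
    decoding end reports having obtained for that frame."""
    key = (feedback[0], feedback[1])
    if key in recon_buffer:
        return recon_buffer[key]
    return None  # caller falls back (previous reference or intra coding)
```

The returned reconstruction then serves as the "first reference frame" for inter-frame encoding of the base layer.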
  • Step 203 Perform inter-frame coding on the first enhancement layer according to the second reference frame to obtain a code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer.
  • where the first enhancement layer is any one of the at least one enhancement layer, the first image layer is one of the base layer and the at least one enhancement layer, and the quality or resolution of the first image layer is lower than the quality or resolution of the first enhancement layer.
  • That is, a higher image layer can be encoded with reference to the reconstructed image corresponding to a lower image layer.
  • For example, the image to be encoded has a base layer and three enhancement layers; the layer number of the base layer is 0, and the enhancement layers, in order of quality or resolution from low to high, have layer numbers 1, 2, and 3.
  • the reference frame when encoding enhancement layer 1 is the reconstructed image corresponding to base layer 0
  • the reference frame when encoding enhancement layer 2 is the reconstructed image corresponding to enhancement layer 1 or the reconstructed image corresponding to base layer 0
  • the reference frame when encoding enhancement layer 3 is the reconstructed image corresponding to enhancement layer 2 or the reconstructed image corresponding to enhancement layer 1 or the reconstructed image corresponding to base layer 0.
  • This application does not specifically limit which lower image layer of the same frame of image an enhancement layer references for its reconstructed image.
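The layer-numbering example above implies a simple rule: within one frame, enhancement layer k may reference any layer with a smaller layer number. A one-line illustrative sketch (the function name is an assumption):

```python
# Sketch of the same-frame reference rule from the example above:
# layer 0 is the base layer; enhancement layer k may reference the
# reconstruction of any layer 0..k-1 of the same frame.

def valid_reference_layers(layer_no):
    """Layers of the same frame that layer `layer_no` may reference."""
    return list(range(layer_no))
```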
  • In the conventional approach, the enhancement layer must simultaneously reference the reconstructed image corresponding to the same image layer of the previous n-th frame and the reconstructed image corresponding to the lower image layer of the same frame of image.
  • Although this provides a reference for relevant areas of the image to be encoded (such as static areas), the joint processing of the two reference frames increases the amount of computation.
  • Moreover, because the reference frames of the enhancement layer can only be these two reconstructed images, the quality or resolution of the enhancement layer is limited.
  • In this application, the image referenced by base-layer encoding is the image layer of the previous n-th frame with the highest quality or resolution that has been successfully decoded, received, or is about to be decoded by the decoding end.
  • This improves the quality or resolution of the base layer, and further improves the quality of the code stream obtained by encoding the enhancement layers that reference the base layer; it can also improve the quality or resolution of the image reconstructed from that code stream, and even the quality or resolution of the reconstructed image obtained when the decoding end decodes the base-layer code stream.
  • Since each enhancement layer itself directly or indirectly references the base layer, the quality of the code stream obtained by enhancement-layer encoding is also improved, as is the quality or resolution of the image reconstructed from it.
  • In addition, a higher image layer of the same frame uses a lower image layer as its reference frame, which can further provide a reference for occluded areas, ultimately improving the quality or resolution of the higher layers.
  • Step 204 Send the code stream of the base layer and the code stream of at least one enhancement layer to the decoding end.
  • the code stream of the base layer carries coding reference information, and the coding reference information includes the frame sequence number and the layer sequence number of the above-mentioned first reference frame.
  • The encoding end can pack the code stream of the base layer and the code stream of the at least one enhancement layer together and send them to the decoding end, or pack them separately per image layer and send them to the decoding end; this application does not specifically limit this.
  • The encoding end sends the frame sequence number and layer sequence number of the reference frame used when encoding the base layer to the decoding end, so that when performing inter-frame decoding, the decoding end can directly obtain the reconstructed image of the corresponding image layer as the reference image.
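Carrying the coding reference information in the base-layer code stream can be sketched as a small header. The 16-bit frame field and 8-bit layer field are assumptions for illustration; the patent does not specify a bitstream syntax here.

```python
# Sketch: prepend (frame sequence number, layer sequence number) of the
# first reference frame to the base-layer code stream, and parse it back
# at the decoding end. Field widths are illustrative assumptions.
import struct

def pack_coding_reference(frame_no, layer_no, payload):
    header = struct.pack(">HB", frame_no, layer_no)   # 2-byte frame, 1-byte layer
    return header + payload

def unpack_coding_reference(stream):
    frame_no, layer_no = struct.unpack(">HB", stream[:3])
    return frame_no, layer_no, stream[3:]
```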
  • After the encoding end sends the above code stream, it starts a timer and monitors the feedback information from the decoding end within a set time period, so as to determine the base-layer reference frame when encoding subsequent image frames.
  • The encoding end uses the feedback information from the decoding end to determine the image layer of an image frame with the highest quality or resolution that the decoding end can obtain. The quality or resolution of the base layer can therefore be improved within the bit-rate requirement, and the enhancement layers of the same frame are encoded with reference to the reconstructed images of lower layers, which improves the quality or resolution of the current image frame as a whole.
  • In a possible implementation, when the encoding end does not receive the feedback information from the decoding end within the set time period, the base layer is inter-frame encoded according to a third reference frame, which is the reference frame used for the base layer of the previous frame of the image to be encoded.
  • That is, the base layer of the current image frame can be encoded with reference to the reference frame of the base layer of the previous image frame. Since the changes between adjacent image frames in a video are very small, even if the latest feedback information cannot be received due to network factors, referencing the previous image does not cause too much impact on the quality or resolution of the current image frame.
  • Alternatively, intra-frame encoding is performed on the base layer.
  • That is, when feedback is unavailable, the base layer of the current image frame can also be encoded by means of intra-frame encoding. In this way, the intra-frame encoding does not affect the quality or resolution of the base layer, thereby ensuring the quality or resolution of the current image frame.
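The mode decision described above can be sketched as follows. The message shapes and names are illustrative assumptions; the patent only defines the behaviour, not this interface.

```python
# Sketch of the base-layer fallback: use the fed-back reconstruction if
# available; otherwise reuse the previous frame's base-layer reference
# (third reference frame) or, failing that, switch to intra-frame coding.

def choose_base_layer_mode(feedback, prev_base_ref, recon_buffer):
    if feedback is None or feedback.get("failed"):
        if prev_base_ref is not None:
            return ("inter", prev_base_ref)   # third reference frame
        return ("intra", None)                # no usable reference at all
    key = (feedback["frame"], feedback["layer"])
    if key in recon_buffer:
        return ("inter", recon_buffer[key])   # first reference frame
    return ("intra", None)
```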
  • Fig. 3 is a flowchart of an embodiment of an image decoding method according to the present application.
  • the process 300 may be executed by the decoder of the target device.
  • the process 300 is described as a series of steps or operations. It should be understood that the process 300 may be executed in various orders and/or occur simultaneously, and is not limited to the execution order shown in FIG. 3.
  • the method of this embodiment may include:
  • Step 301 Receive the code stream of the base layer of the image to be decoded and the code stream of at least one enhancement layer from the encoding end.
  • The decoding end receives from the encoding end the code stream of the base layer of the image to be decoded, or the code streams of the base layer and the at least one enhancement layer; the code stream of the base layer carries coding reference information.
  • The coding reference information includes the frame sequence number and layer sequence number of the reference frame used when the encoding end encoded the base layer of the image (corresponding to the image to be decoded).
  • the image to be decoded can be an entire frame of image or one of the sub-images of the entire frame of image.
  • When the image to be decoded is a sub-image, the coding reference information further includes position information, and the position information is used to indicate the position, within the entire frame of image, of the reference frame used by the encoding end when encoding the base layer of the image (corresponding to the image to be decoded).
  • Step 302 Determine a first reference frame according to the frame sequence number and the layer sequence number, and perform inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain a reconstructed image corresponding to the base layer.
  • the decoding end can directly obtain the reference frame of the base layer based on the information carried in the code stream, and perform inter-frame decoding on the base layer based on the reference frame.
  • Step 303 Perform inter-frame decoding on the code stream of the first enhancement layer according to the second reference frame to obtain a reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer.
  • where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, the first image layer is one of the base layer and the at least one enhancement layer, and the quality or resolution of the first image layer is lower than the quality or resolution of the first enhancement layer.
  • This application uses a decoding method corresponding to the encoding method, decoding layer by layer starting from the base layer, with the reconstructed image corresponding to a lower layer serving as the reference frame of a higher image layer.
  • The reference frame of a higher image layer may be the reconstructed image corresponding to the layer one level below, the reconstructed image corresponding to the base layer, or the reconstructed images corresponding to several lower layers; this application does not specifically limit this.
  • the decoding end may decode the corresponding image layer in the manner indicated by the coding mode indication information.
  • the mode indicated by the coding mode indication information includes intra-frame decoding or inter-frame decoding.
  • That is, if the encoding end used intra-frame encoding when encoding a certain image layer, the decoding end must use intra-frame decoding when decoding that image layer; if the encoding end used inter-frame encoding based on a certain reference frame, the decoding end must likewise use inter-frame decoding based on that reference frame.
  • the decoding end may obtain the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed image corresponding to at least one enhancement layer.
  • Step 304 Send feedback information to the encoding end.
  • the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the base layer of the image to be decoded and the image layer with the highest quality or resolution among at least one enhancement layer .
  • After decoding, the decoding end can send feedback information related to the image to be decoded to the encoding end. As described in the above embodiment, this feedback information is used by the encoding end to determine the reference frame of the base layer when encoding subsequent image frames.
  • The frame sequence number in the above feedback information corresponds to the frame sequence number of the image to be decoded.
  • The layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code stream of the at least one enhancement layer of the image to be decoded; or, the layer sequence number corresponds to the image layer with the highest quality or resolution whose code stream was successfully received from among those code streams; or, the layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded from among those code streams.
  • Whether the layer sequence number corresponds to "successfully decoded", "successfully received", or "about to be decoded" depends on the prior agreement or preset configuration between the encoding end and the decoding end, or on the processing capability of the decoding end, and is not repeated here.
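Under the "successfully decoded" convention, forming the feedback at the decoding end can be sketched as taking the highest successfully decoded layer. The dict shapes are illustrative assumptions; with the "received" or "about to be decoded" conventions, only the status predicate would change.

```python
# Sketch: build the (frame, layer) feedback message from per-layer decode
# status; report a failure flag if nothing was decoded for this frame.

def build_feedback(frame_no, layer_status):
    """layer_status: {layer_no: True if successfully decoded}."""
    decoded = [n for n, ok in sorted(layer_status.items()) if ok]
    if not decoded:
        return {"frame": frame_no, "failed": True}
    return {"frame": frame_no, "layer": max(decoded)}
```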
  • In addition, when reception of the code stream of the base layer and the code stream of the at least one enhancement layer fails, the decoding end may carry identification information indicating the reception failure in the feedback information; or, when decoding of the code stream of the base layer and/or the code stream of the at least one enhancement layer fails, the decoding end may carry identification information indicating the decoding failure in the feedback information.
  • When the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that have been successfully decoded, are about to be decoded, or have been successfully received, the decoding end can buffer the reconstructed images corresponding to all image layers of the image to be decoded. Alternatively, when the feedback information includes only the frame sequence number and layer sequence number of the successfully decoded, about-to-be-decoded, or successfully received image layer with the highest quality or resolution, the decoding end can buffer only the reconstructed image corresponding to that image layer.
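The two buffering strategies just described can be sketched as follows; the data shapes are illustrative assumptions.

```python
# Sketch: either buffer the reconstructions of every fed-back layer, or
# keep only the highest-quality one, matching the two cases above.

def buffer_reconstructions(recons, fed_back_layers, keep_all):
    """recons: {layer_no: reconstructed_image}; fed_back_layers: layer
    numbers covered by the feedback for this frame."""
    if keep_all:
        return {n: recons[n] for n in fed_back_layers if n in recons}
    best = max(n for n in fed_back_layers if n in recons)
    return {best: recons[best]}
```

Keeping only the best layer saves memory at the decoding end at the cost of flexibility for later reference.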
  • Figure 4 shows an exemplary schematic diagram of the image encoding and decoding process.
  • The encoding end includes processing units such as encoding-end reference frame establishment, encoding, and code stream sending; the decoding end includes processing units such as code stream reception and feedback, decoding-end reference frame establishment, and decoding.
  • the image encoding and decoding method provided in this application mainly involves the establishment of the encoding end/decoding end reference frame, encoding and decoding, and feedback.
  • Figure 5 shows an exemplary schematic diagram of image layered coding and decoding.
  • The source image is divided into a base layer and at least one enhancement layer (for example, enhancement layer 1 and enhancement layer 2), and these image layers are encoded separately to generate multiple code streams, including the code stream of the base layer, the code stream of enhancement layer 1, and the code stream of enhancement layer 2; these code streams are transmitted to the decoding end through the network.
  • The decoding end decodes the code stream of the base layer, the code stream of enhancement layer 1, and the code stream of enhancement layer 2 layer by layer to obtain the reconstructed image corresponding to the base layer, the reconstructed image corresponding to enhancement layer 1, and the reconstructed image corresponding to enhancement layer 2.
  • the decoding end can reconstruct images of different resolutions or qualities by decoding the foregoing partial or all code streams. The more code streams decoded, the higher the resolution or quality of the reconstructed image.
  • Figure 6 shows an exemplary schematic diagram of the encoding process at the encoding end.
  • the base layer of the source image is encoded by the base layer encoder to obtain the base layer code stream
  • The reference frame for the inter-frame encoding is the optimal reference frame.
  • the determination of the optimal reference frame is related to the feedback information received by the transceiver from the decoder.
  • The base layer encoder can also produce the reconstructed image of the base layer.
  • the enhancement layer 1 of the source image is encoded by the enhancement layer 1 encoder to obtain the code stream of the enhancement layer 1.
  • The reference frame for the inter-frame encoding is the reconstructed image of the base layer, and the enhancement layer 1 encoder can also produce the reconstructed image of enhancement layer 1.
  • the enhancement layer 2 of the source image is encoded by the enhancement layer 2 encoder to obtain the enhancement layer 2 code stream.
  • The reference frame for the inter-frame encoding is the reconstructed image of enhancement layer 1, and the enhancement layer 2 encoder can also produce the reconstructed image of enhancement layer 2. And so on.
  • the code stream of the base layer, the code stream of the enhancement layer 1 and the code stream of the enhancement layer 2 are sent by the transceiver.
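The layered encoding chain of Figure 6 can be compressed into a short sketch: the base layer is inter-coded against the optimal reference frame, and each enhancement layer against the reconstruction of the layer below it. `encode_layer` is a hypothetical stub standing in for a real layer encoder, not the patent's implementation.

```python
# Sketch of the layer-by-layer encoding chain: each layer's reconstruction
# becomes the reference for the next layer up.

def encode_layer(layer_data, reference):
    """Stub: a real encoder would return (code_stream, reconstructed_image)."""
    stream = ("stream", layer_data, reference)
    recon = ("recon", layer_data)
    return stream, recon

def encode_frame(layers, optimal_reference):
    """layers: [base_layer, enhancement_1, enhancement_2, ...]."""
    streams, reference = [], optimal_reference
    for layer_data in layers:
        stream, recon = encode_layer(layer_data, reference)
        streams.append(stream)
        reference = recon        # next layer references this reconstruction
    return streams
```

The decoding end of Figure 7 mirrors this chain with decoders in place of encoders.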
  • FIG. 7 shows an exemplary schematic diagram of the decoding process of the decoding end.
  • the transceiver of the decoding end receives the code stream of the base layer, the code stream of the enhancement layer 1 and the code stream of the enhancement layer 2 from the encoding end.
  • the base layer decoder performs inter-frame decoding on the code stream of the base layer to obtain the reconstructed image of the base layer, and the determination of the reference frame is based on the information carried in the code stream of the base layer.
  • the enhancement layer 1 decoder performs inter-frame decoding on the code stream of the enhancement layer 1 to obtain the reconstructed image of the enhancement layer 1, and the reference frame is the reconstructed image of the base layer.
  • the enhancement layer 2 decoder performs inter-frame decoding on the code stream of the enhancement layer 2 to obtain the reconstructed image of the enhancement layer 2, and the reference frame is the reconstructed image of the enhancement layer 1. And so on.
  • the decoding end can store the reconstructed image of the base layer, the reconstructed image of the enhancement layer 1, and the reconstructed image of the enhancement layer 2.
  • Figure 8 shows an exemplary schematic diagram of the image coding method of the present application.
  • A frame of image is divided into three sub-images (Slice0, Slice1, and Slice2), and each sub-image is divided into a base layer (BL) and multiple enhancement layers (EL0, EL1, ...) that are encoded separately.
  • The optimal reference frame of the base layer is updated in units of slices.
  • At the encoding end, the update signal is a new feedback signal, that is, the image layer with the highest quality or resolution that has been successfully decoded, successfully received, or is about to be decoded at the decoding end.
  • At the decoding end, the update signal is the coding reference information carried in the code stream of the base layer, that is, the image layer of the image frame used by the encoding end during encoding. If none of the image layers of an image frame are received or successfully decoded by the decoding end, the optimal reference frame is not updated for that image frame.
  • After image frame 1 is encoded, the reconstructed images corresponding to all image layers of all sub-images of image frame 1 are cached, namely Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1.
  • The feedback signal includes the layer sequence number of the image layer with the highest quality or resolution that has been successfully decoded, received, or is about to be decoded by the decoding end.
  • Each of the updated optimal reference frames is used as the reference frame of the base layer of each sub-image of image frame 2 for inter-coding of the base layer of each sub-image of image frame 2.
  • After image frame 2 is encoded, the reconstructed images corresponding to all image layers of all sub-images of image frame 2 are cached, namely Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1.
  • The feedback signal includes the layer sequence number of the image layer with the highest quality or resolution that has been successfully decoded, received, or is about to be decoded by the decoding end.
  • the updated optimal reference frames are respectively used as the reference frames of the base layer of each sub-image of image frame 3 for inter-coding of the base layer of each sub-image of image frame 3.
  • the reconstructed images corresponding to all image layers of all sub-images of image frame 3 are cached, namely Slice0 BL, Slice0 EL0, Slice0 EL1,..., Slice1 BL, Slice1 EL0, Slice1 EL1,..., Slice2 BL, Slice2 EL0, Slice2 EL1.
  • The feedback signal includes the layer sequence number of the image layer with the highest quality or resolution that has been successfully decoded, received, or is about to be decoded by the decoding end.
  • Each of the updated optimal reference frames is used as the reference frame of the base layer of each corresponding sub-image of image frame 4, for inter-coding of the base layer of each sub-image of image frame 4.
  • Case 1: if a feedback signal is sent for each layer of image frame 1 (that is, a feedback signal is sent each time the code stream of an image layer is successfully received, or each time an image layer is successfully decoded, and so on), the reconstructed images corresponding to all image layers of all sub-images of image frame 1 are buffered, namely Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only one feedback signal is sent for image frame 1, only the reconstructed images corresponding to the image layer with the highest quality or resolution of image frame 1 are stored, namely Slice0 EL1, Slice1 EL0, Slice2 BL.
  • Each of the updated optimal reference frames is respectively used as the reference frame of the base layer of the sub-image corresponding to image frame 2 for inter-decoding of each base layer of image frame 2.
  • Case 1: if a feedback signal is sent for each layer of image frame 2 (that is, a feedback signal is sent each time the code stream of an image layer is successfully received, or each time an image layer is successfully decoded, and so on), the reconstructed images corresponding to all image layers of all sub-images of image frame 2 are buffered, namely Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only one feedback signal is sent for image frame 2, only the reconstructed images corresponding to the image layer with the highest quality or resolution of image frame 2 are stored, namely Slice0 EL1, Slice1 EL1. In this example, all code streams of Slice2 are lost.
  • Therefore, the encoding end does not update the optimal reference frame of Slice2 and notifies the decoding end through the code stream; accordingly, the decoding end does not update the optimal reference frame of Slice2 either.
  • the updated optimal reference frames are respectively used as the reference frames of the base layer of the sub-image corresponding to the image frame 3 for inter-decoding of the base layers of the image frame 3.
  • Case 1: if a feedback signal is sent for each layer of image frame 3 (that is, a feedback signal is sent each time the code stream of an image layer is successfully received, or each time an image layer is successfully decoded, and so on), the reconstructed images corresponding to all image layers of all sub-images of image frame 3 are cached, namely Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only one feedback signal is sent for image frame 3, only the reconstructed images corresponding to the image layer with the highest quality or resolution of image frame 3 are stored, namely Slice0 EL1, Slice2 EL1. In this example, all code streams of Slice1 are lost.
  • Therefore, the encoding end does not update the optimal reference frame of Slice1 and notifies the decoding end through the code stream; accordingly, the decoding end does not update the optimal reference frame of Slice1 either.
  • Each of the updated optimal reference frames is respectively used as the reference frame of the base layer of the sub-image corresponding to the image frame 4 for inter-frame decoding of the base layers of the image frame 4.
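The per-slice update rule illustrated in Figure 8 can be sketched as follows: the optimal reference frame is maintained per slice, and a slice whose layers were all lost keeps its previous optimal reference frame. The data shapes and names are illustrative assumptions.

```python
# Sketch: update the per-slice optimal reference frames from per-slice
# feedback; a slice with no usable feedback (e.g. all its code streams
# lost, as with Slice2 above) is simply not updated.

def update_optimal_refs(optimal_refs, frame_recons, slice_feedback):
    """optimal_refs: {slice_id: recon}; frame_recons: {(slice_id, layer): recon};
    slice_feedback: {slice_id: highest layer obtained, or None if lost}."""
    for slice_id, layer in slice_feedback.items():
        if layer is None:
            continue                     # no update for this slice
        optimal_refs[slice_id] = frame_recons[(slice_id, layer)]
    return optimal_refs
```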
  • FIG. 9 is a schematic structural diagram of an embodiment of an encoding device of this application.
  • the device of this embodiment may include: a receiving module 901, an encoding module 902, a processing module 903, and a sending module 904.
  • the device in this embodiment may be an encoding device or an encoder used at the encoding end.
  • the receiving module 901 is configured to obtain an image to be encoded, and the image to be encoded is divided into a base layer and at least one enhancement layer;
  • The encoding module 902 is configured to: determine, as the first reference frame, the reconstructed image corresponding to the frame sequence number and the layer sequence number indicated in the feedback information, and perform inter-frame encoding on the base layer according to the first reference frame to obtain the code stream of the base layer;
  • the at least one enhancement layer is respectively encoded to obtain the code stream of the at least one enhancement layer;
  • The sending module 904 is configured to send the code stream of the base layer and the code stream of the at least one enhancement layer to the decoding end,
  • the code stream of the base layer carries coding reference information, and the coding reference information includes the frame sequence number and the layer sequence number of the first reference frame.
  • the image to be encoded is an entire frame image or one of the sub-images of an entire frame image.
  • In a possible implementation, when the image to be encoded is one of the sub-images of the entire frame of image, the feedback information further includes position information, and the position information is used to indicate the position of the sub-image to be encoded in the entire frame of image.
  • In a possible implementation, the frame sequence number indicates the previous n-th frame before the image to be encoded, and n is a positive integer; the layer sequence number corresponds to the image layer of the previous n-th frame with the highest quality or resolution that the decoding end can obtain.
  • In a possible implementation, the encoding module 902 is further configured to: when the feedback information is not received, or when the feedback information includes identification information indicating reception failure or decoding failure, perform inter-frame encoding on the base layer according to the third reference frame, where the third reference frame is the reference frame of the base layer of the previous frame of the image to be encoded.
  • In a possible implementation, the encoding module 902 is further configured to: when the feedback information is not received, or when the feedback information includes identification information indicating reception failure or decoding failure, perform intra-frame encoding on the base layer.
  • In a possible implementation, the encoding module 902 is specifically configured to perform inter-frame encoding on the first enhancement layer according to the second reference frame to obtain the code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than the quality or resolution of the first enhancement layer.
  • the first image layer is an image layer one layer lower than the first enhancement layer; or, the first image layer is the base layer.
  • the processing module 903 is configured to cache the reconstructed images respectively corresponding to the base layer and the at least one enhancement layer.
  • In a possible implementation, the processing module 903 is further configured to monitor the feedback information within a set time period, and if the feedback information is received within the set time period, determine that the feedback information has been received.
  • the device in this embodiment can be used to implement the technical solutions of the method embodiments shown in FIGS. 2 and 4-8, and its implementation principles and technical effects are similar, and will not be repeated here.
  • FIG. 10 is a schematic structural diagram of an embodiment of a decoding device of this application.
  • the device of this embodiment may include: a receiving module 1001, a decoding module 1002, a processing module 1003, and a sending module 1004.
  • the device in this embodiment may be a decoding device or a decoder for the decoding end.
  • The receiving module 1001 is configured to receive the code stream of the base layer of the image to be decoded and the code stream of the at least one enhancement layer from the encoding end, where the code stream of the base layer carries coding reference information, and the coding reference information includes a first frame sequence number and a first layer sequence number.
  • The decoding module 1002 is configured to determine a first reference frame according to the first frame sequence number and the first layer sequence number, perform inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain the reconstructed image corresponding to the base layer, and decode the code stream of the at least one enhancement layer to obtain the reconstructed images respectively corresponding to the at least one enhancement layer.
  • The sending module 1004 is configured to send feedback information to the encoding end, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded.
  • the image to be decoded is an entire frame image or one of the sub-images of the entire frame image.
  • when the image to be decoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information is used to indicate the position of the image to be decoded in the entire frame image.
  • that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code stream of the at least one enhancement layer of the image to be decoded; or the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from the code stream of the base layer and the code stream of the at least one enhancement layer of the image to be decoded; or the second layer sequence number corresponds to the currently determined image layer with the highest quality or resolution that is about to be decoded from the code stream of the base layer and the code stream of the at least one enhancement layer of the image to be decoded.
  • when reception of both the code stream of the base layer and the code stream of the at least one enhancement layer fails, the feedback information includes identification information used to indicate the reception failure; or, when decoding of the code stream of the base layer and/or the code stream of the at least one enhancement layer fails, the feedback information includes identification information used to indicate the decoding failure.
  • the decoding module 1002 is further configured to obtain the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed image corresponding to the at least one enhancement layer.
  • the decoding module 1002 is specifically configured to perform inter-frame decoding on the code stream of the first enhancement layer according to the second reference frame to obtain the reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to the first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.
  • the first image layer is an image layer one layer lower than the first enhancement layer; or, the first image layer is the base layer.
  • the processing module 1003 is configured to: when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that were successfully decoded, are about to be decoded, or were successfully received, buffer the reconstructed images corresponding to all the image layers; or, when the feedback information includes the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is about to be decoded, or was successfully received, buffer the reconstructed image corresponding to that image layer.
  • the decoding module 1002 is further configured to: when the code stream of the base layer and/or the code stream of the at least one enhancement layer includes encoding mode indication information, decode the corresponding image layer in the manner indicated by the encoding mode indication information, where the indicated manner includes intra-frame decoding or inter-frame decoding.
  • the device in this embodiment can be used to implement the technical solutions of the method embodiments shown in FIGS. 3-8, and its implementation principles and technical effects are similar, and will not be repeated here.
  • the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • the processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or a flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache.
  • by way of example rather than limitation, many forms of RAM are available, for example: static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM).
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of this application essentially, or the part that contributes to the existing technology, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

This application provides an image encoding and decoding method and apparatus. The image encoding method of this application includes: obtaining an image to be encoded, where the image to be encoded is divided into a base layer and at least one enhancement layer; when feedback information sent by a decoding end is received, determining a reconstructed image corresponding to a frame sequence number and a layer sequence number indicated in the feedback information as a first reference frame, and performing inter-frame encoding on the base layer according to the first reference frame to obtain a code stream of the base layer; encoding the at least one enhancement layer separately to obtain code streams of the at least one enhancement layer; and sending the code stream of the base layer and the code streams of the at least one enhancement layer to the decoding end, where the code stream of the base layer carries encoding reference information. This application can improve the quality or resolution of the current image frame.

Description

Image encoding and decoding method and apparatus

Technical Field

This application relates to image encoding and decoding technologies, and in particular, to an image encoding and decoding method and apparatus.

Background

Wireless screen projection is a technology in which video data generated by a device with strong processing capability (for example, a game picture rendered by a graphics processing unit (GPU)) is encoded and compressed and then sent over a wireless link to a device with weaker processing capability but a good display effect (for example, a television or a virtual reality (VR) headset) for display. Applications that use wireless screen projection, such as game casting and VR glasses, are interactive and therefore require extremely low transmission latency. To avoid image quality problems caused by data loss, anti-interference capability is also an important requirement for such applications. In addition, because a larger amount of data leads to higher transmission power consumption, it is also important to improve video compression efficiency and reduce transmission power consumption.

The scalable video coding (SVC) protocol encodes an image frame in a source video into a plurality of image layers that correspond to different qualities or resolutions and have reference relationships among them. During transmission, the related data is transmitted in order from the base layer and the lower-quality/smaller-resolution image layers to the higher-quality/larger-resolution image layers. The more image-layer data of a frame the decoder receives, the better the quality of the reconstructed image. This technique makes it easier to match the transmitted bit rate to a varying bandwidth without switching code streams, which also avoids the delay caused by switching code streams.

However, in the foregoing technique, determining the encoding reference frame for each image layer requires a large amount of computation, and the quality of the reconstructed image degrades as image layers are lost.
Summary

This application provides an image encoding and decoding method and apparatus, to improve the quality or resolution of a current image frame.

According to a first aspect, this application provides an image encoding method, including: obtaining an image to be encoded, where the image to be encoded is divided into a base layer and at least one enhancement layer; when feedback information sent by a decoding end is received, determining a reconstructed image corresponding to a frame sequence number and a layer sequence number indicated in the feedback information as a first reference frame, and performing inter-frame encoding on the base layer according to the first reference frame to obtain a code stream of the base layer; encoding the at least one enhancement layer separately to obtain code streams of the at least one enhancement layer; and sending the code stream of the base layer and the code streams of the at least one enhancement layer to the decoding end, where the code stream of the base layer carries encoding reference information, and the encoding reference information includes the frame sequence number and the layer sequence number of the first reference frame.

In existing solutions (for example, the SVC protocol and the scalable high-efficiency video coding (SHVC) protocol), the base layer can reference only the reconstructed image corresponding to the base layer of the n-th preceding frame, where n is a positive integer greater than or equal to 1; it should be understood that the n-th preceding frame refers to a frame before the image to be encoded. An image layer of that frame higher than the base layer (for example, any enhancement layer) corresponds to a reconstructed image whose quality or resolution is higher than that of the reconstructed image corresponding to the base layer; yet such a reconstructed image cannot serve as the reference frame of the base layer. As a result, the quality of the code stream obtained by encoding the base layer is relatively low, the quality or resolution of the image reconstructed from it is also relatively low, and even the quality or resolution of the reconstructed image obtained by the decoding end is relatively low. In this application, based on the feedback information from the decoding end, the encoding end obtains the image layer with the highest quality or resolution that the decoding end can obtain, and uses the reconstructed image corresponding to that image layer as the reference frame of the base layer. In other words, when the base layer is encoded, the image referenced for inter-frame encoding is the reconstructed image corresponding to the highest-quality or highest-resolution image layer of the n-th preceding frame that was successfully decoded, successfully received, or is about to be decoded by the decoding end. That image layer is also the highest-level image layer that matches the network transmission state and bit-rate requirements and is fed back by the decoding end. Therefore, using the reconstructed image corresponding to such an image layer as the reference frame for inter-frame encoding of the base layer improves the quality of the base-layer code stream, improves the quality or resolution of the image reconstructed from that code stream, and can even improve the quality or resolution of the reconstructed image obtained by the decoding end from the base-layer code stream, thereby improving the quality or resolution of the current image frame as a whole.

In addition, in existing solutions (for example, the SVC protocol and the SHVC protocol), feedback is not required for every image frame or sub-image, which may cause image errors and error propagation, so intra-coded frames need to be inserted periodically for correction. In this application, the decoding end can provide feedback for every image frame or sub-image, which avoids error propagation and improves image quality. It also avoids the periodic insertion of intra-coded frames, thereby reducing the bit rate.
In a possible implementation, the image to be encoded is an entire frame image or one of the sub-images of the entire frame image.

In a possible implementation, when the image to be encoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information is used to indicate the position of the sub-image to be encoded in the entire frame image.

In a possible implementation, the frame sequence number indicates the n-th preceding frame of the image to be encoded, where n is a positive integer; the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully decoded from the code stream of the n-th preceding frame of the image to be encoded; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully received from the code stream of the n-th preceding frame; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end has determined it is about to decode from the code stream of the n-th preceding frame.

As noted above, in existing solutions (for example, the SVC protocol and the SHVC protocol), the base layer can reference only the reconstructed image corresponding to the base layer of the n-th preceding frame, even though higher image layers of that frame correspond to reconstructed images of higher quality or resolution; this limits the quality of the base-layer code stream and of the images reconstructed from it. In this application, based on the feedback information from the decoding end, the encoding end obtains the image layer with the highest quality or resolution that the decoding end can obtain and uses its reconstructed image as the reference frame of the base layer: when the base layer is encoded, the image referenced for inter-frame encoding is the reconstructed image corresponding to the highest-quality or highest-resolution image layer of the n-th preceding frame that was successfully decoded, successfully received, or is about to be decoded by the decoding end. The feedback from the decoding end usually also reflects the network transmission state, that is, which image layer's transmission requirements and bit rate the current network state can satisfy. Therefore, using the reconstructed image corresponding to such an image layer as the reference frame for inter-frame encoding of the base layer provides a good reference basis for relevant regions (for example, static regions) of the image to be encoded, improves the quality of the base-layer code stream, improves the quality or resolution of the image reconstructed from that code stream, and can even improve the quality or resolution of the reconstructed image obtained by the decoding end, thereby improving the quality or resolution of the current image frame as a whole.

In a possible implementation, after the obtaining an image to be encoded, the method further includes: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, performing inter-frame encoding on the base layer according to a third reference frame, where the third reference frame is the reference frame of the base layer of the previous frame of the image to be encoded.

In this application, because the change between adjacent image frames in a video is small, even if the latest feedback information cannot be received due to network factors, the previous frame can be referenced without greatly affecting the quality or resolution of the current image frame.

In a possible implementation, after the obtaining an image to be encoded, the method further includes: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, performing intra-frame encoding on the base layer.
In a possible implementation, the encoding the at least one enhancement layer separately to obtain code streams of the at least one enhancement layer includes: performing inter-frame encoding on a first enhancement layer according to a second reference frame to obtain a code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.

In existing solutions (for example, the SVC protocol and the SHVC protocol), an enhancement layer needs to reference both the reconstructed image corresponding to the same image layer of the n-th preceding frame and the reconstructed image corresponding to a lower image layer of the same frame: to provide a good reference basis for the relevant regions to be encoded (for example, static regions), any enhancement layer must reference the reconstructed image of the same layer of the n-th preceding frame, and to provide a good reference for occluded regions to be encoded, it must reference the reconstructed image of a lower layer of the same frame. The processing associated with two reference frames increases the amount of computation. Moreover, because the reference frames of an enhancement layer can only be those two reconstructed images, the quality or resolution of the enhancement layer is limited. In this application, for any enhancement layer that references the base layer: as described above, the image referenced when the base layer is encoded is the highest-quality or highest-resolution image layer of the n-th preceding frame that was successfully decoded, successfully received, or is about to be decoded by the decoding end, which already improves the quality or resolution of the base layer, and therefore also improves the quality of the code stream of an enhancement layer that references the base layer, the quality or resolution of the image reconstructed from that code stream, and even the quality or resolution of the reconstructed image obtained by the decoding end. If an enhancement layer references another enhancement layer, that enhancement layer itself also references the base layer directly or indirectly, so the same improvements apply. Therefore, on top of the good reference basis already provided for the relevant regions (for example, static regions) during base-layer encoding, a higher image layer of the same frame uses a lower image layer as its reference frame, which further provides a reference for occluded regions and ultimately improves the image quality or resolution at higher layers. In addition, an enhancement layer references only the reconstructed image corresponding to a lower image layer of the same frame, which reduces the amount of computation.

In a possible implementation, the first image layer is an image layer one layer lower than the first enhancement layer; or the first image layer is the base layer.

In a possible implementation, using a low-rate modulation and coding scheme (MCS) for the base layer and low-level enhancement layers enables user equipment with poor channel conditions to obtain basic video services, while using a high-rate MCS for high-level enhancement layers enables user equipment with good channel conditions to obtain higher-quality, higher-resolution video services.

In a possible implementation, during the encoding the at least one enhancement layer separately to obtain code streams of the at least one enhancement layer, the method further includes: buffering the reconstructed images respectively corresponding to the base layer and the at least one enhancement layer.

In a possible implementation, before the determining, when feedback information sent by the decoding end is received, the reconstructed image corresponding to the frame sequence number and the layer sequence number indicated in the feedback information as the first reference frame, the method further includes: monitoring the feedback information within a set time period; and if the feedback information is received within the set time period, determining that the feedback information is received.

In this application, if the encoding end does not receive the feedback information within the set time period, it considers the feedback information not received and stops monitoring. This avoids unnecessary waiting and reduces overhead on the one hand, and on the other hand prevents stale feedback information from being treated as useful information, which would lead to an incorrect reference-frame decision by the encoding end.
According to a second aspect, this application provides an image decoding method, including: receiving, from an encoding end, a code stream of a base layer and code streams of at least one enhancement layer of an image to be decoded, where the code stream of the base layer carries encoding reference information, and the encoding reference information includes a first frame sequence number and a first layer sequence number; determining a first reference frame according to the first frame sequence number and the first layer sequence number, and performing inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain a reconstructed image corresponding to the base layer; decoding the code streams of the at least one enhancement layer separately to obtain reconstructed images respectively corresponding to the at least one enhancement layer; and sending feedback information to the encoding end, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded.

In a possible implementation, the image to be decoded is an entire frame image or one of the sub-images of the entire frame image.

In a possible implementation, when the image to be decoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information is used to indicate the position of the image to be decoded in the entire frame image.

In a possible implementation, that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code streams of the at least one enhancement layer of the image to be decoded; or the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from those code streams; or the second layer sequence number corresponds to the currently determined image layer with the highest quality or resolution that is about to be decoded from those code streams.

In a possible implementation, the method further includes: when reception of both the code stream of the base layer and the code streams of the at least one enhancement layer fails, the feedback information includes identification information indicating reception failure; or, when decoding of the code stream of the base layer and/or the code streams of the at least one enhancement layer fails, the feedback information includes identification information indicating decoding failure.

In a possible implementation, after the sending feedback information to the encoding end, the method further includes: obtaining the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.

In a possible implementation, the decoding the code streams of the at least one enhancement layer separately to obtain reconstructed images respectively corresponding to the at least one enhancement layer includes: performing inter-frame decoding on the code stream of a first enhancement layer according to a second reference frame to obtain the reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.

In a possible implementation, the first image layer is an image layer one layer lower than the first enhancement layer; or the first image layer is the base layer.

In a possible implementation, when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that were successfully decoded, are about to be decoded, or were successfully received, the reconstructed images corresponding to all the image layers are buffered; or, when the feedback information includes the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is about to be decoded, or was successfully received, the reconstructed image corresponding to that image layer is buffered.

In a possible implementation, after the receiving, from the encoding end, the code stream of the base layer and the code streams of the at least one enhancement layer of the image to be decoded, the method further includes: when the code stream of the base layer and/or the code streams of the at least one enhancement layer include encoding mode indication information, decoding the corresponding image layer in the manner indicated by the encoding mode indication information, where the indicated manner includes intra-frame decoding or inter-frame decoding.
According to a third aspect, this application provides an encoding apparatus, including: a receiving module, configured to obtain an image to be encoded, where the image to be encoded is divided into a base layer and at least one enhancement layer; an encoding module, configured to: when feedback information sent by a decoding end is received, determine a reconstructed image corresponding to a frame sequence number and a layer sequence number indicated in the feedback information as a first reference frame, perform inter-frame encoding on the base layer according to the first reference frame to obtain a code stream of the base layer, and encode the at least one enhancement layer separately to obtain code streams of the at least one enhancement layer; and a sending module, configured to send the code stream of the base layer and the code streams of the at least one enhancement layer to the decoding end, where the code stream of the base layer carries encoding reference information, and the encoding reference information includes the frame sequence number and the layer sequence number of the first reference frame.

In a possible implementation, the image to be encoded is an entire frame image or one of the sub-images of the entire frame image.

In a possible implementation, when the image to be encoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information is used to indicate the position of the sub-image to be encoded in the entire frame image.

In a possible implementation, the frame sequence number indicates the n-th preceding frame of the image to be encoded, where n is a positive integer; the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully decoded from the code stream of the n-th preceding frame of the image to be encoded; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully received from the code stream of the n-th preceding frame; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end has determined it is about to decode from the code stream of the n-th preceding frame.

In a possible implementation, the processing module is further configured to: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, perform inter-frame encoding on the base layer according to a third reference frame, where the third reference frame is the reference frame of the base layer of the previous frame of the image to be encoded.

In a possible implementation, the processing module is further configured to: when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, perform intra-frame encoding on the base layer.

In a possible implementation, the encoding module is specifically configured to perform inter-frame encoding on a first enhancement layer according to a second reference frame to obtain a code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.

In a possible implementation, the first image layer is an image layer one layer lower than the first enhancement layer; or the first image layer is the base layer.

In a possible implementation, the apparatus further includes: a processing module, configured to buffer the reconstructed images respectively corresponding to the base layer and the at least one enhancement layer.

In a possible implementation, the processing module is further configured to monitor the feedback information within a set time period, and if the feedback information is received within the set time period, determine that the feedback information is received.
According to a fourth aspect, this application provides a decoding apparatus, including: a receiving module, configured to receive, from an encoding end, a code stream of a base layer and code streams of at least one enhancement layer of an image to be decoded, where the code stream of the base layer carries encoding reference information, and the encoding reference information includes a first frame sequence number and a first layer sequence number; a decoding module, configured to determine a first reference frame according to the first frame sequence number and the first layer sequence number, perform inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain a reconstructed image corresponding to the base layer, and decode the code streams of the at least one enhancement layer separately to obtain reconstructed images respectively corresponding to the at least one enhancement layer; and a sending module, configured to send feedback information to the encoding end, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded.

In a possible implementation, the image to be decoded is an entire frame image or one of the sub-images of the entire frame image.

In a possible implementation, when the image to be decoded is one of the sub-images of the entire frame image, the feedback information further includes position information, and the position information is used to indicate the position of the image to be decoded in the entire frame image.

In a possible implementation, that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code streams of the at least one enhancement layer of the image to be decoded; or the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from those code streams; or the second layer sequence number corresponds to the currently determined image layer with the highest quality or resolution that is about to be decoded from those code streams.

In a possible implementation, when reception of both the code stream of the base layer and the code streams of the at least one enhancement layer fails, the feedback information includes identification information indicating reception failure; or, when decoding of the code stream of the base layer and/or the code streams of the at least one enhancement layer fails, the feedback information includes identification information indicating decoding failure.

In a possible implementation, the decoding module is further configured to obtain the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.

In a possible implementation, the decoding module is specifically configured to perform inter-frame decoding on the code stream of any one image layer according to a second reference frame to obtain a reconstructed image corresponding to a first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer.

In a possible implementation, the first image layer is an image layer one layer lower than the first enhancement layer; or the first image layer is the base layer.

In a possible implementation, the apparatus further includes: a processing module, configured to: when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that were successfully decoded, are about to be decoded, or were successfully received, buffer the reconstructed images corresponding to all the image layers; or, when the feedback information includes the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is about to be decoded, or was successfully received, buffer the reconstructed image corresponding to that image layer.

In a possible implementation, the decoding module is further configured to: when the code stream of the base layer and/or the code streams of the at least one enhancement layer include encoding mode indication information, decode the corresponding image layer in the manner indicated by the encoding mode indication information, where the indicated manner includes intra-frame decoding or inter-frame decoding.
According to a fifth aspect, this application provides an encoder, including a processor and a transmission interface, where the processor is configured to invoke program instructions stored in a memory to implement the method according to any one of the first aspect.

According to a sixth aspect, this application provides a decoder, including a processor and a transmission interface, where the processor is configured to invoke program instructions stored in a memory to implement the method according to any one of the second aspect.

According to a seventh aspect, this application provides a computer-readable storage medium, including a computer program, where when the computer program is executed on a computer or a processor, the computer or the processor is caused to perform the method according to any one of the first to second aspects.

According to an eighth aspect, this application further provides a computer program product, where the computer program product includes computer program code, and when the computer program code is run on a computer or a processor, the computer or the processor is caused to perform the method according to any one of the first to second aspects.
Brief Description of Drawings

FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 for implementing embodiments of this application;

FIG. 1B is a block diagram of an example of a video coding system 40 for implementing embodiments of this application;

FIG. 2 is a flowchart of an embodiment of an image encoding method of this application;

FIG. 3 is a flowchart of an embodiment of an image decoding method of this application;

FIG. 4 is an exemplary schematic diagram of an image encoding and decoding process;

FIG. 5 is an exemplary schematic diagram of layered image encoding and decoding;

FIG. 6 is an exemplary schematic diagram of an encoding procedure at an encoding end;

FIG. 7 is an exemplary schematic diagram of a decoding procedure at a decoding end;

FIG. 8 is an exemplary schematic diagram of an image encoding method of this application;

FIG. 9 is a schematic structural diagram of an embodiment of an encoding apparatus of this application;

FIG. 10 is a schematic structural diagram of an embodiment of a decoding apparatus of this application.
Description of Embodiments

To make the objectives, technical solutions, and advantages of this application clearer, the following clearly and completely describes the technical solutions in this application with reference to the accompanying drawings. The described embodiments are some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative efforts shall fall within the protection scope of this application.

The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are merely used to distinguish descriptions and shall not be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion, for example, inclusion of a series of steps or units. A method, system, product, or device is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such a process, method, product, or device.

It should be understood that, in this application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate the three cases of only A existing, only B existing, and both A and B existing, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression means any combination of these items, including a single item or any combination of a plurality of items. For example, at least one of a, b, or c may indicate a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may be singular or plural.

The technical solutions in the embodiments of this application may be applied not only to existing video coding standards (such as the H.264/advanced video coding (AVC) and H.265/high efficiency video coding (HEVC) standards), but also to future video coding standards (such as the H.266/versatile video coding (VVC) standard). The terms used in the implementation part of this application are merely intended to explain specific embodiments of this application and are not intended to limit this application. The following first briefly introduces some concepts that may be involved in the embodiments of this application.

In the field of video coding, the terms "picture", "frame", and "image" may be used as synonyms. Video encoding is performed on the source side and usually includes processing (for example, compressing) an original video picture to reduce the amount of data required to represent the video picture, for more efficient storage and/or transmission. Video decoding is performed on the destination side and usually includes inverse processing relative to the encoder to reconstruct the video picture. "Coding" of a video picture in the embodiments should be understood as "encoding" or "decoding" of a video sequence. The combination of the encoding part and the decoding part is also referred to as codec.
The following describes the system architecture to which the embodiments of this application are applied. Referring to FIG. 1A, FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 for implementing embodiments of this application. As shown in FIG. 1A, the video encoding and decoding system 10 may include a source device 12 and a destination device 14. The source device 12 generates encoded video data and may therefore be referred to as a video encoding apparatus. The destination device 14 may decode the encoded video data generated by the source device 12 and may therefore be referred to as a video decoding apparatus. Various implementations of the source device 12 or the destination device 14 may include one or more processors and a memory coupled to the one or more processors. The memory may include, but is not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein. The source device 12 and the destination device 14 may include various apparatuses, including desktop computers, mobile computing apparatuses, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display apparatuses, digital media players, video game consoles, in-vehicle computers, wireless communication devices, or the like.

Although FIG. 1A depicts the source device 12 and the destination device 14 as separate devices, a device embodiment may alternatively include both the source device 12 and the destination device 14 or the functionality of both, that is, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality. In such embodiments, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, using separate hardware and/or software, or any combination thereof.

The source device 12 and the destination device 14 may be communicatively connected via a link 13, and the destination device 14 may receive the encoded video data from the source device 12 via the link 13. The link 13 may include one or more media or apparatuses capable of moving the encoded video data from the source device 12 to the destination device 14. In one example, the link 13 may include one or more communication media that enable the source device 12 to transmit the encoded video data directly to the destination device 14 in real time. In this example, the source device 12 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and may transmit the modulated video data to the destination device 14. The one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (for example, the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 12 to the destination device 14.

The source device 12 includes an encoder 20 and may optionally further include a picture source 16, a picture preprocessor 18, and a communication interface 22. In a specific implementation, the encoder 20, the picture source 16, the picture preprocessor 18, and the communication interface 22 may be hardware components in the source device 12 or software programs in the source device 12. They are described as follows:

The picture source 16 may include or be any type of picture capture device, for example, for capturing a real-world picture, and/or any type of device for generating a picture or comment (for screen content coding, some text on a screen is also considered part of the picture or image to be encoded), for example, a computer graphics processor for generating a computer-animated picture, or any type of device for obtaining and/or providing a real-world picture or a computer-animated picture (for example, screen content or a virtual reality (VR) picture), and/or any combination thereof (for example, an augmented reality (AR) picture). The picture source 16 may be a camera for capturing a picture or a memory for storing a picture, and may further include any type of (internal or external) interface for storing previously captured or generated pictures and/or for obtaining or receiving pictures. When the picture source 16 is a camera, it may be, for example, a local camera or a camera integrated in the source device; when the picture source 16 is a memory, it may be, for example, a local memory or a memory integrated in the source device. When the picture source 16 includes an interface, the interface may be, for example, an external interface for receiving pictures from an external video source, where the external video source is, for example, an external picture capture device such as a camera, an external memory, or an external picture generation device such as an external computer graphics processor, computer, or server. The interface may be any type of interface according to any proprietary or standardized interface protocol, for example, a wired or wireless interface or an optical interface.

The picture preprocessor 18 is configured to receive raw picture data 17 and perform preprocessing on the raw picture data 17 to obtain a preprocessed picture 19 or preprocessed picture data 19. For example, the preprocessing performed by the picture preprocessor 18 may include trimming, color format conversion, color correction, or denoising. It should be noted that performing preprocessing on the picture data 17 is not a mandatory process of this application, and this application does not specifically limit this.

The encoder 20 (or video encoder 20) is configured to receive the preprocessed picture data 19 and process it using a relevant prediction mode (such as the prediction modes in the embodiments herein) to provide encoded picture data 21. In some embodiments, the encoder 20 may be configured to perform the embodiments described below, to implement the application of the image encoding method described in this application on the encoding side.

The communication interface 22 may be configured to receive the encoded picture data 21 and transmit it via the link 13 to the destination device 14 or any other device (for example, a memory) for storage or direct reconstruction, where the other device may be any device used for decoding or storage. The communication interface 22 may, for example, be configured to encapsulate the encoded picture data 21 into a suitable format, such as data packets, for transmission over the link 13.

The destination device 14 includes a decoder 30 and may optionally further include a communication interface 28, a picture post-processor 32, and a display device 34. They are described as follows:

The communication interface 28 may be configured to receive the encoded picture data 21 from the source device 12 or any other source, for example, a storage device such as an encoded picture data storage device. The communication interface 28 may be configured to transmit or receive the encoded picture data 21 via the link 13 between the source device 12 and the destination device 14 or via any type of network, where the link 13 is, for example, a direct wired or wireless connection, and the network is, for example, a wired or wireless network or any combination thereof, or any type of private or public network or any combination thereof. The communication interface 28 may, for example, be configured to decapsulate the data packets transmitted by the communication interface 22 to obtain the encoded picture data 21.

Both the communication interface 28 and the communication interface 22 may be configured as unidirectional or bidirectional communication interfaces and may be configured, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or data transmission such as the transmission of encoded picture data.

The decoder 30 is configured to receive the encoded picture data 21 and provide decoded picture data 31 or a decoded picture 31. In some embodiments, the decoder 30 may be configured to perform the embodiments described below, to implement the application of the image decoding method described in this application on the decoding side.

The picture post-processor 32 is configured to perform post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33. The post-processing performed by the picture post-processor 32 may include color format conversion, color correction, trimming, or resampling, or any other processing, and the picture post-processor 32 may also be configured to transmit the post-processed picture data 33 to the display device 34. It should be noted that performing post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) is not a mandatory process of this application, and this application does not specifically limit this.

The display device 34 is configured to receive the post-processed picture data 33 to display the picture, for example, to a user or viewer. The display device 34 may be or may include any type of display for presenting the reconstructed picture, for example, an integrated or external display or monitor. For example, the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, liquid crystal on silicon (LCoS), a digital light processor (DLP), or any other type of display.
It is clear to a person skilled in the art based on the description that the existence and (exact) division of the functionality of different units, or of the functionality of the source device 12 and/or the destination device 14 shown in FIG. 1A, may vary depending on the actual device and application. The source device 12 and the destination device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example, a notebook or laptop computer, a mobile phone, a smartphone, a tablet or tablet computer, a video camera, a desktop computer, a set-top box, a television, a camera, an in-vehicle device, a display device, a digital media player, a video game console, a video streaming device (such as a content service server or a content distribution server), a broadcast receiver device, or a broadcast transmitter device, and may use no operating system or any type of operating system.

Both the encoder 20 and the decoder 30 may be implemented as any of a variety of suitable circuits, for example, one or more microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), discrete logic, hardware, or any combination thereof. If the techniques are implemented partially in software, a device may store the software instructions in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, and the like) may be considered one or more processors.

In some cases, the video encoding and decoding system 10 shown in FIG. 1A is merely an example, and the techniques of this application may be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices. In other examples, data may be retrieved from a local memory, streamed over a network, and so on. A video encoding device may encode data and store the data in a memory, and/or a video decoding device may retrieve data from a memory and decode the data. In some examples, encoding and decoding are performed by devices that do not communicate with each other but simply encode data to a memory and/or retrieve data from a memory and decode the data.

Referring to FIG. 1B, FIG. 1B is a block diagram of an example of a video coding system 40 for implementing embodiments of this application. The video coding system 40 may implement a combination of various techniques of the embodiments of this application. In the illustrated implementation, the video coding system 40 may include an imaging device 41, an encoder 20, a decoder 30 (and/or a video encoder/decoder implemented by a logic circuit 47 of a processing unit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.

As shown in FIG. 1B, the imaging device 41, the antenna 42, the processing unit 46, the logic circuit 47, the encoder 20, the decoder 30, the processor 43, the memory 44, and/or the display device 45 can communicate with each other. As discussed, although the video coding system 40 is depicted with both the encoder 20 and the decoder 30, in different examples the video coding system 40 may include only the encoder 20 or only the decoder 30.

In some examples, the antenna 42 may be used to transmit or receive an encoded bitstream of video data. In addition, in some examples, the display device 45 may be used to present the video data. In some examples, the processing unit 46 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like. The video coding system 40 may also include an optional processor 43, which may similarly include ASIC logic, a graphics processor, a general-purpose processor, and the like. In some examples, the processing unit 46 may be implemented by hardware, such as dedicated video coding hardware, and the processor 43 may be implemented by general-purpose software, an operating system, and the like. In addition, the memory 44 may be any type of memory, for example, volatile memory (such as static random access memory (SRAM) or dynamic random access memory (DRAM)) or non-volatile memory (such as flash memory). In a non-limiting example, the memory 44 may be implemented by cache memory. In some examples, the logic circuit 47 may access the memory 44 (for example, to implement a picture buffer). In other examples, the logic circuit 47 and/or the processing unit 46 may include a memory (for example, a cache) to implement a picture buffer or the like.

In some examples, the encoder 20 implemented by a logic circuit may include a picture buffer (implemented, for example, by the processing unit 46 or the memory 44) and a graphics processing unit (implemented, for example, by the processing unit 46). The graphics processing unit may be communicatively coupled to the picture buffer. The graphics processing unit may include the encoder 20 implemented by the logic circuit 47 to implement the various modules discussed for any other encoder system or subsystem described herein. The logic circuit may be used to perform the various operations discussed herein.

In some examples, the decoder 30 may be implemented in a similar manner by the logic circuit 47 to implement the various modules discussed for any other decoder system or subsystem described herein. In some examples, the decoder 30 implemented by a logic circuit may include a picture buffer (implemented by the processing unit 2820 or the memory 44) and a graphics processing unit (implemented, for example, by the processing unit 46). The graphics processing unit may be communicatively coupled to the picture buffer. The graphics processing unit may include the decoder 30 implemented by the logic circuit 47 to implement the various modules discussed for any other decoder system or subsystem described herein.

In some examples, the antenna 42 may be used to receive an encoded bitstream of video data. As discussed, the encoded bitstream may include data, indicators, index values, mode selection data, and the like related to encoded video frames as discussed herein, for example, data related to coding partitions (for example, transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions). The video coding system 40 may further include the decoder 30 coupled to the antenna 42 and used to decode the encoded bitstream. The display device 45 is used to present video frames.

It should be understood that, for the examples described with reference to the encoder 20 in the embodiments of this application, the decoder 30 may be used to perform the reverse process. With regard to signaling syntax elements, the decoder 30 may be used to receive and parse such syntax elements and decode the related video data accordingly. In some examples, the encoder 20 may entropy-encode the syntax elements into an encoded video bitstream. In such examples, the decoder 30 may parse the syntax elements and decode the related video data accordingly.

It should be noted that the encoder 20 and the decoder 30 in the embodiments of this application may be encoders/decoders corresponding to video standard protocols such as H.263, H.264, HEVC, moving picture experts group (MPEG)-2, MPEG-4, VP8, and VP9, or to next-generation video standard protocols (such as H.266).
The following describes the solutions of the embodiments of this application in detail:

FIG. 2 is a flowchart of an embodiment of an image encoding method of this application. The process 200 may be performed by the encoder of the source device. The process 200 is described as a series of steps or operations; it should be understood that the process 200 may be performed in various orders and/or concurrently and is not limited to the execution order shown in FIG. 2. As shown in FIG. 2, the method of this embodiment may include:

Step 201: Obtain an image to be encoded.

The image to be encoded is an entire frame image or one of the sub-images of the entire frame image; for details, refer to the foregoing description of image frames, which is not repeated here. In this application, the image to be encoded is divided into a base layer and at least one enhancement layer, and the at least one enhancement layer is arranged in ascending order of quality or resolution.

For image layering, refer to the scalable video coding (SVC) protocol. The SVC protocol divides an image frame in a video into one base layer and multiple enhancement layers according to requirements. The base layer provides the user with the most basic image quality, frame rate, and resolution, while the enhancement layers refine the image quality and provide more information such as image resolution, gray scale, and pixel values. The more image layers there are, the higher the obtained image quality. When an SVC-encoded code stream is propagated in a communication network, different modulation and coding schemes (MCS) may be used for different image layers. For example, using a low-rate MCS for the base layer and low-level enhancement layers enables user equipment with poor channel conditions to obtain basic video services, while using a high-rate MCS for high-level enhancement layers enables user equipment with good channel conditions to obtain higher-quality, higher-resolution video services.

Step 202: When feedback information sent by the decoding end is received, determine the reconstructed image corresponding to the frame sequence number and the layer sequence number indicated in the feedback information as a first reference frame, and perform inter-frame encoding on the base layer according to the first reference frame to obtain the code stream of the base layer.

The feedback information is fed back to the encoding end by the decoding end during reception of the code stream from the encoding end, based on the reception status of the code stream or the decoding status of the code stream. Due to factors such as network transmission delay and the processing capability of the decoder, while the encoding end is processing the current image (that is, the image to be encoded, with frame sequence number assumed to be m), the decoding end may be processing the n-th preceding frame of the current image (with frame sequence number m-n): n = 1 means the decoding end may be processing the previous frame of the current image (frame sequence number m-1), n = 2 means the decoding end may be processing the frame two frames before the current image (frame sequence number m-2), and so on. To allow the encoding end to learn the latest processing status of the decoding end, the feedback information sent by the decoding end to the encoding end may carry information about the n-th preceding frame, including its frame sequence number (m-n) and a layer sequence number.

In a possible implementation, the encoding end and the decoding end determine, by prior agreement or advance configuration, that the layer sequence number carried in the feedback information is based on successful decoding; in this case, the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully decoded from the code stream of the n-th preceding frame (frame sequence number m-n).

In a possible implementation, the encoding end and the decoding end determine, by prior agreement or advance configuration, that the layer sequence number carried in the feedback information is based on successful reception; in this case, the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end successfully received from the code stream of the n-th preceding frame (frame sequence number m-n).

In a possible implementation, the decoding end may judge, according to the size of the received code stream, the amount of decoding it can complete within a predetermined time and determine, according to this judgment, the image layers it is about to decode; in this case, the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoding end has determined it is about to decode from the code stream of the n-th preceding frame (frame sequence number m-n). That is, an image layer about to be decoded is an image layer whose code stream the decoding end can successfully decode within the predetermined time but has not yet decoded; in other words, after receiving the code stream, the decoding end makes a judgment based on the code stream size and its own decoding capability, and when it judges that it can successfully decode the code stream within the predetermined time, it can send the feedback information to the encoding end without waiting until decoding succeeds.

In a possible implementation, when the image to be encoded is one of the sub-images of the entire frame image, the feedback information further includes position information used to indicate the position of the image to be encoded in the entire frame image. For example, an entire frame image of 64×64 pixels is divided into four non-overlapping 32×32 sub-images located at the upper left, upper right, lower left, and lower right of the entire frame image, and the position information indicates which of the four sub-images the image to be encoded is.

In a possible implementation, when the image to be encoded is one of the sub-images of the entire frame image, the feedback information further includes information reflecting the position, in the entire frame image, of the image layer fed back by the decoding end, for example, the start position of a slice (when the sub-image is a slice), the sequence number of the sub-image (with the sub-image size agreed in advance), or the width or height of the sub-image.

In this application, the encoding end may monitor the feedback information within a set time period, and if the feedback information is received within the set time period, it determines that the feedback information is received. That is, the encoding end may set a time period and start timing after sending the code stream of a frame; if the feedback information is received within this period, the feedback information is considered received, and if it is not received within this period, the feedback information is considered not received.
After encoding the base layer and the at least one enhancement layer of the image to be encoded, the encoding end decodes them according to the method corresponding to the encoding of each layer to obtain the reconstructed image corresponding to each layer. These reconstructed images are buffered to serve as reference frames for subsequent images.

As noted above, in existing solutions (for example, the SVC protocol and the scalable high-efficiency video coding (SHVC) protocol), the base layer can reference only the reconstructed image corresponding to the base layer of the n-th preceding frame, even though higher image layers of that frame correspond to reconstructed images of higher quality or resolution; this limits the quality of the base-layer code stream and of the images reconstructed from it. In this application, based on the feedback information from the decoding end, the encoding end obtains the image layer with the highest quality or resolution that the decoding end can obtain and uses its reconstructed image as the reference frame of the base layer: when the base layer is encoded, the image referenced for inter-frame encoding is the reconstructed image corresponding to the highest-quality or highest-resolution image layer of the n-th preceding frame that was successfully decoded, successfully received, or is about to be decoded by the decoding end. The feedback from the decoding end usually also reflects the network transmission state, that is, which image layer's transmission requirements and bit rate the current network state can satisfy. Therefore, using the reconstructed image corresponding to such an image layer as the reference frame for inter-frame encoding of the base layer provides a good reference basis for relevant regions (for example, static regions) of the image to be encoded, improves the quality of the base-layer code stream, improves the quality or resolution of the image reconstructed from that code stream, and can even improve the quality or resolution of the reconstructed image obtained by the decoding end, thereby improving the quality or resolution of the current image frame as a whole.

Step 203: Perform inter-frame encoding on a first enhancement layer according to a second reference frame to obtain the code stream of the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer.

The first enhancement layer is any one of the at least one enhancement layer, the first image layer is one of the base layer and the at least one enhancement layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer. Within one frame of the image to be encoded, a higher image layer may, when being encoded, reference the reconstructed image corresponding to a lower image layer. For example, suppose the image to be encoded has one base layer and three enhancement layers, the layer sequence number of the base layer is 0, and the enhancement layers, in ascending order of quality or resolution, have layer sequence numbers 1, 2, and 3. The reference frame for encoding enhancement layer 1 is the reconstructed image corresponding to base layer 0; the reference frame for encoding enhancement layer 2 is the reconstructed image corresponding to enhancement layer 1 or base layer 0; and the reference frame for encoding enhancement layer 3 is the reconstructed image corresponding to enhancement layer 2, enhancement layer 1, or base layer 0. As long as the condition that a higher image layer references the reconstructed image corresponding to a lower image layer is satisfied, this application does not specifically limit which lower image layer's reconstructed image of the same frame an enhancement layer references.
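The layering rule in the example above (any enhancement layer of a frame may reference the reconstructed image of any lower layer of the same frame, with the base layer as layer 0) can be sketched as follows; the function names are illustrative, not taken from the patent.

```python
def valid_same_frame_references(layer_id):
    """Return the layer sequence numbers of the same frame that layer
    `layer_id` may use as its reference (base layer is layer 0)."""
    if layer_id == 0:
        # The base layer references a layer of a previous frame instead,
        # chosen from the decoder's feedback.
        return []
    return list(range(layer_id))

def default_reference(layer_id):
    """One common choice from the example: reference the image layer
    immediately below (e.g. enhancement layer 2 references layer 1)."""
    refs = valid_same_frame_references(layer_id)
    return refs[-1] if refs else None
```

So for enhancement layer 3 the valid same-frame references are layers 0, 1, and 2, and the default choice is layer 2.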
In existing solutions (for example, the SVC protocol and the SHVC protocol), an enhancement layer needs to reference both the reconstructed image corresponding to the same image layer of the n-th preceding frame and the reconstructed image corresponding to a lower image layer of the same frame: to provide a good reference basis for the relevant regions to be encoded (for example, static regions), any enhancement layer must reference the reconstructed image of the same layer of the n-th preceding frame, and to provide a good reference for occluded regions to be encoded, it must reference the reconstructed image of a lower layer of the same frame. The processing associated with two reference frames increases the amount of computation, and because the reference frames of an enhancement layer can only be those two reconstructed images, the quality or resolution of the enhancement layer is limited. In this application, for any enhancement layer that references the base layer: as described above, the image referenced when the base layer is encoded is the highest-quality or highest-resolution image layer of the n-th preceding frame that was successfully decoded, successfully received, or is about to be decoded by the decoding end, which already improves the quality or resolution of the base layer, and therefore also improves the quality of the code stream of an enhancement layer that references the base layer, the quality or resolution of the image reconstructed from that code stream, and even the quality or resolution of the reconstructed image obtained by the decoding end. If an enhancement layer references another enhancement layer, that enhancement layer itself also references the base layer directly or indirectly, so the same improvements apply. Therefore, on top of the good reference basis already provided for the relevant regions (for example, static regions) during base-layer encoding, a higher image layer of the same frame uses a lower image layer as its reference frame, which further provides a reference for occluded regions and ultimately improves the image quality or resolution at higher layers.

Step 204: Send the code stream of the base layer and the code streams of the at least one enhancement layer to the decoding end.

The code stream of the base layer carries encoding reference information, which includes the frame sequence number and layer sequence number of the foregoing first reference frame. The encoding end may package the code stream of the base layer and the code streams of the at least one enhancement layer together and send them to the decoding end, or may package them separately by image layer and send them in sequence; this application does not specifically limit this. The encoding end sends the decoding end the frame sequence number and layer sequence number of the reference frame used when encoding the base layer, so that during inter-frame decoding the decoding end can directly obtain the reconstructed image of the corresponding image layer as the reference image.

After sending the foregoing code streams, the encoding end starts a timer and monitors the feedback information from the decoding end within the set time period, so that the reference frame of the base layer can be determined when subsequent image frames are encoded.

In existing solutions (for example, the SVC protocol and the SHVC protocol), feedback is not required for every image frame or sub-image, which may cause image errors and error propagation, so intra-coded frames need to be inserted periodically for correction. In contrast, this application can provide feedback for every image frame or sub-image, which avoids error propagation and improves image quality; it also avoids the periodic insertion of intra-coded frames, thereby reducing the bit rate.

It can be seen that, in the image encoding method provided in this application, the encoding end obtains, based on the feedback information from the decoding end, the image layer of the image frame with the highest quality or resolution that the decoding end can obtain; this image layer best matches the network transmission state and bit-rate requirements, so the quality or resolution of the base layer can be improved, and because the enhancement layers of the same frame are encoded with reference to the reconstructed images of lower layers, the quality or resolution of the current image frame can be improved as a whole.

In a possible implementation, when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, inter-frame encoding is performed on the base layer according to a third reference frame, which is the reference frame of the base layer of the previous frame of the image to be encoded. Before step 202 above, if the encoding end does not receive feedback information from the decoding end within the set time period while monitoring, it may encode the base layer of the current image frame with reference to the reference frame of the base layer of the previous frame. Because the change between adjacent image frames in a video is small, even if the latest feedback information cannot be received due to network factors, the previous frame can be referenced without greatly affecting the quality or resolution of the current image frame.

In a possible implementation, when the feedback information is not received, or the feedback information includes identification information indicating reception failure or decoding failure, intra-frame encoding is performed on the base layer. Similarly, before step 202 above, if the encoding end does not receive feedback information from the decoding end within the set time period while monitoring, it may also encode the base layer of the current image frame in intra-frame mode. Intra-frame encoding does not affect the quality or resolution of the base layer, thereby ensuring the quality or resolution of the current image frame.
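The base-layer decision logic of step 202 together with the two fallbacks above can be sketched as a single function. The data shapes (a feedback dict, a reconstructed-image cache keyed by (frame, layer), and string placeholders for images) are illustrative assumptions, not the patent's data structures.

```python
def choose_base_layer_mode(feedback, ref_cache, prev_base_ref):
    """Decide how to encode the base layer of the current frame.

    feedback      : dict {"frame": m - n, "layer": k} from the decoder,
                    a failure marker {"failed": True}, or None if nothing
                    arrived within the set time period.
    ref_cache     : dict mapping (frame_no, layer_no) -> reconstructed image.
    prev_base_ref : reference frame used by the previous frame's base
                    layer, or None.

    Returns ("inter", reference_frame) or ("intra", None).
    """
    if feedback and not feedback.get("failed"):
        key = (feedback["frame"], feedback["layer"])
        if key in ref_cache:
            return "inter", ref_cache[key]   # first reference frame
    if prev_base_ref is not None:
        return "inter", prev_base_ref        # third reference frame fallback
    return "intra", None                     # last resort: intra-frame coding
```

The first branch is the normal feedback-driven path; the second and third branches correspond to the two fallback implementations described above.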
FIG. 3 is a flowchart of an embodiment of an image decoding method of this application. The process 300 may be performed by the decoder of the destination device. The process 300 is described as a series of steps or operations; it should be understood that the process 300 may be performed in various orders and/or concurrently and is not limited to the execution order shown in FIG. 3. As shown in FIG. 3, the method of this embodiment may include:

Step 301: Receive, from the encoding end, the code stream of the base layer and the code streams of at least one enhancement layer of the image to be decoded.

Corresponding to step 204 of the foregoing method embodiment, the decoding end receives, from the encoding end, the code stream of the base layer of the image to be decoded, or the code streams of the base layer and at least one enhancement layer, and the code stream of the base layer carries encoding reference information, which includes the frame sequence number and layer sequence number of the reference frame used by the encoding end when encoding the base layer of the image (corresponding to the image to be decoded). The image to be decoded may be an entire frame image or one of the sub-images of the entire frame image. Optionally, when the image to be decoded is one of the sub-images of the entire frame image, the encoding reference information further includes position information used to indicate the position, in the entire frame image, of the reference frame used by the encoding end when encoding the base layer of the image (corresponding to the image to be decoded).

Step 302: Determine a first reference frame according to the frame sequence number and the layer sequence number, and perform inter-frame decoding on the code stream of the base layer according to the first reference frame to obtain the reconstructed image corresponding to the base layer.

Based on the information carried in the code stream, the decoding end can directly obtain the reference frame of the base layer and perform inter-frame decoding on the base layer based on this reference frame.

Step 303: Perform inter-frame decoding on the code stream of a first enhancement layer according to a second reference frame to obtain the reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer.

The first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, the first image layer is one of the base layer and the at least one enhancement layer, and the quality or resolution of the first image layer is lower than that of the first enhancement layer. This application uses a decoder corresponding to the encoder and decodes layer by layer starting from the base layer, with the reconstructed image corresponding to a lower layer serving as the reference frame of a higher image layer. It should be noted that the reference frame of a higher image layer may be the reconstructed image corresponding to the layer one level below, the reconstructed image corresponding to the base layer, or the reconstructed image corresponding to a layer several levels below; this application does not specifically limit this.

In a possible implementation, when the code stream of the base layer and/or the code streams of the at least one enhancement layer include encoding mode indication information, the decoding end may decode the corresponding image layer in the manner indicated by the encoding mode indication information, which includes intra-frame decoding or inter-frame decoding. Corresponding to the encoding end, if the encoding end used intra-frame encoding when encoding an image layer, the decoding end also needs to use intra-frame decoding when decoding that image layer; if the encoding end used inter-frame encoding based on a certain reference frame, the decoding end also needs to use inter-frame decoding based on that reference frame.

In this application, the decoding end can obtain the image to be decoded according to the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.

Step 304: Send feedback information to the encoding end.

The feedback information includes a second frame sequence number and a second layer sequence number; the second frame sequence number corresponds to the image to be decoded, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the image to be decoded. While processing the image to be decoded, the decoding end may send the encoding end feedback information related to the image to be decoded; as described in the foregoing embodiments, this feedback information is used to let the encoding end determine the reference frame when encoding the base layer of subsequent image frames.

The frame sequence number in the feedback information corresponds to the frame number of the image to be decoded. The layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the code stream of the base layer and the code streams of the at least one enhancement layer of the image to be decoded; or the layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from those code streams; or the layer sequence number corresponds to the currently determined image layer with the highest quality or resolution about to be decoded from those code streams. Similar to the description in step 202 above, whether the layer sequence number corresponds to successful decoding, successful reception, or upcoming decoding depends on the prior agreement or advance configuration between the encoding end and the decoding end, or on the processing capability of the decoding end, which is not repeated here.

In a possible implementation, when reception of both the code stream of the base layer and the code streams of the at least one enhancement layer fails, the decoding end may carry identification information indicating reception failure in the feedback information; or, when decoding of the code stream of the base layer and/or the code streams of the at least one enhancement layer fails, the decoding end may carry identification information indicating decoding failure in the feedback information.

In a possible implementation, when the feedback information includes the frame sequence numbers and layer sequence numbers of all image layers that were successfully decoded, are about to be decoded, or were successfully received, the decoding end may buffer the reconstructed images corresponding to all image layers of the image to be decoded; or, when the feedback information includes the frame sequence number and layer sequence number of only the image layer with the highest quality or resolution that was successfully decoded, is about to be decoded, or was successfully received, the decoding end may buffer only the reconstructed image corresponding to that image layer of the image to be decoded.
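The feedback construction of step 304 can be sketched as follows, assuming the "highest successfully decoded layer" convention; the message fields and the per-layer status map are illustrative assumptions rather than a defined wire format.

```python
def build_feedback(frame_no, layer_status):
    """Build the feedback message for one decoded frame.

    layer_status maps layer sequence number -> True if that layer's
    code stream was successfully decoded (layer 0 is the base layer).
    Returns either {"frame": ..., "layer": highest decoded layer} or,
    when no layer was usable, a failure marker for the encoding end.
    """
    decoded = [layer for layer, ok in sorted(layer_status.items()) if ok]
    if not decoded:
        # Neither the base layer nor any enhancement layer was usable.
        return {"frame": frame_no, "failed": True}
    return {"frame": frame_no, "layer": max(decoded)}
```

Under the "successfully received" or "about to decode" conventions, the same shape would be filled from the reception status or the decoder's capability estimate instead of the decoding result.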
Based on the technical solutions of the foregoing method embodiments, specific embodiments are described in detail below.
Fig. 4 shows an exemplary diagram of the image encoding and decoding process. As shown in Fig. 4, the encoder side includes processing units for encoder-side reference frame establishment, encoding, and bitstream transmission; the decoder side includes processing units for bitstream reception and feedback, decoder-side reference frame establishment, and decoding. The image encoding and decoding method provided in this application mainly involves encoder-side/decoder-side reference frame establishment, encoding/decoding, and feedback.
Fig. 5 shows an exemplary diagram of layered image encoding and decoding. As shown in Fig. 5, a source image is divided into a base layer and at least one enhancement layer (for example, enhancement layer 1 and enhancement layer 2); these image layers are encoded separately to produce multiple bitstreams (a base-layer bitstream, an enhancement-layer-1 bitstream, and an enhancement-layer-2 bitstream), which are transmitted over the network to the decoder side. The decoder side decodes the base-layer, enhancement-layer-1, and enhancement-layer-2 bitstreams layer by layer to obtain the reconstructed images corresponding to the base layer, enhancement layer 1, and enhancement layer 2. By decoding some or all of these bitstreams, the decoder side can reconstruct images of different resolutions or qualities: the more bitstreams it decodes, the higher the resolution or quality of the reconstructed image.
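As a toy numeric illustration of the layering in Fig. 5 — an assumption-laden sketch, not the codec itself: real layers carry pixel residuals, whereas here each "layer" is just a number:

```python
def encode_layers(source, num_layers):
    """Split a 'source image' (a plain number here) into equal layer contributions.

    Layer 0 plays the role of the base layer; the rest stand for enhancement layers."""
    return [source / num_layers] * num_layers

def reconstruct(layers, decoded_count):
    """Reconstruct from the first `decoded_count` layer streams only."""
    return sum(layers[:decoded_count])

layers = encode_layers(90.0, 3)
# decoding more layer streams yields a higher-fidelity reconstruction
assert reconstruct(layers, 1) < reconstruct(layers, 2) < reconstruct(layers, 3)
assert reconstruct(layers, 3) == 90.0
```

The point the sketch makes is the same as the figure's: any prefix of the layer streams reconstructs a usable image, and fidelity grows monotonically with the number of streams decoded.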
Fig. 6 shows an exemplary diagram of the encoding procedure at the encoder side. As shown in Fig. 6, the base layer of the source image is encoded by a base-layer encoder to obtain the base-layer bitstream; its inter-coding reference frame is the optimal reference frame, whose determination depends on the feedback information the transceiver receives from the decoder side, and the base-layer encoder can also reconstruct the base-layer reconstructed image. Enhancement layer 1 of the source image is encoded by an enhancement-layer-1 encoder to obtain the enhancement-layer-1 bitstream; its inter-coding reference frame is the base-layer reconstructed image, and the enhancement-layer-1 encoder can also reconstruct the enhancement-layer-1 reconstructed image. Enhancement layer 2 of the source image is encoded by an enhancement-layer-2 encoder to obtain the enhancement-layer-2 bitstream; its inter-coding reference frame is the enhancement-layer-1 reconstructed image, and the enhancement-layer-2 encoder can also reconstruct the enhancement-layer-2 reconstructed image. And so on. The base-layer, enhancement-layer-1, and enhancement-layer-2 bitstreams are sent out by the transceiver.
Fig. 7 shows an exemplary diagram of the decoding procedure at the decoder side. As shown in Fig. 7, the transceiver at the decoder side receives the base-layer, enhancement-layer-1, and enhancement-layer-2 bitstreams from the encoder side. The base-layer decoder inter-decodes the base-layer bitstream to obtain the base-layer reconstructed image, determining its reference frame from the information carried in the base-layer bitstream. The enhancement-layer-1 decoder inter-decodes the enhancement-layer-1 bitstream to obtain the enhancement-layer-1 reconstructed image, with the base-layer reconstructed image as its reference frame. The enhancement-layer-2 decoder inter-decodes the enhancement-layer-2 bitstream to obtain the enhancement-layer-2 reconstructed image, with the enhancement-layer-1 reconstructed image as its reference frame. And so on. The decoder side may store the base-layer, enhancement-layer-1, and enhancement-layer-2 reconstructed images.
Fig. 8 shows an exemplary diagram of the image encoding method of this application. As shown in Fig. 8, an image frame is divided into three sub-images (Slice0, Slice1, and Slice2), and each sub-image is divided into a base layer (BL) and multiple enhancement layers (EL0, EL1, ...), which are encoded separately.
During encoding and decoding, the optimal reference frame of the base layer is updated per Slice according to an update signal. At the encoder side, the update signal is the newly received feedback signal, i.e., the highest-quality or highest-resolution image layer that the decoder side successfully decoded, successfully received, or is about to decode. At the decoder side, the update signal is the coding reference information carried in the base-layer bitstream, i.e., the image layer of the image frame that the encoder used during encoding. If none of the image layers of an image frame were received or successfully decoded by the decoder side, the optimal reference frame of that image frame is not updated.
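A minimal sketch of this per-Slice update rule, under hypothetical names: the update signal is modeled as a mapping from slice to the highest usable layer, with `None` meaning every layer of that slice was lost.

```python
def update_optimal_refs(optimal_refs, frame_seq, update_signal):
    """Apply one frame's update signal to the per-slice optimal references.

    optimal_refs: dict slice_id -> (frame_seq, layer_seq)
    update_signal: dict slice_id -> layer_seq, or None when all layers were lost.
    """
    for slice_id, layer_seq in update_signal.items():
        if layer_seq is None:
            continue  # all layers lost: keep the previous optimal reference
        optimal_refs[slice_id] = (frame_seq, layer_seq)
    return optimal_refs

refs = {"Slice0": (1, "EL1"), "Slice1": (1, "EL0"), "Slice2": (1, "BL")}
update_optimal_refs(refs, 2, {"Slice0": "EL1", "Slice1": "EL1", "Slice2": None})
assert refs["Slice2"] == (1, "BL")  # untouched: every layer of frame 2's Slice2 was lost
assert refs["Slice1"] == (2, "EL1")
```

The same function serves both sides, since the encoder (driven by feedback) and the decoder (driven by coding reference information) apply the identical keep-or-replace rule.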
Encoder side:
1. After image frame 1 is encoded, cache the reconstructed images of all image layers of all its sub-images, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1.
2. Transmit the bitstream of each image layer of each sub-image of image frame 1, and obtain the feedback signal from the decoder side, which includes the layer sequence number of the highest-quality or highest-resolution image layer that the decoder side successfully decoded, successfully received, or is about to decode.
3. Update, into the optimal reference frame of each Slice, the reconstructed image indicated by that Slice's layer sequence number, i.e., the black image layers corresponding to image frame 1 in Fig. 8: Slice0 EL1, Slice1 EL0, Slice2 BL.
4. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 2, for inter-coding the base layer of each sub-image of image frame 2.
5. After image frame 2 is encoded, cache the reconstructed images of all image layers of all its sub-images, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1.
6. Transmit the bitstream of each image layer of each sub-image of image frame 2, and obtain the feedback signal from the decoder side, which includes the layer sequence number of the highest-quality or highest-resolution image layer that the decoder side successfully decoded, successfully received, or is about to decode.
7. Update, into the optimal reference frame of each Slice, the reconstructed image indicated by that Slice's layer sequence number, i.e., the black image layers corresponding to image frame 2 in Fig. 8: Slice0 EL1, Slice1 EL1. Because all layers of Slice2 were lost in transmission, its optimal reference frame is not updated, and the base-layer reference frame of Slice2 remains that of image frame 1, namely Slice2 BL.
8. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 3, for inter-coding the base layer of each sub-image of image frame 3.
9. After image frame 3 is encoded, cache the reconstructed images of all image layers of all its sub-images, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1.
10. Transmit the bitstream of each image layer of each sub-image of image frame 3, and obtain the feedback signal from the decoder side, which includes the layer sequence number of the highest-quality or highest-resolution image layer that the decoder side successfully decoded, successfully received, or is about to decode.
11. Update, into the optimal reference frame of each Slice, the reconstructed image indicated by that Slice's layer sequence number, i.e., the black image layers corresponding to image frame 3 in Fig. 8: Slice0 EL1, Slice2 EL1. Because all layers of Slice1 were lost in transmission, its optimal reference frame is not updated, and the base-layer reference frame of Slice1 remains that of image frame 2, namely Slice1 EL1.
12. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 4, for inter-coding the base layer of each sub-image of image frame 4.
And so on.
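The encoder-side walk-through above can be replayed as a self-contained simulation; the per-frame feedback is given as slice-to-layer mappings, with `None` for slices whose streams were all lost (names here are illustrative, not from the patent).

```python
def run_encoder_refs(feedback_per_frame):
    """Replay per-frame feedback and return the final per-slice optimal references."""
    refs = {}  # slice_id -> (frame, layer) optimal reference
    for frame, feedback in feedback_per_frame:
        for slice_id, layer in feedback.items():
            if layer is not None:  # skip slices whose streams were all lost
                refs[slice_id] = (frame, layer)
    return refs

refs = run_encoder_refs([
    (1, {"Slice0": "EL1", "Slice1": "EL0", "Slice2": "BL"}),   # steps 2-3
    (2, {"Slice0": "EL1", "Slice1": "EL1", "Slice2": None}),   # steps 6-7: Slice2 lost
    (3, {"Slice0": "EL1", "Slice1": None, "Slice2": "EL1"}),   # steps 10-11: Slice1 lost
])
# Slice1 still points at frame 2, exactly as in step 11 above
assert refs == {"Slice0": (3, "EL1"), "Slice1": (2, "EL1"), "Slice2": (3, "EL1")}
```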
Decoder side:
1. Receive the bitstream of image frame 1 and decode it.
2. After image frame 1 is decoded — Case 1: if a feedback signal is sent for every layer of image frame 1 (i.e., one feedback signal per successfully received layer bitstream, or one per successfully decoded layer bitstream, and so on), cache the reconstructed images of all image layers of all sub-images of image frame 1, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only a single feedback signal is sent for image frame 1, store only the reconstructed images of the highest-quality or highest-resolution image layers of image frame 1: Slice0 EL1, Slice1 EL0, Slice2 BL.
3. According to the coding reference information in the base-layer bitstream of image frame 1, update the base-layer reference frame of each Slice of image frame 1 into the corresponding optimal reference frame, e.g., Slice0 EL1, Slice1 EL0, Slice2 BL.
4. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 2, for inter-decoding each base layer of image frame 2.
5. Receive the bitstream of image frame 2 and decode it.
6. After image frame 2 is decoded — Case 1: if a feedback signal is sent for every layer of image frame 2 (i.e., one feedback signal per successfully received layer bitstream, or one per successfully decoded layer bitstream, and so on), cache the reconstructed images of all image layers of all sub-images of image frame 2, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only a single feedback signal is sent for image frame 2, store only the reconstructed images of the highest-quality or highest-resolution image layers of image frame 2: Slice0 EL1, Slice1 EL1. In this example, all bitstreams of Slice2 were lost.
7. According to the coding reference information in the base-layer bitstream of image frame 2, update the base-layer reference frame of each Slice of image frame 2 into the corresponding optimal reference frame, e.g., Slice0 EL1, Slice1 EL1. All bitstreams of Slice2 were lost, which has been reported to the encoder side via the feedback signal; the encoder side therefore does not update Slice2 in its optimal reference frames, and informs the decoder side via the bitstream, so the decoder side likewise does not update Slice2 in its optimal reference frames.
8. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 3, for inter-decoding each base layer of image frame 3.
9. Receive the bitstream of image frame 3 and decode it.
10. After image frame 3 is decoded — Case 1: if a feedback signal is sent for every layer of image frame 3 (i.e., one feedback signal per successfully received layer bitstream, or one per successfully decoded layer bitstream, and so on), cache the reconstructed images of all image layers of all sub-images of image frame 3, i.e., Slice0 BL, Slice0 EL0, Slice0 EL1, ..., Slice1 BL, Slice1 EL0, Slice1 EL1, ..., Slice2 BL, Slice2 EL0, Slice2 EL1. Case 2: if only a single feedback signal is sent for image frame 3, store only the reconstructed images of the highest-quality or highest-resolution image layers of image frame 3: Slice0 EL1, Slice2 EL1. In this example, all bitstreams of Slice1 were lost.
11. According to the coding reference information in the base-layer bitstream of image frame 3, update the base-layer reference frame of each Slice of image frame 3 into the corresponding optimal reference frame, e.g., Slice0 EL1, Slice2 EL1. All bitstreams of Slice1 were lost, which has been reported to the encoder side via the feedback signal; the encoder side therefore does not update Slice1 in its optimal reference frames, and informs the decoder side via the bitstream, so the decoder side likewise does not update Slice1 in its optimal reference frames.
12. The updated optimal reference frames serve as the base-layer reference frames of the corresponding sub-images of image frame 4, for inter-decoding each base layer of image frame 4.
And so on.
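The two caching cases in the decoder-side steps reduce to a simple policy, sketched below under assumed names: with per-layer feedback (Case 1) every reconstructed layer is cached, while with a single feedback per frame (Case 2) only the highest-quality layer is kept.

```python
def cache_reconstructions(recon_by_layer, per_layer_feedback):
    """Select which reconstructed layers to cache for one slice.

    recon_by_layer: dict layer_index -> reconstructed image (higher index = higher quality)
    per_layer_feedback: True for Case 1 (feedback per layer), False for Case 2.
    """
    if per_layer_feedback:
        return dict(recon_by_layer)       # Case 1: keep every layer's reconstruction
    top = max(recon_by_layer)             # Case 2: highest layer index only
    return {top: recon_by_layer[top]}

recons = {0: "BL", 1: "EL0", 2: "EL1"}
assert cache_reconstructions(recons, True) == recons
assert cache_reconstructions(recons, False) == {2: "EL1"}
```

Case 2 trades reference flexibility for memory: only one reconstruction per slice must be held, which matches the single-feedback convention where only that layer can ever be named as a reference.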
Fig. 9 is a schematic structural diagram of an embodiment of the encoding apparatus of this application. As shown in Fig. 9, the apparatus of this embodiment may include: a receiving module 901, an encoding module 902, a processing module 903, and a sending module 904. The apparatus of this embodiment may be an encoding apparatus or encoder for the encoder side.
The receiving module 901 is configured to obtain a to-be-encoded image, where the to-be-encoded image is divided into a base layer and at least one enhancement layer. The encoding module 902 is configured to: when feedback information sent by the decoder side is received, determine, as a first reference frame, the reconstructed image corresponding to the frame sequence number and layer sequence number indicated in the feedback information, and inter-code the base layer based on the first reference frame to obtain the base-layer bitstream; and encode the at least one enhancement layer separately to obtain the at least one enhancement-layer bitstream. The sending module 904 is configured to send the base-layer bitstream and the at least one enhancement-layer bitstream to the decoder side, where the base-layer bitstream carries coding reference information, and the coding reference information includes the frame sequence number and layer sequence number of the first reference frame.
In a possible implementation, the to-be-encoded image is a whole image frame or a sub-image of a whole image frame.
In a possible implementation, when the to-be-encoded image is a sub-image of the whole image frame, the feedback information further includes position information, and the position information indicates the position of the to-be-encoded sub-image within the whole image frame.
In a possible implementation, the frame sequence number indicates the n-th frame preceding the to-be-encoded image, n being a positive integer; the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully decoded from the bitstream of the n-th frame preceding the to-be-encoded image; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully received from that bitstream; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side determined is to be decoded next from that bitstream.
In a possible implementation, the encoding module 902 is further configured to: when no feedback information is received, or the feedback information includes identification information indicating a reception failure or a decoding failure, inter-code the base layer based on a third reference frame, the third reference frame being the base-layer reference frame of the frame preceding the to-be-encoded image.
In a possible implementation, the encoding module 902 is further configured to: when no feedback information is received, or the feedback information includes identification information indicating a reception failure or a decoding failure, intra-code the base layer.
In a possible implementation, the encoding module 902 is specifically configured to inter-code a first enhancement layer based on a second reference frame to obtain the first enhancement-layer bitstream, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than the quality or resolution of the first enhancement layer.
In a possible implementation, the first image layer is the image layer one level below the first enhancement layer; or the first image layer is the base layer.
In a possible implementation, the processing module 903 is configured to cache the reconstructed images corresponding to the base layer and the at least one enhancement layer respectively.
In a possible implementation, the processing module 903 is further configured to monitor for the feedback information within a set time period; if the feedback information is received within the set time period, it is determined that the feedback information has been received.
The apparatus of this embodiment may be used to execute the technical solutions of the method embodiments shown in Figs. 2 and 4-8; their implementation principles and technical effects are similar and are not repeated here.
Fig. 10 is a schematic structural diagram of an embodiment of the decoding apparatus of this application. As shown in Fig. 10, the apparatus of this embodiment may include: a receiving module 1001, a decoding module 1002, a processing module 1003, and a sending module 1004. The apparatus of this embodiment may be a decoding apparatus or decoder for the decoder side.
The receiving module 1001 is configured to receive, from the encoder side, a base-layer bitstream and at least one enhancement-layer bitstream of a to-be-decoded image, where the base-layer bitstream carries coding reference information, and the coding reference information includes a first frame sequence number and a first layer sequence number. The decoding module 1002 is configured to determine a first reference frame based on the first frame sequence number and the first layer sequence number, inter-decode the base-layer bitstream based on the first reference frame to obtain the reconstructed image corresponding to the base layer, and decode the at least one enhancement-layer bitstream separately to obtain the reconstructed images corresponding to the at least one enhancement layer respectively. The sending module 1004 is configured to send feedback information to the encoder side, where the feedback information includes a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the to-be-decoded image, and the second layer sequence number corresponds to the image layer with the highest quality or resolution.
In a possible implementation, the to-be-decoded image is a whole image frame or a sub-image of a whole image frame.
In a possible implementation, when the to-be-decoded image is a sub-image of the whole image frame, the feedback information further includes position information, and the position information indicates the position of the to-be-decoded image within the whole image frame.
In a possible implementation, that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the to-be-decoded image specifically includes: the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the base-layer bitstream and the at least one enhancement-layer bitstream of the to-be-decoded image; or the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from those bitstreams; or the second layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded next from those bitstreams.
In a possible implementation, when reception of both the base-layer bitstream and the at least one enhancement-layer bitstream fails, the feedback information includes identification information indicating the reception failure; or, when decoding of the base-layer bitstream and/or the at least one enhancement-layer bitstream fails, the feedback information includes identification information indicating the decoding failure.
In a possible implementation, the decoding module 1002 is further configured to obtain the to-be-decoded image from the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.
In a possible implementation, the decoding module 1002 is specifically configured to inter-decode a first enhancement-layer bitstream based on a second reference frame to obtain the reconstructed image corresponding to the first enhancement layer, where the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is the reconstructed image corresponding to a first image layer, and the quality or resolution of the first image layer is lower than the quality or resolution of the first enhancement layer.
In a possible implementation, the first image layer is the image layer one level below the first enhancement layer; or the first image layer is the base layer.
In a possible implementation, the processing module 1003 is configured to: when the feedback information includes the frame sequence numbers and layer sequence numbers of all successfully decoded, to-be-decoded, or successfully received image layers, cache the reconstructed images corresponding to all image layers; or, when the feedback information includes the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is to be decoded, or was successfully received, cache the reconstructed image corresponding to that layer.
In a possible implementation, the decoding module 1002 is further configured to: when the base-layer bitstream and/or the at least one enhancement-layer bitstream include coding-mode indication information, decode the corresponding image layer in the manner indicated by the coding-mode indication information, the indicated manner being intra decoding or inter decoding.
The apparatus of this embodiment may be used to execute the technical solutions of the method embodiments shown in Figs. 3-8; their implementation principles and technical effects are similar and are not repeated here.
In an implementation process, the steps of the foregoing method embodiments may be completed by integrated logic circuits of hardware in a processor or by instructions in software form. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware coding processor, or executed by a combination of hardware and software modules in a coding processor. The software module may reside in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium resides in the memory; the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The memory mentioned in the foregoing embodiments may be a volatile memory or a nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation shall not be regarded as going beyond the scope of this application.
It may be clearly understood by a person skilled in the art that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for example, the division into units is merely a division by logical function, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
When the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (43)

  1. An image encoding method, characterized by comprising:
    obtaining a to-be-encoded image, wherein the to-be-encoded image is divided into a base layer and at least one enhancement layer;
    when feedback information sent by a decoder side is received, determining, as a first reference frame, a reconstructed image corresponding to a frame sequence number and a layer sequence number indicated in the feedback information, and inter-coding the base layer based on the first reference frame to obtain a bitstream of the base layer;
    encoding the at least one enhancement layer separately to obtain a bitstream of the at least one enhancement layer; and
    sending the bitstream of the base layer and the bitstream of the at least one enhancement layer to the decoder side, wherein the bitstream of the base layer carries coding reference information, and the coding reference information comprises the frame sequence number and the layer sequence number of the first reference frame.
  2. The method according to claim 1, wherein the to-be-encoded image is a whole image frame or a sub-image of a whole image frame.
  3. The method according to claim 2, wherein when the to-be-encoded image is a sub-image of the whole image frame, the feedback information further comprises position information, and the position information indicates a position of the to-be-encoded sub-image within the whole image frame.
  4. The method according to any one of claims 1 to 3, wherein the frame sequence number indicates the n-th frame preceding the to-be-encoded image, n being a positive integer;
    the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully decoded from the bitstream of the n-th frame preceding the to-be-encoded image; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully received from the bitstream of the n-th frame preceding the to-be-encoded image; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side determined is to be decoded next from the bitstream of the n-th frame preceding the to-be-encoded image.
  5. The method according to any one of claims 1 to 4, further comprising, after the obtaining a to-be-encoded image:
    when no feedback information is received, or the feedback information comprises identification information indicating a reception failure or a decoding failure, inter-coding the base layer based on a third reference frame, wherein the third reference frame is a reference frame of the base layer of the frame preceding the to-be-encoded image.
  6. The method according to any one of claims 1 to 4, further comprising, after the obtaining a to-be-encoded image:
    when no feedback information is received, or the feedback information comprises identification information indicating a reception failure or a decoding failure, intra-coding the base layer.
  7. The method according to any one of claims 1 to 6, wherein the encoding the at least one enhancement layer separately to obtain a bitstream of the at least one enhancement layer comprises:
    inter-coding a first enhancement layer based on a second reference frame to obtain a bitstream of the first enhancement layer, wherein the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and a quality or resolution of the first image layer is lower than a quality or resolution of the first enhancement layer.
  8. The method according to claim 7, wherein the first image layer is an image layer one level below the first enhancement layer; or the first image layer is the base layer.
  9. The method according to claim 7 or 8, wherein in the process of encoding the at least one enhancement layer separately to obtain the bitstream of the at least one enhancement layer, the method further comprises:
    caching reconstructed images corresponding to the base layer and the at least one enhancement layer respectively.
  10. The method according to any one of claims 1 to 9, further comprising, before the determining, as a first reference frame when the feedback information sent by the decoder side is received, the reconstructed image corresponding to the frame sequence number and the layer sequence number indicated in the feedback information:
    monitoring for the feedback information within a set time period; and
    if the feedback information is received within the set time period, determining that the feedback information has been received.
  11. An image decoding method, characterized by comprising:
    receiving, from an encoder side, a bitstream of a base layer and a bitstream of at least one enhancement layer of a to-be-decoded image, wherein the bitstream of the base layer carries coding reference information, and the coding reference information comprises a first frame sequence number and a first layer sequence number;
    determining a first reference frame based on the first frame sequence number and the first layer sequence number, and inter-decoding the bitstream of the base layer based on the first reference frame to obtain a reconstructed image corresponding to the base layer;
    decoding the bitstream of the at least one enhancement layer separately to obtain reconstructed images corresponding to the at least one enhancement layer respectively; and
    sending feedback information to the encoder side, wherein the feedback information comprises a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the to-be-decoded image, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the to-be-decoded image.
  12. The method according to claim 11, wherein the to-be-decoded image is a whole image frame or a sub-image of a whole image frame.
  13. The method according to claim 12, wherein when the to-be-decoded image is a sub-image of the whole image frame, the feedback information further comprises position information, and the position information indicates a position of the to-be-decoded image within the whole image frame.
  14. The method according to any one of claims 11 to 13, wherein that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the to-be-decoded image specifically comprises:
    the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image; or
    the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image; or
    the second layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded next from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image.
  15. The method according to any one of claims 11 to 14, further comprising:
    when reception of both the bitstream of the base layer and the bitstream of the at least one enhancement layer fails, the feedback information comprises identification information indicating the reception failure; or
    when decoding of the bitstream of the base layer and/or the bitstream of the at least one enhancement layer fails, the feedback information comprises identification information indicating the decoding failure.
  16. The method according to any one of claims 11 to 15, further comprising, after the sending feedback information to the encoder side:
    obtaining the to-be-decoded image from the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.
  17. The method according to any one of claims 11 to 16, wherein the decoding the bitstream of the at least one enhancement layer separately to obtain reconstructed images corresponding to the at least one enhancement layer respectively comprises:
    inter-decoding a bitstream of a first enhancement layer based on a second reference frame to obtain a reconstructed image corresponding to the first enhancement layer, wherein the first enhancement layer is any one of the at least one enhancement layer, the second reference frame is a reconstructed image corresponding to a first image layer, and a quality or resolution of the first image layer is lower than a quality or resolution of the first enhancement layer.
  18. The method according to claim 17, wherein the first image layer is an image layer one level below the first enhancement layer; or the first image layer is the base layer.
  19. The method according to claim 14, wherein when the feedback information comprises frame sequence numbers and layer sequence numbers of all successfully decoded, to-be-decoded, or successfully received image layers, reconstructed images corresponding to all the image layers are cached; or
    when the feedback information comprises the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is to be decoded, or was successfully received, the reconstructed image corresponding to that image layer is cached.
  20. The method according to any one of claims 11 to 19, further comprising, after the receiving, from the encoder side, the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image:
    when the bitstream of the base layer and/or the bitstream of the at least one enhancement layer comprise coding-mode indication information, decoding the corresponding image layer in a manner indicated by the coding-mode indication information, the indicated manner comprising intra decoding or inter decoding.
  21. An encoding apparatus, characterized by comprising:
    a receiving module, configured to obtain a to-be-encoded image, wherein the to-be-encoded image is divided into a base layer and at least one enhancement layer;
    an encoding module, configured to: when feedback information sent by a decoder side is received, determine, as a first reference frame, a reconstructed image corresponding to a frame sequence number and a layer sequence number indicated in the feedback information, inter-code the base layer based on the first reference frame to obtain a bitstream of the base layer, and encode the at least one enhancement layer separately to obtain a bitstream of the at least one enhancement layer; and
    a sending module, configured to send the bitstream of the base layer and the bitstream of the at least one enhancement layer to the decoder side, wherein the bitstream of the base layer carries coding reference information, and the coding reference information comprises the frame sequence number and the layer sequence number of the first reference frame.
  22. The apparatus according to claim 21, wherein the to-be-encoded image is a whole image frame or a sub-image of a whole image frame.
  23. The apparatus according to claim 22, wherein when the to-be-encoded image is a sub-image of the whole image frame, the feedback information further comprises position information, and the position information indicates a position of the to-be-encoded sub-image within the whole image frame.
  24. The apparatus according to any one of claims 21 to 23, wherein the frame sequence number indicates the n-th frame preceding the to-be-encoded image, n being a positive integer; the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully decoded from the bitstream of the n-th frame preceding the to-be-encoded image; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side successfully received from the bitstream of the n-th frame preceding the to-be-encoded image; or the layer sequence number corresponds to the image layer with the highest quality or resolution that the decoder side determined is to be decoded next from the bitstream of the n-th frame preceding the to-be-encoded image.
  25. The apparatus according to any one of claims 21 to 24, wherein the encoding module is further configured to: when no feedback information is received, or the feedback information comprises identification information indicating a reception failure or a decoding failure, inter-code the base layer based on a third reference frame, the third reference frame being a reference frame of the base layer of the frame preceding the to-be-encoded image.
  26. The apparatus according to any one of claims 21 to 24, wherein the encoding module is further configured to: when no feedback information is received, or the feedback information comprises identification information indicating a reception failure or a decoding failure, intra-code the base layer.
  27. The apparatus according to any one of claims 21 to 26, wherein the encoding module is specifically configured to inter-code a first enhancement layer based on a second reference frame to obtain a bitstream of the first enhancement layer, the first enhancement layer being any one of the at least one enhancement layer, the second reference frame being a reconstructed image corresponding to a first image layer, and a quality or resolution of the first image layer being lower than a quality or resolution of the first enhancement layer.
  28. The apparatus according to claim 27, wherein the first image layer is an image layer one level below the first enhancement layer; or the first image layer is the base layer.
  29. The apparatus according to claim 27 or 28, further comprising:
    a processing module, configured to cache reconstructed images corresponding to the base layer and the at least one enhancement layer respectively.
  30. The apparatus according to claim 29, wherein the processing module is further configured to monitor for the feedback information within a set time period, and, if the feedback information is received within the set time period, determine that the feedback information has been received.
  31. A decoding apparatus, characterized by comprising:
    a receiving module, configured to receive, from an encoder side, a bitstream of a base layer and a bitstream of at least one enhancement layer of a to-be-decoded image, wherein the bitstream of the base layer carries coding reference information, and the coding reference information comprises a first frame sequence number and a first layer sequence number;
    a decoding module, configured to determine a first reference frame based on the first frame sequence number and the first layer sequence number, inter-decode the bitstream of the base layer based on the first reference frame to obtain a reconstructed image corresponding to the base layer, and decode the bitstream of the at least one enhancement layer separately to obtain reconstructed images corresponding to the at least one enhancement layer respectively; and
    a sending module, configured to send feedback information to the encoder side, wherein the feedback information comprises a second frame sequence number and a second layer sequence number, the second frame sequence number corresponds to the to-be-decoded image, and the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the to-be-decoded image.
  32. The apparatus according to claim 31, wherein the to-be-decoded image is a whole image frame or a sub-image of a whole image frame.
  33. The apparatus according to claim 32, wherein when the to-be-decoded image is a sub-image of the whole image frame, the feedback information further comprises position information, and the position information indicates a position of the to-be-decoded image within the whole image frame.
  34. The apparatus according to any one of claims 31 to 33, wherein that the second layer sequence number corresponds to the image layer with the highest quality or resolution among the base layer and the at least one enhancement layer of the to-be-decoded image specifically comprises:
    the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully decoded from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image; or
    the second layer sequence number corresponds to the image layer with the highest quality or resolution successfully received from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image; or
    the second layer sequence number corresponds to the image layer with the highest quality or resolution currently determined to be decoded next from the bitstream of the base layer and the bitstream of the at least one enhancement layer of the to-be-decoded image.
  35. The apparatus according to any one of claims 31 to 34, wherein when reception of both the bitstream of the base layer and the bitstream of the at least one enhancement layer fails, the feedback information comprises identification information indicating the reception failure; or, when decoding of the bitstream of the base layer and/or the bitstream of the at least one enhancement layer fails, the feedback information comprises identification information indicating the decoding failure.
  36. The apparatus according to any one of claims 31 to 35, wherein the decoding module is further configured to obtain the to-be-decoded image from the reconstructed image corresponding to the base layer and the reconstructed images corresponding to the at least one enhancement layer.
  37. The apparatus according to any one of claims 31 to 36, wherein the decoding module is specifically configured to inter-decode a bitstream of a first enhancement layer based on a second reference frame to obtain a reconstructed image corresponding to the first enhancement layer, the first enhancement layer being any one of the at least one enhancement layer, the second reference frame being a reconstructed image corresponding to a first image layer, and a quality or resolution of the first image layer being lower than a quality or resolution of the first enhancement layer.
  38. The apparatus according to claim 37, wherein the first image layer is an image layer one level below the first enhancement layer; or the first image layer is the base layer.
  39. The apparatus according to claim 34, further comprising:
    a processing module, configured to: when the feedback information comprises frame sequence numbers and layer sequence numbers of all successfully decoded, to-be-decoded, or successfully received image layers, cache reconstructed images corresponding to all the image layers; or, when the feedback information comprises the frame sequence number and layer sequence number of the image layer with the highest quality or resolution that was successfully decoded, is to be decoded, or was successfully received, cache the reconstructed image corresponding to that image layer.
  40. The apparatus according to any one of claims 31 to 39, wherein the decoding module is further configured to: when the bitstream of the base layer and/or the bitstream of the at least one enhancement layer comprise coding-mode indication information, decode the corresponding image layer in a manner indicated by the coding-mode indication information, the indicated manner comprising intra decoding or inter decoding.
  41. An encoder, characterized by comprising:
    a processor and a transmission interface;
    wherein the processor is configured to invoke program instructions stored in a memory to implement the method according to any one of claims 1 to 10.
  42. A decoder, characterized by comprising:
    a processor and a transmission interface;
    wherein the processor is configured to invoke program instructions stored in a memory to implement the method according to any one of claims 11 to 20.
  43. A computer-readable storage medium, characterized by comprising a computer program, wherein when the computer program is executed on a computer or a processor, the computer or the processor is caused to perform the method according to any one of claims 1 to 10 or 11 to 20.
PCT/CN2020/092408 2020-05-26 2020-05-26 Image encoding and decoding method and apparatus WO2021237475A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2020/092408 WO2021237475A1 (zh) 2020-05-26 2020-05-26 Image encoding and decoding method and apparatus
CN202080101374.6A CN115699745A (zh) 2020-05-26 2020-05-26 Image encoding and decoding method and apparatus
EP20937790.2A EP4156686A4 (en) 2020-05-26 2020-05-26 IMAGE ENCODING/DECODING METHOD AND APPARATUS
US17/993,533 US20230103928A1 (en) 2020-05-26 2022-11-23 Image Encoding and Decoding Method and Apparatus


Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/993,533 Continuation US20230103928A1 (en) 2020-05-26 2022-11-23 Image Encoding and Decoding Method and Apparatus

Publications (1)

Publication Number Publication Date
WO2021237475A1 true WO2021237475A1 (zh) 2021-12-02

Family

ID=78745241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092408 WO2021237475A1 (zh) 2020-05-26 2020-05-26 Image encoding and decoding method and apparatus

Country Status (4)

Country Link
US (1) US20230103928A1 (zh)
EP (1) EP4156686A4 (zh)
CN (1) CN115699745A (zh)
WO (1) WO2021237475A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101360243A (zh) * 2008-09-24 2009-02-04 Tencent Technology (Shenzhen) Co., Ltd. Video communication system and method based on feedback reference frames
US20140086328A1 (en) * 2012-09-25 2014-03-27 Qualcomm Incorporated Scalable video coding in hevc
CN104469369A (zh) * 2014-11-17 2015-03-25 He Zhenyu Method for improving SVC performance by using decoder-side information
CN106101709A (zh) * 2016-07-08 2016-11-09 Shanghai University SHVC quality-scalable base-layer inter prediction method jointly using enhancement layers
CN110087069A (zh) * 2013-07-12 2019-08-02 Canon Inc. Image encoding device and method, image decoding device and method, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130223524A1 (en) * 2012-02-29 2013-08-29 Microsoft Corporation Dynamic insertion of synchronization predicted video frames
WO2013158293A1 (en) * 2012-04-19 2013-10-24 Vid Scale, Inc. System and method for error-resilient video coding


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP4156686A4 *
WANG, ZHONGGANG: "An Inter-Frame Algorithm Research Based on SHVC", MICROCOMPUTER & ITS APPLICATIONS, vol. 36, no. 21, 31 December 2017 (2017-12-31), pages 35 - 38, XP055872037 *

Also Published As

Publication number Publication date
CN115699745A (zh) 2023-02-03
EP4156686A4 (en) 2023-04-12
US20230103928A1 (en) 2023-04-06
EP4156686A1 (en) 2023-03-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937790

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020937790

Country of ref document: EP

Effective date: 20221223