WO2022262602A1 - Video coding and decoding method and apparatus - Google Patents

Publication number
WO2022262602A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
video
encoded
video frame
frames
Application number
PCT/CN2022/097097
Other languages
French (fr)
Chinese (zh)
Inventor
要瑞宵
张凯明
Original Assignee
百果园技术(新加坡)有限公司
要瑞宵
Application filed by 百果园技术(新加坡)有限公司, 要瑞宵
Publication of WO2022262602A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a picture, frame or field
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H04N19/65: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience

Definitions

  • the present application relates to the field of computer technology, for example, to a video encoding and decoding method and device.
  • Before sending a video stream, the video sending end usually encodes the multiple video frames in the stream to obtain an encoded video stream, and then sends the encoded video stream to the video receiving end.
  • the encoded video stream mainly includes two types of video frames: intra-frame coding frames (also called Intra frames, I frames) and inter-frame predictive coding frames (also called Inter frames, P frames) arranged at intervals.
  • An I frame can be decoded independently: when the video receiving end decodes an I frame, it obtains the decoded video frame without referring to any other frame data.
  • A P frame cannot be decoded independently: when the video receiving end decodes a P frame, it relies on the decoding of the previous video frame. The correct decoding of a P frame therefore depends on the correct decoding of its previous frame.
  • the present application provides a video encoding and decoding method and device, electronic equipment, and a storage medium.
  • This application provides a video encoding and decoding method, which is applied to the encoding end, including:
  • the reference frame set includes at least one video frame corresponding to an encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame which the decoding end can successfully decode is a reliable frame;
  • according to the frame loss rate and the video encoding rule, the target reference frame corresponding to the video frame to be encoded is determined from the reference frame set, where the video encoding rule includes: the larger the frame loss rate, the larger the number of video frames to be encoded whose target reference frame is a reliable frame.
  • This application provides a video encoding and decoding method, which is applied to the decoding end, including:
  • the set of decoded frames includes at least one decoded video frame, and the number of video frames included in the set of decoded frames is greater than or equal to the number of video frames included in the set of reference frames at the decoding end;
  • the set of decoded frames includes a target reference frame corresponding to the encoded video frame
  • This application provides a video codec device, which is applied to the encoding end, including:
  • an acquisition module, configured to acquire a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set, where the reference frame set includes at least one video frame corresponding to an encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame which the decoding end can successfully decode is a reliable frame;
  • a determination module, configured to determine the target reference frame corresponding to the video frame to be encoded from the reference frame set according to the frame loss rate and a video encoding rule, where the video encoding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is a reliable frame; and further configured to determine the distance in display timing between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded;
  • an encoding module, configured to encode the video frame to be encoded by using the target reference frame to obtain an encoded video frame;
  • a sending module, configured to send the encoded video frame and the reference distance to the decoding end.
  • This application provides a video codec device, which is applied to the decoding end, including:
  • the receiving module is configured to receive the encoded video frame and the reference distance sent by the encoding end according to the above-mentioned video encoding and decoding method;
  • an acquisition module, configured to acquire a set of decoded frames, where the set of decoded frames includes at least one decoded video frame, and the number of video frames included in the set of decoded frames is greater than or equal to the number of video frames included in the set of reference frames at the decoding end;
  • the determining module is configured to obtain the target reference frame corresponding to the encoded video frame when determining that the set of decoded frames includes the target reference frame corresponding to the encoded video frame according to the reference distance;
  • the decoding module is configured to use the target reference frame to decode the coded video frame to obtain a decoded video frame.
  • The present application provides an electronic device, including a processor, a memory, and a computer program stored on the memory and operable on the processor. When the computer program is executed by the processor, the above-mentioned video encoding and decoding method is implemented.
  • the present application provides a computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the above video encoding and decoding method is implemented.
  • FIG. 1 is a schematic structural diagram of a video processing system provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of another video processing system provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an IPPP frame structure provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of another IPPP frame structure provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a temporally scalable video frame structure encoded with scalable video coding (Scalable Video Coding, SVC), provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a video encoding and decoding method provided by an embodiment of the present application.
  • FIG. 7 is a flow chart of another video encoding and decoding method provided by an embodiment of the present application.
  • FIG. 8 is a flow chart of another video encoding and decoding method provided by an embodiment of the present application.
  • FIG. 9 is a flow chart of another video encoding and decoding method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a set of reference frames provided by an embodiment of the present application.
  • Fig. 11 is a schematic diagram of the principle of an encoding sub-rule provided by the embodiment of the present application.
  • Fig. 12 is a schematic diagram of the principle of another coding sub-rule provided by the embodiment of the present application.
  • Fig. 13 is a schematic diagram of the principle of another coding sub-rule provided by the embodiment of the present application.
  • FIG. 14 is a schematic diagram of the principle of acquiring a target reference frame provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of another method for obtaining a target reference frame provided by an embodiment of the present application.
  • Fig. 16 is a schematic diagram of a video frame reference relationship provided by an embodiment of the present application.
  • Fig. 17 is a block diagram of a video encoding and decoding device provided by an embodiment of the present application.
  • FIG. 18 is a block diagram of another video codec device provided by an embodiment of the present application.
  • Fig. 19 is a block diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 shows a schematic structural diagram of a video processing system provided by an embodiment of the present application.
  • the video processing system is the implementation environment involved in the video codec method.
  • The video processing system may include an encoding end 101 and at least one decoding end 102.
  • In FIG. 1, one decoding end 102 is taken as an example for illustration.
  • the encoding end 101 and the decoding end 102 may be connected through a wired network or a wireless network. Both the encoding end 101 and the decoding end 102 may be located on the electronic device.
  • the electronic device may be a mobile terminal, and the mobile terminal may be a mobile phone, a computer, a multimedia player, an electronic reader, a wearable device, and the like.
  • the encoding end and the decoding end can realize their functions through the operating system of the electronic device, or realize their functions through the client installed on the electronic device.
  • The encoding end 101 includes a feedback sorting module 1011, an encoding module 1012, a sending module 1013 and a decoding module 1014.
  • The encoding module 1012 is connected to the feedback sorting module 1011 and to the sending module 1013.
  • The decoding end 102 includes a feedback sorting module 1021, an encoding module 1022, a sending module 1023 and a decoding module 1024.
  • The encoding module 1022 is connected to the feedback sorting module 1021 and to the sending module 1023.
  • the feedback sorting module is configured to sort out the decoded feedback information sent by the opposite end.
  • the coding module is configured to generate coded video frames based on the video to be coded to form a video stream.
  • Each sending module is configured to send encoded video frames to the decoding module at the opposite end.
  • the decoding module is configured to decode the received coded video frame to obtain the decoded video frame.
  • the video encoding and decoding method provided can be applied to a real-time communication (Real-Time Communication, RTC) scenario.
  • the real-time communication scene may include a video communication scene, a live broadcast scene, and the like.
  • In a live broadcast scene, the encoding end is located at the anchor terminal, where the anchor user performs the live video broadcast.
  • The anchor terminal applies the video encoding method to the video to be encoded, generates a video stream corresponding to a video with a certain definition, and sends the generated video stream to the audience terminal.
  • the audience terminal refers to the terminal of the user who watches the live video of the anchor user.
  • the decoding end is located at the audience terminal, and can decode the coded video frames in the video stream to obtain the decoded video frames, thereby obtaining a video with a certain definition.
  • When a video frame is encoded, an intra-frame predictive encoding mode or an inter-frame predictive encoding mode may be used.
  • When the intra-frame predictive encoding mode is used, no other video frame is needed, and an I frame is generated.
  • When the inter-frame predictive encoding mode is used, the previous video frame in display order can be used as a reference frame to generate a P frame.
  • Inter-frame predictive coding is forward predictive coding.
  • FIG. 3 shows the frame reference relationship of multiple video frames under the IPPP frame structure.
  • The video frames are arranged as: I frame, P frame, P frame, ..., P frame, I frame, P frame, P frame, ..., P frame, and so on.
  • In FIG. 3, I represents an I frame and P represents a P frame.
  • The arrows between the video frames in FIG. 3 point to the reference frame of each video frame.
  • The reference frame of each P frame is only its previous frame.
  • FIG. 4 shows that, under the frame structure shown in FIG. 3, if frame X is lost, none of the P frames that follow it before the next I frame (the P-failure frame set marked in FIG. 4) can be decoded correctly. In this case, even if a poor network condition causes only a few frames to be lost, many frames cannot be decoded correctly at the decoding end, which causes video playback at the decoding end to freeze.
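  • The loss propagation described for FIG. 4 can be illustrated with a minimal sketch. The function name `decodable_frames` and the string encoding of the frame structure are assumptions made for this illustration and do not come from the application.

```python
def decodable_frames(frame_types, lost):
    """Return one boolean per frame: True if that frame can be decoded.

    frame_types: string in display order, e.g. "IPPPPIPP" (illustrative input).
    lost: set of 0-based indices of frames lost in transmission.
    """
    ok = []
    for i, frame_type in enumerate(frame_types):
        if i in lost:
            ok.append(False)                   # the frame never arrived
        elif frame_type == "I":
            ok.append(True)                    # I frames decode independently
        else:
            ok.append(i > 0 and ok[i - 1])     # a P frame needs its previous frame
    return ok


# Losing the P frame at index 2 also breaks indices 3 and 4;
# decoding recovers at the next I frame (index 5).
print(decodable_frames("IPPPPIPP", lost={2}))
```

  • In this sketch a single lost frame makes three frames undecodable, which is the behaviour the P-failure frame set in FIG. 4 describes.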
  • SVC is a mainstream video coding technique, standardized, for example, as an extension of H.264.
  • SVC adopts a hierarchical prediction structure and can be divided into three types: temporal scalability SVC, spatial scalability SVC, and quality scalability SVC.
  • FIG. 5 shows a schematic diagram of a time-domain scalable SVC encoded video frame structure.
  • the video frames included in the video to be encoded can be divided into a base layer (Layer0) and an enhancement layer (Layer1) in the temporal domain.
  • The video frames of the base layer may be encoded with intra-frame predictive coding or inter-frame predictive coding to obtain a periodically arranged IPPP coded frame structure. That is, in FIG. 5, the encoded video frames corresponding to the base-layer video frames consist of an I frame followed by multiple P frames, repeated periodically.
  • A video frame of the enhancement layer (the higher layer) can be encoded by using a video frame of the base layer as its reference frame to obtain an encoded video frame.
  • In FIG. 5, p represents the encoded video frame obtained by encoding an enhancement-layer video frame with a base-layer video frame as its reference frame. The arrows between the video frames in FIG. 5 point to the reference frame of each video frame.
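  • The two-layer temporal structure can be sketched as follows. The sketch assumes 1-based frame numbers with odd frames in the base layer and even frames in the enhancement layer (the split used in a later example of this description), and it ignores the periodic I frames of the base layer; the function names are illustrative only.

```python
def temporal_layer(frame_number):
    """Layer of a 1-based frame number: 0 = base layer, 1 = enhancement layer."""
    return 0 if frame_number % 2 == 1 else 1


def default_reference(frame_number):
    """Frame number of the default reference frame, or None for the first frame.

    Assumption: every frame refers to the nearest earlier base-layer frame.
    """
    if frame_number == 1:
        return None                    # first frame is intra coded
    if temporal_layer(frame_number) == 1:
        return frame_number - 1        # enhancement frame -> preceding base frame
    return frame_number - 2            # base frame -> previous base frame


for n in range(1, 9):
    print(n, temporal_layer(n), default_reference(n))
```

  • Because no frame refers to an enhancement-layer frame in this structure, losing an enhancement-layer frame does not affect the decoding of any other frame.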
  • FIG. 6 shows a flowchart of a video encoding and decoding method provided by an embodiment of the present application.
  • the video encoding and decoding method is applied to the encoding end shown in Fig. 1 and Fig. 2 .
  • the video encoding and decoding methods include:
  • Step 601. Obtain the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set.
  • The reference frame set includes at least one video frame corresponding to an encoded video frame.
  • In the reference frame set, a video frame corresponding to an encoded video frame that the decoding end can successfully decode is a reliable frame.
  • the video encoding method may be applied to an entire video to be encoded, or may also be applied to an encoding cycle of the video to be encoded.
  • The video to be encoded may include multiple encoding cycles, and each encoding cycle may include multiple video frames. If the video encoding method is applied to an entire video to be encoded, the first video frame to be encoded is the one that comes first in display order among the video frames to be encoded in that video.
  • The non-first video frames to be encoded are then all video frames to be encoded in that video other than the first one.
  • If the video encoding method is applied to an encoding cycle, the first video frame to be encoded is the one that comes first in display order among the video frames to be encoded in that encoding cycle.
  • The non-first video frames to be encoded are then all video frames to be encoded in that encoding cycle other than the first one.
  • The transmission frame loss rate between the encoding end and the decoding end is the ratio of the number of video frames not received by the decoding end to the number of video frames transmitted from the encoding end to the decoding end within a set time period.
  • The video frames not received by the decoding end are the video frames lost in transmission.
  • the encoded video frame refers to a video frame obtained after encoding the video frame to be encoded.
  • The video frame corresponding to an encoded video frame is either the video frame before encoding or the reconstructed frame obtained by reconstructing the encoded video frame.
  • The encoding end may receive, from the decoding end, the number of encoded video frames that the decoding end received within each set time period.
  • The number of encoded video frames not received by the decoding end is the difference between the number of encoded video frames transmitted by the encoding end within the set time period and the number received by the decoding end within the same period. The ratio of this number to the number of encoded video frames transmitted from the encoding end to the decoding end is used as the transmission frame loss rate between the encoding end and the decoding end.
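  • The frame loss rate calculation above amounts to a single division. The following minimal sketch uses hypothetical names (`frame_loss_rate`, `sent`, `received`); how a window with no transmitted frames should be handled is not specified in the description, so returning 0.0 in that case is an assumption.

```python
def frame_loss_rate(sent, received):
    """Transmission frame loss rate over one set time period.

    sent: encoded video frames transmitted by the encoding end in the period.
    received: encoded video frames the decoding end reported for the period.
    """
    if sent <= 0:
        return 0.0                      # assumption: nothing sent means no loss
    lost = max(sent - received, 0)      # frames not received by the decoding end
    return lost / sent


print(frame_loss_rate(sent=100, received=90))   # 10 of 100 frames were lost
```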
  • The reference frame set includes at least one video frame corresponding to an encoded video frame. That is, the reference frame set includes the video frames corresponding to the encoded video frames already obtained by encoding the video to be encoded. Therefore, after encoding a video frame to be encoded to obtain an encoded video frame, the encoding end can store the video frame corresponding to that encoded video frame to build up the reference frame set. For example, when the encoding end acquires the first non-first video frame to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame, which is obtained by encoding the first video frame to be encoded.
  • When the encoding end acquires the second non-first video frame to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame and the video frame corresponding to the second encoded video frame.
  • The second encoded video frame is obtained by encoding the first non-first video frame to be encoded.
  • The reference frame set includes reliable frames. A reliable frame is a video frame corresponding to an encoded video frame that can be successfully decoded by the decoding end.
  • A successfully decodable encoded video frame may be an encoded video frame that has already been successfully decoded.
  • It may also be an encoded video frame that has been successfully received, for example a successfully received I frame.
  • It may also be an encoded video frame that has been successfully received and whose reference frame has also been successfully received.
  • a reference frame refers to a frame that needs to be referred to when encoding a video frame.
  • The reference frame set is also known as the first decoded picture buffer (Decoded Picture Buffer, DPB). It may include the video frames corresponding to a first target number of encoded video frames, that is, a first target number of video frames, and the first target number may be greater than 1.
  • the number of video frames that can be included in the reference frame set is related to the number of short-term reference frames or the number of long-term reference frames set by the encoder.
  • the first target number of video frames included in the reference frame set may be 8, 16, or 32, and so on.
  • the encoding end may include a reconstructed frame buffer, and the reconstructed frame buffer may be used to store the reference frame set.
  • the video frames included in the reference frame set may be reference frames of most video frames to be encoded.
  • the video frames included in the reference frame set may be video frames corresponding to any coded video frame.
  • If the video to be encoded uses temporally scalable SVC encoding, the reference frame set can only include video frames corresponding to encoded video frames obtained by encoding base-layer video frames.
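  • A reference frame set with a fixed capacity and a reliable-frame flag might be organised as in the sketch below. The class and method names are hypothetical, and the eviction policy (drop the oldest frame when the buffer is full) is an assumption that the description does not specify.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class RefFrame:
    frame_number: int
    reliable: bool = False      # True once the decoding end is known to decode it


class ReferenceFrameSet:
    """Bounded buffer of video frames corresponding to encoded video frames."""

    def __init__(self, capacity=8):
        self.capacity = capacity            # e.g. 8, 16 or 32
        self.frames = OrderedDict()         # frame number -> RefFrame, oldest first

    def add(self, frame_number):
        self.frames[frame_number] = RefFrame(frame_number)
        while len(self.frames) > self.capacity:
            self.frames.popitem(last=False)  # assumption: evict the oldest frame

    def mark_reliable(self, frame_number):
        if frame_number in self.frames:
            self.frames[frame_number].reliable = True

    def latest_reliable(self):
        """Frame number of the newest reliable frame, or None if there is none."""
        for frame in reversed(self.frames.values()):
            if frame.reliable:
                return frame.frame_number
        return None


refs = ReferenceFrameSet(capacity=2)
refs.add(1)
refs.add(2)
refs.mark_reliable(2)
refs.add(3)                                  # frame 1 is evicted
print(list(refs.frames), refs.latest_reliable())
```

  • A production encoder would more likely protect reliable frames from eviction, for instance by treating them as long-term reference frames; the sketch leaves that out for brevity.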
  • Step 602 Determine the target reference frame corresponding to the video frame to be encoded from the reference frame set according to the frame loss rate and the video encoding rule.
  • The video encoding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is a reliable frame.
  • the multiple video frames to be encoded included in the video to be encoded may have frame numbers in order of display timing.
  • The video encoding rule may include the following. When the frame loss rate is less than or equal to the first quantity threshold, the target reference frame of every video frame to be encoded in the video to be encoded may be the video frame closest to it in display order, that is, the video frame whose frame number is closest to that of the video frame to be encoded.
  • When the frame loss rate is greater than the first quantity threshold, the target reference frames corresponding to a set number of video frames to be encoded at each interval may be reliable frames.
  • The target reference frames corresponding to the remaining video frames to be encoded may each be the video frame closest in display order to the video frame to be encoded, that is, the video frame whose frame number is closest to that of the video frame to be encoded.
  • For example, the video to be encoded includes five video frames, the first video frame to the fifth video frame.
  • The first and second video frames in display order have been encoded, and the reference frame set includes video frame A, corresponding to the encoded video frame obtained by encoding the first video frame, and video frame B, corresponding to the encoded video frame obtained by encoding the second video frame. Video frame B is a reliable frame.
  • The first quantity threshold may be 0. When the frame loss rate is 0, the target reference frame corresponding to the third video frame is video frame B.
  • The target reference frame corresponding to the fourth video frame is the video frame corresponding to the encoded video frame obtained from the third video frame.
  • The target reference frame corresponding to the fifth video frame is the video frame corresponding to the encoded video frame obtained from the fourth video frame.
  • When the frame loss rate is greater than 0, the target reference frame corresponding to the third video frame is video frame B.
  • The target reference frame corresponding to the fourth video frame is the video frame corresponding to the encoded video frame obtained from the third video frame.
  • The target reference frame corresponding to the fifth video frame is video frame B.
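  • The five-frame example can be reproduced with a short sketch. The function name `plan_references`, the `interval` parameter, and the choice that every `interval`-th frame (counted from the first frame to be encoded) uses the reliable frame are assumptions; the description only states that more frames use reliable frames as the loss rate grows.

```python
def plan_references(frames_to_encode, loss_rate, reliable_frame,
                    threshold=0.0, interval=2):
    """Map each frame number to the frame number of its target reference frame.

    frames_to_encode: frame numbers in display order, e.g. [3, 4, 5].
    reliable_frame: frame number of the reliable frame (video frame B = 2 here).
    interval: hypothetical spacing between frames that use the reliable frame.
    """
    plan = {}
    for position, frame in enumerate(frames_to_encode):
        nearest = frame - 1                                # closest frame in display order
        use_reliable = loss_rate > threshold and position % interval == 0
        plan[frame] = reliable_frame if use_reliable else nearest
    return plan


print(plan_references([3, 4, 5], loss_rate=0.0, reliable_frame=2))
print(plan_references([3, 4, 5], loss_rate=0.2, reliable_frame=2))
```

  • With a loss rate of 0 the plan is a plain chain (3 to 2, 4 to 3, 5 to 4); with a non-zero loss rate the fifth frame refers back to reliable frame 2, so losing the third or fourth frame no longer prevents the fifth frame from being decoded. Shrinking `interval` as the loss rate grows would be one way to realise "the larger the frame loss rate, the more frames use reliable frames".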
  • When temporally scalable SVC encoding is used, the video encoding rule applies to all non-first video frames to be encoded in the base layer.
  • For example, the video to be encoded includes eight video frames, the first video frame to the eighth video frame; the odd-numbered video frames are in the base layer and the even-numbered video frames are in the enhancement layer.
  • The first, second and third video frames in display order have all been encoded, and the reference frame set includes video frame A, corresponding to the encoded video frame obtained by encoding the first video frame, and video frame C, corresponding to the encoded video frame obtained by encoding the third video frame. Video frame C is a reliable frame.
  • The first quantity threshold may be 0. When the frame loss rate is 0, the target reference frame corresponding to the fourth video frame is video frame C.
  • The target reference frame corresponding to the fifth video frame is video frame C.
  • The target reference frame corresponding to the sixth video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame.
  • The target reference frame corresponding to the seventh video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame.
  • The target reference frame corresponding to the eighth video frame is the video frame corresponding to the encoded video frame obtained from the seventh video frame.
  • When the frame loss rate is greater than 0, the target reference frame corresponding to the fourth video frame is video frame C.
  • The target reference frame corresponding to the fifth video frame is video frame C.
  • The target reference frame corresponding to the sixth video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame.
  • The target reference frame corresponding to the seventh video frame is video frame C.
  • The target reference frame corresponding to the eighth video frame is the video frame corresponding to the encoded video frame obtained from the seventh video frame.
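  • The eight-frame SVC example can be reproduced in the same way. In this sketch, which uses the hypothetical name `plan_svc_references`, every base-layer frame refers to the reliable frame once the loss rate exceeds the threshold; the example above is consistent with that choice, but the description does not fix how many base-layer frames do so.

```python
def plan_svc_references(frames_to_encode, loss_rate, reliable_frame, threshold=0.0):
    """Map each frame number to its target reference frame (two temporal layers).

    Odd frame numbers are base-layer frames, even ones enhancement-layer frames.
    """
    plan = {}
    for frame in frames_to_encode:
        if frame % 2 == 0:
            plan[frame] = frame - 1          # enhancement frame -> preceding base frame
        elif loss_rate > threshold:
            plan[frame] = reliable_frame     # base frame -> reliable frame (assumed for all)
        else:
            plan[frame] = frame - 2          # base frame -> previous base frame
    return plan


print(plan_svc_references(range(4, 9), loss_rate=0.0, reliable_frame=3))
print(plan_svc_references(range(4, 9), loss_rate=0.1, reliable_frame=3))
```

  • Only the seventh frame changes between the two plans: it refers to video frame C (frame 3) instead of frame 5 when frames are being lost.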
  • Step 603 Determine the distance between the video frame to be encoded and the target reference frame in display timing as the reference distance of the video frame to be encoded.
  • the encoding end may determine the number of video frames between the video frame to be encoded and the target reference frame as the reference distance (RefDelta) of the video frame to be encoded.
  • The encoding end may determine the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame as the reference distance of the video frame to be encoded.
  • For example, the frame number of the video frame to be encoded is 3, that is, it is the third video frame displayed.
  • The frame number of the target reference frame is 1.
  • The reference distance is then 2.
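  • As a worked example of the frame-number difference, the following sketch computes the reference distance (RefDelta). The function name and the check that the reference frame precedes the current frame (which follows from forward predictive coding) are illustrative choices.

```python
def reference_distance(frame_number, target_reference_number):
    """Reference distance: difference between the two frame numbers."""
    if target_reference_number >= frame_number:
        raise ValueError("the target reference frame must precede the frame to be encoded")
    return frame_number - target_reference_number


print(reference_distance(3, 1))   # frame 3 referring to frame 1 gives a distance of 2
```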
  • Step 604 Encode the video frame to be encoded by using the target reference frame to obtain the encoded video frame.
  • the coding end may use the target reference frame to obtain the coded video frame by using predictive coding on the video frame to be coded.
  • the coding end may use the target reference frame to obtain the coded video frame by using inter-frame predictive coding on the video frame to be coded.
  • Step 605 Send the coded video frame and the reference distance to the decoder.
  • The encoding end sends the encoded video frame and the reference distance to the decoding end through the network connecting the two ends. After receiving them, the decoding end obtains the target reference frame corresponding to the encoded video frame according to the reference distance, and decodes the encoded video frame by using the target reference frame to obtain the decoded video frame.
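  • One way to carry the reference distance alongside the encoded video frame is a small fixed header in front of the payload. The layout below (a 32-bit frame number followed by a 16-bit reference distance, in network byte order) is purely hypothetical; the description does not define a wire format.

```python
import struct

# Hypothetical header: unsigned 32-bit frame number, unsigned 16-bit reference distance.
HEADER = struct.Struct("!IH")


def pack_frame(frame_number, ref_delta, payload):
    """Prefix the encoded payload (bytes) with the illustrative header."""
    return HEADER.pack(frame_number, ref_delta) + payload


def unpack_frame(packet):
    """Split a packet back into frame number, reference distance and payload."""
    frame_number, ref_delta = HEADER.unpack_from(packet)
    return frame_number, ref_delta, packet[HEADER.size:]


packet = pack_frame(3, 2, b"encoded-bits")
print(unpack_frame(packet))
```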
  • In summary, the video encoding and decoding method acquires the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set; determines the target reference frame corresponding to the video frame to be encoded from the reference frame set according to the frame loss rate and the video encoding rule; determines the distance in display timing between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded; encodes the video frame to be encoded by using the target reference frame to obtain the encoded video frame; and sends the encoded video frame and the reference distance to the decoding end.
  • The reference frame set includes at least one video frame corresponding to an encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame which the decoding end can successfully decode is a reliable frame.
  • The video encoding rule includes: the larger the frame loss rate, the more video frames in the video to be encoded have a reliable frame as their target reference frame. Therefore, when a poor network status between the video sending end and the video receiving end causes video frames to be lost in transmission, that is, when the frame loss rate is greater than 0, the larger the frame loss rate, the more video frames to be encoded are encoded using reliable frames. This reduces the probability that encoded video frames cannot be decoded correctly because of lost video frames, improves the rate at which video frames are decoded correctly, and reduces the probability of problems such as playback freezes at the decoding end.
  • FIG. 7 shows a flowchart of a video encoding and decoding method provided by an embodiment of the present application.
  • The video encoding and decoding method is applied to the decoding end shown in Fig. 1 and Fig. 2.
  • The video encoding and decoding method includes:
  • Step 701 Receive the coded video frame and the reference distance sent by the coder.
  • the encoded video frame received by the decoding end is an encoded video frame generated by the encoding end according to a video encoding and decoding method provided by an embodiment of the present application.
  • Step 702 Obtain a decoded frame set, which includes at least one decoded video frame; the number of video frames included in the decoded frame set is greater than or equal to the number of video frames included in the reference frame set at the encoding end.
  • the set of decoded frames includes at least one decoded video frame.
  • The decoding end decodes the encoded video frames received from the encoding end and, after obtaining the decoded video frames, may store them to obtain the decoded frame set.
  • The decoded frame set is also called the second decoded picture buffer (DPB). It may include a second target number of decoded video frames, and the value of the second target number may be greater than 1.
  • the second target number of video frames included in the decoded frame set is the same as the first target number of video frames included in the reference frame set.
  • Step 703 when it is determined according to the reference distance that the set of decoded frames includes the target reference frame corresponding to the encoded video frame, acquire the target reference frame corresponding to the encoded video frame.
  • The reference distance of the video frame to be encoded may be the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame.
  • According to the reference distance, the decoding end may determine the target frame number, that is, the frame number that differs from the frame number of the encoded video frame by the reference distance.
  • When it is determined that the decoded frame set does not include the decoded video frame corresponding to the target frame number, the decoding end cannot obtain that decoded video frame from the decoded frame set, that is, it cannot obtain the target reference frame corresponding to the encoded video frame. Even if the decoding end successfully receives the encoded video frame, it cannot decode it correctly because it cannot obtain the corresponding target reference frame.
  • When the decoding end determines that the decoded frame set does not include the decoded video frame corresponding to the target frame number, the encoded video frame is discarded.
  • the target reference frame corresponding to the coded video frame is the video frame whose frame number is 1 in the decoded frame set. Traverse the frame numbers of the video frames included in the decoded frame set. If it is determined that the decoded frame set includes a video frame with a frame number of 1, the video frame with a frame number of 1 is used as a target reference frame corresponding to the encoded video frame. If it is determined that the decoded frame set does not include the video frame whose frame number is 1, the received coded video frame is discarded.
  • Step 704 Use the target reference frame to decode the coded video frame to obtain a decoded video frame.
  • the decoding end may use the target reference frame to perform predictive decoding on the coded video frame to obtain the decoded video frame.
  • the decoding end may use the target reference frame to perform inter-frame predictive decoding on the coded video frame to obtain the decoded video frame.
  • In the video encoding and decoding method provided by the embodiment of the present application, the decoding end receives the encoded video frame and the reference distance, which the encoding end generates according to a video encoding and decoding method provided by the embodiment of the present application.
  • The target reference frame corresponding to the encoded video frame can be obtained from the decoded frame set according to the reference distance, and the encoded video frame is then decoded using the target reference frame to obtain a decoded video frame.
  • Because the encoded video frame is generated by the encoding end according to a video encoding and decoding method provided by the embodiment of the present application, when the network status between the video sending end and the video receiving end is poor and video frames are lost during transmission, the greater the frame loss rate, the more encoded video frames are encoded using reliable frames, that is, video frames corresponding to encoded video frames that the decoding end can successfully decode. This reduces the probability that some encoded video frames cannot be decoded correctly because of lost video frames, improves the efficiency of correctly decoding video frames, and reduces the probability of problems such as playback freezes at the decoding end.
  • The video to be encoded can be encoded in IPPP mode to obtain an encoded video frame structure of the IPPP mode, or encoded with SVC to obtain an SVC encoded video frame structure; other modes can of course also be used for encoding.
  • The embodiments shown in Fig. 8 and Fig. 9 below take encoding the video to be encoded with time-domain scalable SVC as an example. In this case the video to be encoded consists of multiple video frames, and the multiple video frames may include a base layer and at least one enhancement layer.
  • FIG. 8 and FIG. 9 show flowcharts of a video encoding and decoding method provided by an embodiment of the present application.
  • The video encoding and decoding method is applied to the video processing system shown in Fig. 1 and Fig. 2.
  • The video encoding and decoding method includes:
  • Step 801 the encoder obtains the first video frame to be encoded.
  • the video encoding method may be applied to an entire video to be encoded, or may also be applied to an encoding cycle of the video to be encoded.
  • the video to be encoded may include multiple encoding periods, and each encoding period may include multiple video frames. If the video coding method is applied to a whole section of video to be coded, the first video frame to be coded is the first video frame to be coded arranged according to display timing among the multiple video frames to be coded included in the video to be coded.
  • Correspondingly, a non-first video frame to be encoded is any video frame other than the first video frame to be encoded among the multiple video frames to be encoded included in the video to be encoded.
  • If the video encoding method is applied to an encoding cycle of the video to be encoded, the first video frame to be encoded is the first video frame, in display order, among the multiple video frames to be encoded included in the encoding cycle.
  • Correspondingly, a non-first video frame to be encoded is any video frame other than the first video frame to be encoded among the multiple video frames to be encoded included in the encoding cycle.
  • Step 802 the encoding end performs intra-frame predictive encoding on the first video frame to be encoded to obtain the first encoded video frame.
  • the encoding end performs intra-frame predictive encoding on the first video frame to be encoded to obtain the first encoded video frame, and the first encoded video frame is an I frame.
  • Step 803 The encoder adds the video frame corresponding to the first encoded video frame to the reference frame set, and selects the video frame corresponding to the first encoded video frame as a reliable frame.
  • the reference frame set includes at least one video frame corresponding to the coded video frame. That is, the reference frame set includes: video frames corresponding to encoded video frames obtained after encoding in the video to be encoded.
  • the first video frame is the video frame at the base layer.
  • The encoding end encodes the first video frame and adds the video frame corresponding to the resulting first encoded video frame to the reference frame set. Since the first encoded video frame is an I frame, and an I frame that is received by the decoding end is guaranteed to be decoded successfully, the video frame corresponding to the first encoded video frame can be selected as the first reliable frame in the reference frame set.
  • Step 804 the encoding end sends the first encoded video frame to the decoding end.
  • Step 805 The decoding end performs intra-frame predictive decoding on the first coded video frame to obtain the first decoded video frame.
  • After receiving the first encoded video frame sent by the encoding end, the decoding end can perform intra-frame predictive decoding on this first I frame to obtain the first decoded video frame corresponding to the first encoded video frame.
  • Step 806 the decoder adds the first decoded video frame to the decoded frame set.
  • the set of decoded frames may include at least one decoded video frame.
  • The decoding end may decode the first encoded video frame and add the resulting first decoded video frame to the decoded frame set, so that subsequently received encoded video frames can be decoded based on the video frames included in the decoded frame set.
  • the video encoding and decoding method may also include:
  • Step 901 the encoding end acquires the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a set of reference frames.
  • The transmission frame loss rate between the encoding end and the decoding end refers to the ratio of the number of video frames not received by the decoding end to the number of video frames transmitted from the encoding end to the decoding end within a set duration.
  • The video frames not received by the decoding end are the video frames lost at the decoding end.
  • the encoded video frame refers to a video frame obtained after encoding the video frame to be encoded.
  • The video frame corresponding to an encoded video frame is either the video frame before encoding that corresponds to the encoded video frame, or the reconstructed frame obtained by reconstructing the encoded video frame.
  • The encoding end may receive, from the decoding end, the number of encoded video frames that the decoding end received within each set duration.
  • The ratio of the number of encoded video frames not received by the decoding end to the number of encoded video frames transmitted from the encoding end to the decoding end is used as the transmission frame loss rate between the encoding end and the decoding end. The number of encoded video frames not received by the decoding end is the difference between the number of encoded video frames transmitted by the encoding end within the set duration and the number of encoded video frames received by the decoding end within the set duration.
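  • The frame loss rate calculation described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `transmission_frame_loss_rate` and its parameters are hypothetical, and returning 0 when nothing was sent and clamping the lost count are added assumptions.

```python
def transmission_frame_loss_rate(frames_sent: int, frames_received: int) -> float:
    """Frame loss rate for one set duration.

    frames_sent:     encoded video frames transmitted by the encoding end
    frames_received: encoded video frames the decoding end reports as received
    """
    if frames_sent <= 0:
        # Assumption: with nothing transmitted, treat the loss rate as 0.
        return 0.0
    # Frames not received = frames transmitted - frames received.
    frames_lost = frames_sent - frames_received
    # Assumption: clamp so a late or duplicated report cannot yield a
    # negative rate or a rate above 1.
    frames_lost = max(0, min(frames_lost, frames_sent))
    return frames_lost / frames_sent
```

  • For instance, if the encoding end transmitted 100 encoded video frames in the set duration and the decoding end reports 90 received, the 10 missing frames give a frame loss rate of 10/100.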
  • The reference frame set includes at least one video frame corresponding to an encoded video frame. That is, the reference frame set includes video frames corresponding to the encoded video frames already obtained by encoding the video to be encoded. Therefore, after encoding a video frame to be encoded to obtain an encoded video frame, the encoding end can store the video frame corresponding to the encoded video frame to obtain the reference frame set. For example, when the encoding end acquires the first non-first video frame to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame, which is obtained by encoding the first video frame to be encoded.
  • When the encoding end acquires the second non-first video frame to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame and the video frame corresponding to the second encoded video frame, where the second encoded video frame is obtained by encoding the first non-first video frame to be encoded.
  • The reference frame set includes reliable frames. A reliable frame is a video frame corresponding to an encoded video frame that can be successfully decoded by the decoding end.
  • a successfully decodable encoded video frame may refer to an encoded video frame that has been successfully decoded.
  • the successfully decodable encoded video frames may also refer to the successfully received encoded video frames. For example, I frames that are successfully received.
  • A successfully decodable encoded video frame may also refer to an encoded video frame that is successfully received and whose reference frame is also successfully received.
  • a reference frame refers to a frame that needs to be referred to when encoding a video frame.
  • The reference frame set, also known as the first DPB, may include video frames corresponding to a first target number of encoded video frames, that is, it may include the first target number of video frames, and the value of the first target number may be greater than 1.
  • the number of video frames that can be included in the reference frame set is related to the number of short-term reference frames or the number of long-term reference frames set by the encoder.
  • the first target number of video frames included in the reference frame set may be 8, 16, or 32, and so on.
  • the encoding end may include a reconstructed frame buffer, and the reconstructed frame buffer may be used to store the reference frame set. Exemplarily, as shown in FIG.
  • the reference frame set DPB may include 16 video frames, that is, the first target number is 16. Assuming that the frame number of the current non-first video frame to be encoded is 29, the frame numbers of the video frames that may be included in the reference frame set are 1, 3...19, 21, 23, 25 and 27 respectively.
  • the video frames included in the reference frame set may be reference frames of most video frames to be encoded.
  • The video frames included in the reference frame set may only be video frames corresponding to encoded video frames obtained by encoding video frames at the base layer.
  • Step 902 the encoding end determines the target reference frame corresponding to the video frame to be encoded from the reference frame set according to the frame loss rate and the video encoding rule.
  • the video coding rules include: the larger the frame loss rate is, the more target reference frames corresponding to the video frames in the video to be coded are reliable frames.
  • the video encoding rules may include: encoding sub-rules corresponding to multiple different frame loss rate intervals one-to-one. When the target reference frame is determined for each video frame in the same video to be encoded according to different encoding sub-rules, the number of video frames whose corresponding target reference frame is a reliable frame is different.
  • The process by which the encoding end determines the target reference frame corresponding to the video frame to be encoded may include: determining the corresponding target encoding sub-rule according to the target frame loss rate interval to which the frame loss rate belongs, and determining the target reference frame corresponding to the video frame to be encoded according to the target encoding sub-rule.
  • Under an encoding sub-rule, the target reference frames corresponding to video frames to be encoded that are separated by a set interval number of video frames may be reliable frames.
  • The target reference frames corresponding to the remaining video frames to be encoded may each be the video frame closest in display order to the video frame to be encoded, that is, the video frame whose frame number is closest to that of the video frame to be encoded.
  • Different frame loss rate intervals correspond to different values of the set interval number: the larger the frame loss rates in a frame loss rate interval, the smaller the set interval number.
  • the video encoding rule may include: a first encoding sub-rule, a second encoding sub-rule, and a third encoding sub-rule.
  • The first encoding sub-rule is also called the unreliable reference rule, the second encoding sub-rule is also called the incompletely reliable reference rule, and the third encoding sub-rule is also called the completely reliable reference rule.
  • The first encoding sub-rule takes the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as the target reference frame corresponding to the video frame to be encoded;
  • the second encoding sub-rule takes the reliable frame as the target reference frame corresponding to video frames to be encoded taken at intervals among all video frames to be encoded;
  • the third encoding sub-rule takes the reliable frame as the target reference frame corresponding to each video frame to be encoded.
  • any encoding sub-rule is used to take the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as the target reference frame corresponding to the video frame to be encoded.
  • The video frame with frame number 29 is the currently acquired non-first video frame to be encoded.
  • The video to be encoded also includes non-first video frames to be encoded with frame numbers 30, 31, 32, 33, and so on.
  • The number of video frames that can be included in the reference frame set is 16, and the frame numbers of the video frames included in the reference frame set are 1, 3...19, 21, 23, 25 and 27.
  • The following takes the case where the video frame with frame number 21 is a reliable frame as an example.
  • FIG. 11 shows a schematic diagram of the principle of the first encoding sub-rule provided by the embodiment of the present application.
  • the first encoding subrule is used to use the video frame whose frame number is closest to the video frame to be encoded in the set of reference frames as the target reference frame of the video frame to be encoded.
  • the target reference frame corresponding to the video frame to be encoded with frame number 29 is the video frame with frame number 27 found in the reference frame set.
  • Target reference frames corresponding to video frames with frame numbers 30, 31, 32, and 33 to be encoded are video frames with frame numbers 29, 29, 31, and 31 in sequence.
  • FIG. 12 shows a principle example diagram of the second coding sub-rule provided by the embodiment of the present application.
  • The second encoding sub-rule takes the reliable frame as the target reference frame corresponding to video frames to be encoded that are taken at intervals among all the video frames to be encoded; here the set interval number is 1.
  • the target reference frames corresponding to the video frames to be encoded with frame numbers 29 and 33 are all reliable frames.
  • Target reference frames corresponding to video frames with frame numbers 30, 31, and 32 to be encoded are video frames with frame numbers 29, 29, and 31 in sequence.
  • FIG. 13 shows a principle example diagram of the third coding sub-rule provided by the embodiment of the present application.
  • the third encoding sub-rule is used to use reliable frames as the target reference frames corresponding to each frame to be encoded.
  • reliable frame Target reference frames corresponding to video frames with frame numbers 30 and 32 to be encoded are video frames with frame numbers 29 and 31 in sequence.
  • The arrows in FIG. 11 to FIG. 13 indicate that the video frame an arrow points to is the target reference frame corresponding to the video frame to be encoded at the start of the arrow.
  • The more video frames to be encoded whose target reference frame, as used by the encoding end during encoding, has the frame number closest to that of the video frame to be encoded, the higher the quality of the video obtained when the decoding end plays the decoded video frames. Therefore, the quality of the video obtained using the first encoding sub-rule, the second encoding sub-rule, and the third encoding sub-rule runs from high to low. Because higher video quality places higher requirements on the network transmission performance between the encoding end and the decoding end, when the network status between the encoding end and the decoding end is poor (for example, a weak network), the smoothness of the video in real-time communication under the first encoding sub-rule, the second encoding sub-rule, and the third encoding sub-rule runs from low to high.
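  • The three encoding sub-rules can be sketched as follows. This is an illustrative sketch under stated assumptions, not the encoder of the embodiment: the function names, the loss-rate thresholds 0.05 and 0.20, the treatment of odd frame numbers as base-layer frames, and the choice to let enhancement-layer frames always reference the nearest earlier frame in the set are all assumptions read off the example with frame numbers 29 to 33.

```python
from typing import Dict, Iterable, List


def pick_sub_rule(loss_rate: float) -> int:
    """Map a frame loss rate to sub-rule 1, 2 or 3.

    The interval bounds 0.05 and 0.20 are hypothetical; the source only says
    that each frame loss rate interval has its own sub-rule.
    """
    if loss_rate < 0.05:
        return 1  # unreliable reference rule
    if loss_rate < 0.20:
        return 2  # incompletely reliable reference rule
    return 3      # completely reliable reference rule


def plan_references(frames: Iterable[int], reference_set: List[int],
                    reliable_frame: int, sub_rule: int,
                    interval: int = 1) -> Dict[int, int]:
    """Return {frame number: target reference frame number}."""
    refs = list(reference_set)  # working copy of the reference frame set
    targets: Dict[int, int] = {}
    base_seen = 0               # base-layer frames handled so far
    for n in frames:
        is_base = n % 2 == 1    # assumption: odd frame numbers = base layer
        nearest = max(f for f in refs if f < n)
        if not is_base or sub_rule == 1:
            targets[n] = nearest
        elif sub_rule == 3:
            targets[n] = reliable_frame
        else:
            # Sub-rule 2: one base-layer frame uses the reliable frame, then
            # `interval` base-layer frames use the nearest frame, and so on.
            use_reliable = base_seen % (interval + 1) == 0
            targets[n] = reliable_frame if use_reliable else nearest
        if is_base:
            base_seen += 1
            refs.append(n)      # only base-layer frames join the set
    return targets
```

  • Called with `frames=range(29, 34)`, `reference_set=list(range(1, 28, 2))` and `reliable_frame=21`, the sketch reproduces the target reference frames listed above for the first and second encoding sub-rules; a real encoder would also refresh the reliable frame from decoding feedback, which this sketch leaves out.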
  • Step 903 the encoding end determines the distance between the video frame to be encoded and the target reference frame in display timing as the reference distance of the video frame to be encoded.
  • the encoding end may determine the number of video frames between the video frame to be encoded and the target reference frame as the reference distance (RefDelta) of the video frame to be encoded.
  • The encoding end can determine the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame as the reference distance of the video frame to be encoded.
  • the frame number of the video frame to be encoded is 3, that is, the third displayed video frame.
  • the frame number of the target reference frame is 1.
  • the reference distance is 2.
  • Step 904 The encoding end uses the target reference frame to encode the video frame to be encoded to obtain the encoded video frame.
  • the coding end may use the target reference frame to obtain the coded video frame by using predictive coding on the video frame to be coded.
  • the coding end may use the target reference frame to obtain the coded video frame by using inter-frame predictive coding on the video frame to be coded.
  • Step 905 When the coded video frame is obtained by coding the video frame at the base layer, the coder adds the video frame corresponding to the coded video frame to the reference frame set to obtain a new reference frame set.
  • the video frames included in the reference frame set are video frames at the base layer.
  • the encoding end may determine whether the encoded video frame is the encoded video frame corresponding to the video frame to be encoded at the base layer. When it is determined that the encoded video frame is not obtained by encoding the video frame to be encoded at the base layer, the video frame corresponding to the encoded video frame does not need to be added to the reference frame set.
  • When it is determined that the encoded video frame is obtained by encoding a video frame to be encoded at the base layer, the video frame corresponding to the encoded video frame is added to the reference frame set to obtain a new reference frame set.
  • Afterwards, when the encoding end again acquires a non-first video frame to be encoded, the new reference frame set can be obtained.
  • The encoding end determines the target reference frame corresponding to the video frame to be encoded from the new reference frame set according to the frame loss rate and the video encoding rule, so that the video frame to be encoded can subsequently be encoded using the target reference frame.
  • the coded video frame may have a hierarchical identification.
  • the level identifier is used to indicate that the video frame to be encoded corresponding to the encoded video frame is at the base layer or at the enhancement layer.
  • The encoding end can add the video frame corresponding to the encoded video frame to the reference frame set when it determines that the layer identifier of the encoded video frame indicates that the corresponding video frame to be encoded is at the base layer.
  • When the layer identifier indicates that the corresponding video frame to be encoded is at the enhancement layer, the video frame corresponding to the encoded video frame does not need to be added to the reference frame set.
  • the maximum number of video frames included in the reference frame set may be the first target number. Then, when the coded video frame is obtained by coding the video frame at the base layer, the coder can compare the number of video frames currently included in the reference frame set with the first target number. When the number of video frames currently included in the reference frame set is less than the first target number, the encoding end may directly add video frames corresponding to the encoded video frames to the reference frame set to obtain a new reference frame set.
  • When the number of video frames currently included in the reference frame set is equal to the first target number, the encoding end can delete the video frame with the smallest frame number among all the video frames included in the reference frame set, and then add the video frame corresponding to the encoded video frame to the reference frame set to obtain a new reference frame set.
  • Each reference frame set obtained in step 901 is the new reference frame set that the encoding end obtained through step 905 when it executed the video encoding and decoding method for the previous non-first video frame to be encoded.
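  • The reference frame set update of step 905 can be sketched as below. This is a minimal sketch in which the set is modelled as a list of frame numbers; the function name `update_reference_set` and its parameters are hypothetical, and `max_frames` stands for the first target number.

```python
from typing import List


def update_reference_set(reference_set: List[int], frame_no: int,
                         is_base_layer: bool, max_frames: int) -> List[int]:
    """Return the new reference frame set after encoding frame `frame_no`."""
    updated = list(reference_set)  # leave the caller's list untouched
    if not is_base_layer:
        # Enhancement-layer frames are not added to the reference frame set.
        return updated
    if len(updated) >= max_frames:
        # Set is full: drop the video frame with the smallest frame number.
        updated.remove(min(updated))
    updated.append(frame_no)
    return updated
```

  • The decoding end can maintain its decoded frame set in the same way, with the second target number as the capacity, which keeps the two sets aligned as long as no frame is lost.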
  • Step 906 the encoding end sends the encoded video frame and the reference distance to the decoding end.
  • Step 907 The decoding end obtains a decoded frame set, which includes at least one decoded video frame; the number of video frames included in the decoded frame set is greater than or equal to the number of video frames included in the reference frame set at the encoding end.
  • the decoding end after receiving the encoded video frame and the reference distance sent by the encoding end, the decoding end can obtain the set of decoded frames.
  • the decoded frame set is also referred to as the second DPB, which may include a second target number of decoded video frames, and the value of the second target number may be greater than 1.
  • the second target number of video frames included in the decoded frame set is the same as the first target number of video frames included in the reference frame set.
  • the set of decoded frames may include at least one decoded video frame.
  • the decoding end may decode the encoded video frames received from each encoding end, and after obtaining the decoded video frames, store the decoded video frames sequentially according to the receiving order to obtain a set of decoded frames.
  • Step 908 when the decoding end determines that the decoded frame set includes the target reference frame corresponding to the encoded video frame according to the reference distance, obtain the target reference frame corresponding to the encoded video frame.
  • The reference distance of the video frame to be encoded may be the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame.
  • According to the reference distance, the decoding end may determine the target frame number, that is, the frame number that differs from the frame number of the encoded video frame by the reference distance.
  • When the decoded frame set does not include the decoded video frame corresponding to the target frame number, the decoding end cannot obtain that decoded video frame from the decoded frame set, that is, it cannot obtain the target reference frame corresponding to the encoded video frame. Even if the decoding end successfully receives the encoded video frame, it cannot decode it correctly because it cannot obtain the corresponding target reference frame.
  • When the decoding end determines that the decoded frame set does not include the decoded video frame corresponding to the target frame number, the encoded video frame is discarded.
  • the frame sequence number of the currently received coded video frame is 29.
  • the frame numbers of the video frames included in the decoded frame set are 1, 3...19, 23, 25 and 27, and the reference distance is 8.
  • the target reference frame corresponding to the coded video frame is the video frame whose frame number is 21 in the decoded frame set. Traverse the frame numbers of the video frames included in the decoded frame set. If it is determined that the video frame with frame number 21 is not included in the set of decoded frames, the received coded video frame with frame number 29 is discarded.
  • the frame sequence number of the currently received coded video frame is 29.
  • the frame numbers of the video frames included in the decoded frame set are 1, 3...19, 21, 23 and 27, and the reference distance is 8.
  • the target reference frame corresponding to the coded video frame is the video frame whose frame number is 21 in the decoded frame set. Traverse the frame numbers of the video frames included in the decoded frame set. If it is determined that the decoded frame set includes the video frame whose frame number is 21, then the video frame whose frame number is 21 in the decoded frame set is used as the target reference frame corresponding to the encoded video frame.
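  • The lookup in step 908 can be sketched as follows, using the two examples above. This is a minimal sketch: the decoded frame set is modelled as a dictionary from frame number to decoded frame, and the function name `find_target_reference` is hypothetical.

```python
from typing import Dict, Optional


def find_target_reference(frame_no: int, ref_distance: int,
                          decoded_set: Dict[int, object]) -> Optional[object]:
    """Return the target reference frame, or None if it is missing.

    The target frame number is the received frame's number minus the
    reference distance. None means the encoded frame must be discarded.
    """
    target_no = frame_no - ref_distance
    return decoded_set.get(target_no)


# Frame 29 arrives with reference distance 8, so the target frame number is 21.
without_21 = {n: f"decoded-{n}" for n in list(range(1, 20, 2)) + [23, 25, 27]}
with_21 = {n: f"decoded-{n}" for n in list(range(1, 20, 2)) + [21, 23, 27]}

missing = find_target_reference(29, 8, without_21)  # frame 29 is discarded
found = find_target_reference(29, 8, with_21)       # frame 21 is the reference
```

  • A dictionary lookup replaces the traversal of frame numbers described above; the outcome is the same, and the `None` result is what later leads to a lost state in the decoding feedback.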
  • Step 909 the decoding end uses the target reference frame to decode the coded video frame to obtain a decoded video frame.
  • the decoding end may use the target reference frame to perform predictive decoding on the coded video frame to obtain the decoded video frame.
  • the decoding end may use the target reference frame to perform inter-frame predictive decoding on the coded video frame to obtain the decoded video frame.
  • Step 910 if the decoded video frame is a video frame at the base layer, the decoding end adds the decoded video frame to the decoded frame set to obtain a new decoded frame set.
  • The video frames included in the decoded frame set are the decoded video frames corresponding to video frames to be encoded at the base layer.
  • The decoding end determines whether the encoded video frame is the encoded video frame corresponding to a video frame to be encoded at the base layer. When it is determined that it is not, the decoded video frame of the encoded video frame does not need to be stored in the decoded frame set, and the decoded video frame can be displayed.
  • When it is determined that the encoded video frame is the encoded video frame corresponding to a video frame to be encoded at the base layer, the decoded video frame of the encoded video frame is stored in the decoded frame set to obtain a new decoded frame set, and the decoded video frame is displayed. Afterwards, when the decoding end again receives an encoded video frame sent by the encoding end, it can acquire the new decoded frame set.
  • The target reference frame corresponding to the encoded video frame is then obtained from the new decoded frame set, so that the encoded video frame can subsequently be decoded using the target reference frame.
  • the coded video frame received by the decoding end may have a layer identifier.
  • the level identifier is used to indicate that the video frame to be encoded corresponding to the encoded video frame is at the base layer or at the enhancement layer.
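  • When the layer identifier indicates that the video frame to be encoded corresponding to the encoded video frame is at the base layer, the decoding end stores the decoded video frame of the encoded video frame in the decoded frame set and displays the decoded video frame.
  • When the layer identifier indicates that the video frame to be encoded corresponding to the encoded video frame is at the enhancement layer, the decoded video frame of the encoded video frame is displayed.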
  • the maximum number of video frames included in the set of decoded frames may be the second target number. Then, when the encoded video frame is obtained by encoding the video frame at the base layer, the decoder may compare the number of video frames currently included in the decoded frame set with the second target number. When the number of video frames currently included in the decoded frame set is less than the second target number, the decoding end may directly add the decoded video frames of the coded video frame to the decoded frame set to obtain a new decoded frame set.
  • When the number of video frames currently included in the decoded frame set is equal to the second target number, the decoding end can delete the video frame with the smallest frame number among all the video frames included in the decoded frame set, and then add the decoded video frame to the decoded frame set to obtain a new decoded frame set.
  • Each decoded frame set obtained in step 907 is the new decoded frame set that the decoding end obtained through step 910 when it executed the video encoding and decoding method for the previously received non-first encoded video frame.
  • Step 911 the decoding end sends decoding feedback information to the encoding end, and the decoding feedback information includes: frame number and loss flag.
  • the loss flag reflects whether the decoding end has successfully received and decoded the encoded video frame corresponding to the frame number.
  • the loss flag may take a not-lost state or a lost state.
  • the not-lost state may indicate that the coded video frame is successfully received by the decoding end, and the coded video frame is successfully decoded. That is, the loss flag in the not-lost state is used to reflect that the decoding end decodes the coded video frame corresponding to the frame number.
  • a lost state may indicate that the encoded video frame was successfully received at the decoder, but the encoded video frame was not successfully decoded. Alternatively, the lost state may also indicate that the decoding end has not successfully received the encoded video frame.
  • the decoding end may set the loss flag of the encoded video frame to the not-lost state.
  • when the decoding end determines that the decoded frame set does not include the target reference frame corresponding to the received encoded video frame, it can set the loss flag of the encoded video frame to the lost state.
  • when the decoding end determines that the encoded video frame corresponding to a frame number has not been received, it may set the loss flag of the encoded video frame corresponding to that frame number to the lost state.
  • the frame number is allocated according to the display timing of multiple video frames in the video to be encoded. Therefore, the video frame to be encoded in the video to be encoded, the encoded video frame after encoding the video frame to be encoded, and the decoded video frame obtained after decoding the encoded video frame all have the same frame number.
  • Step 912 the encoder updates the reliable frames in the new reference frame set based on the decoding feedback information.
  • the encoding end may update the reliable frame in the new reference frame set obtained in step 905 based on the decoding feedback information.
  • the process of updating the reliable frame in the new reference frame set at the encoding end may include: the encoding end takes, as the reliable frame, the base-layer video frame whose loss flag is in the not-lost state, whose frame number is the largest, and whose corresponding target reference frame is a reliable frame.
  • the decoding end may send decoding feedback information to the encoding end each time step 909 is completed, that is, each time the encoded video frame is decoded to obtain a decoded video frame.
  • the decoding end may send decoding feedback information to the encoding end once after performing step 909 multiple times, that is, decoding the encoded video frame multiple times to obtain the decoded video frame.
  • the decoding feedback information received by the encoder each time may only include a frame sequence number and a corresponding loss flag.
  • the decoding feedback information received by the encoding end each time may include multiple frame numbers and loss flags corresponding to the multiple frame numbers.
  • the decoding end may also send decoding feedback information to the encoding end once at a set time interval.
  • the decoding feedback information received by the encoding end may include the frame sequence number of the encoded video frame transmitted between the encoding end and the decoding end within the set time period and a loss flag corresponding to the frame sequence number.
  • the frame numbers of the coded video frames sent by the encoding end to the decoding end include: frame number X1, frame number X2, and frame number X3.
  • the decoding feedback information sent by the decoding end to the encoding end after the set interval includes: frame number X1 and its corresponding loss flag, frame number X2 and its corresponding loss flag, and frame number X3 and its corresponding loss flag.
  • when the decoding feedback information includes multiple frame numbers and the loss flags corresponding to them, the multiple frame numbers can be arranged in monotonically increasing order according to the display order of the corresponding video frames, that is, from the smallest frame number to the largest.
  • the encoder can perform reliable frame judgment processing for each frame number in sequence in the order of frame numbers in the decoding feedback information from small to large, until the reliable frame judgment processing is completed for multiple frame numbers in the decoding feedback information.
  • the video frame whose loss flag is in the not-lost state, whose frame number is the largest, whose corresponding target reference frame is a reliable frame, and which is in the base layer is taken as the reliable frame.
  • the reliable frame judgment processing includes: judging whether the loss flag corresponding to the frame number is in the not-lost state, judging whether the frame number is greater than the frame number of the current reliable frame in the new reference frame set, judging whether the video frame corresponding to the frame number is a video frame in the base layer, and judging whether the target reference frame corresponding to the frame number is a reliable frame.
  • when the loss flag corresponding to the frame number is in the not-lost state, the frame number is greater than the frame number of the current reliable frame in the new reference frame set, the video frame corresponding to the frame number is a video frame in the base layer, and the target reference frame corresponding to the frame number is a reliable frame, the video frame corresponding to the frame number in the new reference frame set is taken as the reliable frame. That is, the video frame corresponding to the frame number in the new reference frame set is updated to be the new reliable frame.
  • the decoding feedback information sent by the decoding end includes frame numbers 22 , 23 , 24 and 25 .
  • the frame reference relationship corresponding to the frame numbers 21, 22, 23, 24 and 25 is shown in FIG. 16 .
  • video frames with frame numbers 21, 23 and 25 are video frames in the base layer;
  • video frames with frame numbers 22 and 24 are video frames in the enhancement layer.
  • the arrows in FIG. 16 mark the reference relationships: for the video frames with frame numbers 22 and 23, the frame number of the corresponding target reference frame is 21;
  • for the video frames with frame numbers 24 and 25, the frame number of the corresponding target reference frame is 23.
  • the encoding end may perform reliable frame determination processing for the frame number 22 first.
  • the video frame corresponding to frame number 22 is a video frame in the enhancement layer, so the video frame corresponding to frame number 22 in the new reference frame set cannot be used as a reliable frame. Reliable frame judgment processing is then performed for frame number 23.
  • the loss flag corresponding to frame number 23 is in the not-lost state, frame number 23 is greater than frame number 21 of the current reliable frame in the new reference frame set, the video frame corresponding to frame number 23 is a video frame in the base layer, and the target reference frame 21 corresponding to frame number 23 is a reliable frame. Therefore, the video frame with frame number 23 in the new reference frame set is updated to be the reliable frame. After that, reliable frame judgment processing is performed for frame number 24.
  • the video frame corresponding to frame number 24 is a video frame in the enhancement layer, so the video frame corresponding to frame number 24 in the new reference frame set cannot be used as a reliable frame. After that, reliable frame judgment processing is performed for frame number 25.
  • the loss flag corresponding to frame number 25 is in the not-lost state, frame number 25 is greater than frame number 23 of the current reliable frame in the new reference frame set, the video frame corresponding to frame number 25 is a video frame in the base layer, and the target reference frame 23 corresponding to frame number 25 is a reliable frame. Therefore, the video frame with frame number 25 in the new reference frame set is updated to be the reliable frame. Finally, based on the decoding feedback information received this time, the encoding end determines that the reliable frame in the new reference frame set is the video frame with frame number 25.
  • the encoding end may perform reliable frame determination processing for the frame number 22 first.
  • the video frame corresponding to the frame number 22 is a video frame in the enhancement layer, and the video frame corresponding to the frame number 22 in the new reference frame set cannot be used as a reliable frame.
  • Reliable frame judgment processing is then performed for frame number 23. If the loss flag corresponding to frame number 23 is in the lost state, then the video frame corresponding to frame number 23 in the new reference frame set cannot be used as a reliable frame.
  • reliable frame judgment processing is executed for frame number 24 .
  • the video frame corresponding to the frame number 24 is a video frame in the enhancement layer, and the video frame corresponding to the frame number 24 in the new reference frame set cannot be used as a reliable frame.
  • reliable frame judgment processing is executed for frame number 25.
  • the loss flag corresponding to frame number 25 is in the not-lost state, but its target reference frame 23 is in the lost state, so the video frame corresponding to frame number 25 in the new reference frame set cannot be used as a reliable frame.
  • based on the decoding feedback information received this time, the encoding end determines that the reliable frame in the new reference frame set is still the video frame with frame number 21.
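The reliable frame judgment processing and the two FIG. 16 walk-throughs above can be reproduced with a short sketch. The names `FrameInfo`, `update_reliable_frame`, and `FIG16_FRAMES`, and the representation of feedback as a mapping from frame number to a boolean lost flag, are illustrative assumptions. The sketch also assumes the encoding end tracks a single current reliable frame, so "the target reference frame is a reliable frame" reduces to an equality check.

```python
from dataclasses import dataclass


@dataclass
class FrameInfo:
    is_base_layer: bool
    ref_frame_number: int  # frame number of the target reference frame


def update_reliable_frame(reliable, frames, feedback):
    """Return the new reliable frame number after processing one feedback message.

    reliable: frame number of the current reliable frame
    frames:   {frame_number: FrameInfo} known to the encoding end
    feedback: {frame_number: lost} where lost is True for the lost state
    """
    for number in sorted(feedback):  # smallest frame number first
        info = frames.get(number)
        if info is None:
            continue
        if (not feedback[number]          # loss flag is in the not-lost state
                and number > reliable      # newer than the current reliable frame
                and info.is_base_layer     # base-layer frame
                and info.ref_frame_number == reliable):  # references a reliable frame
            reliable = number
    return reliable


# Reference relationships of FIG. 16 as described in the text.
FIG16_FRAMES = {
    22: FrameInfo(is_base_layer=False, ref_frame_number=21),
    23: FrameInfo(is_base_layer=True, ref_frame_number=21),
    24: FrameInfo(is_base_layer=False, ref_frame_number=23),
    25: FrameInfo(is_base_layer=True, ref_frame_number=23),
}

all_received = {22: False, 23: False, 24: False, 25: False}
frame_23_lost = {22: False, 23: True, 24: False, 25: False}
print(update_reliable_frame(21, FIG16_FRAMES, all_received))   # 25
print(update_reliable_frame(21, FIG16_FRAMES, frame_23_lost))  # 21
```

Processing the frame numbers in increasing order matters: frame 25 can only become reliable after frame 23 has been promoted in the same pass.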
  • when the decoding end sends decoding feedback information, the encoding end may obtain the transmission frame loss rate between the encoding end and the decoding end based on that decoding feedback information.
  • the process by which the encoding end obtains the transmission frame loss rate between the encoding end and the decoding end includes: the encoding end determines the transmission frame loss rate according to the loss flags received within a set time period.
  • the encoding end may collect the decoding feedback information received from the decoding end within the set time period t closest to the current moment of the encoding end. It counts the total number N_ack of frame numbers included in that decoding feedback information, that is, the total number of video frames reported by the decoding feedback information, and counts the number N_loss of loss flags in the lost state, that is, the number of video frames marked as lost. The encoding end takes the ratio of N_loss to N_ack as the transmission frame loss rate P_loss between the encoding end and the decoding end, that is, P_loss = N_loss / N_ack.
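As a worked illustration of P_loss = N_loss / N_ack, the sketch below computes the rate over the most recent window t. The log layout (receive time, frame number, lost flag) and the function name are assumptions, and returning 0.0 when no feedback falls inside the window is a choice made here, not something the text specifies.

```python
def transmission_loss_rate(feedback_log, now, window):
    """Compute P_loss from feedback received in the interval [now - window, now].

    feedback_log: iterable of (receive_time, frame_number, lost) tuples
    """
    recent = [lost for t, _, lost in feedback_log if now - window <= t <= now]
    n_ack = len(recent)   # total number of frames reported (N_ack)
    n_loss = sum(recent)  # frames whose loss flag is in the lost state (N_loss)
    if n_ack == 0:
        return 0.0        # assumption: no feedback in the window means no measured loss
    return n_loss / n_ack


log = [
    (0.0, 1, False),
    (1.0, 2, True),
    (2.0, 3, False),
    (3.0, 4, True),
    (4.0, 5, False),
]
print(transmission_loss_rate(log, now=4.0, window=3.0))  # 0.5
```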
  • the encoding end may include a sending buffer used to store the encoded video frames to be sent and their reference distances. In that case, before the encoding end sends the encoded video frame and the reference distance to the decoding end, the video encoding and decoding method further includes: the encoding end writes the encoded video frame into the sending buffer. The process of sending the encoded video frame and the reference distance from the encoding end to the decoding end may then include: when the occupancy of the sending buffer is greater than the data volume threshold and the encoded video frame is a target encoded video frame, sending the encoded video frame to the decoding end, where a target encoded video frame is one obtained by encoding a video frame at the base layer; alternatively, when the occupancy of the sending buffer is less than or equal to the data volume threshold, sending the encoded video frame to the decoding end.
  • after the encoding end encodes the video frame to be encoded, it may first store the resulting encoded video frame in the sending buffer (SendBuffer).
  • when the occupancy of the sending buffer is greater than the data volume threshold, too many encoded video frames waiting to be sent to the decoding end are stored in the sending buffer: the network transmission capacity between the encoding end and the decoding end is insufficient, so the encoded video frames stored in the sending buffer cannot be sent out in time.
  • the encoding end can delete the encoded video frames with relatively low importance stored in the sending buffer, so as to ensure that the encoded video frames with high importance can be sent to the decoding end in time through the limited network transmission capacity.
  • when the occupancy of the sending buffer is less than or equal to the data volume threshold, the sending buffer does not hold too many encoded video frames waiting to be sent to the decoding end, and the network transmission capacity between the encoding end and the decoding end is sufficient.
  • the encoding end can send all the encoded video frames stored in the sending buffer to the decoding end, so as to ensure the quality of the transmitted video.
  • the encoding end may send to the decoding end only the target encoded video frames obtained by encoding video frames at the base layer, so that, within the available network transmission capacity, the decoding end still receives encoded video frames that can be decoded continuously.
  • the encoding end may send the encoded video frames obtained by encoding the video frames in the base layer and the enhancement layer to the decoding end.
  • the encoding end can also judge the current network transmission capability between the encoding end and the decoding end by whether the total time range of the encoded video frames accumulated in the sending buffer is greater than the time threshold T_drop. When the total time range of the encoded video frames accumulated in the sending buffer is greater than the time threshold, the current network transmission capacity between the encoding end and the decoding end is insufficient.
  • when the encoding end determines that the encoded video frame is a target encoded video frame, it sends the encoded video frame to the decoding end, where a target encoded video frame is one obtained by encoding a video frame at the base layer.
  • the encoder sends all encoded video frames to the decoder.
  • the encoding end may determine the number of encoded video frames to be sent according to the current network transmission capability between the encoding end and the decoding end. Therefore, when the network transmission capability is poor, fewer encoded video frames can be transmitted between the encoding end and the decoding end while still ensuring that the decoding end can decode continuously. This reduces the probability of frame loss caused by insufficient network transmission capacity, which in turn reduces the probability that some encoded video frames cannot be decoded correctly because of video frame loss, improves the efficiency of correct decoding of video frames, and reduces the probability of problems such as playback stuttering at the decoding end.
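The occupancy-based sending policy described above can be sketched as follows. Only the data volume threshold variant is shown; the time-range variant using T_drop would follow the same shape. The tuple layout of buffered frames, the byte-based occupancy measure, and the function name are assumptions made for illustration.

```python
def frames_to_send(send_buffer, data_volume_threshold):
    """Select which buffered frames the encoding end sends.

    send_buffer: list of (frame_number, is_base_layer, size_bytes) tuples
    """
    occupancy = sum(size for _, _, size in send_buffer)
    if occupancy > data_volume_threshold:
        # Network capacity looks insufficient: keep only base-layer frames,
        # which the decoding end can still decode continuously.
        return [frame for frame in send_buffer if frame[1]]
    # Capacity is sufficient: send base-layer and enhancement-layer frames.
    return list(send_buffer)


buffer = [(21, True, 100), (22, False, 80), (23, True, 100)]
print([f[0] for f in frames_to_send(buffer, data_volume_threshold=200)])  # [21, 23]
print([f[0] for f in frames_to_send(buffer, data_volume_threshold=300)])  # [21, 22, 23]
```

Dropping enhancement-layer frames first is safe in this scheme because, per the FIG. 16 example, no base-layer frame uses an enhancement-layer frame as its target reference frame.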
  • the order of the steps of the video encoding and decoding method provided in the embodiment of the present application can be adjusted appropriately, and the steps can also be increased or decreased accordingly according to the situation.
  • the following are device embodiments of the present application, which can be used to implement the method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
  • step 905 can be located before any step from step 907 to step 912, as long as it is ensured that, after the encoding end obtains the encoded video frame, the encoding end updates the reference frame set based on the video frame corresponding to the encoded video frame, so that the reference frame set used for the next video frame to be encoded is the updated reference frame set.
  • in the video encoding and decoding method provided by the embodiment of the present application, the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set are obtained.
  • according to the frame loss rate and the video encoding rule, the target reference frame corresponding to the video frame to be encoded is determined from the reference frame set.
  • the distance between the video frame to be encoded and the target reference frame in display timing is determined as the reference distance of the video frame to be encoded.
  • the video frame to be encoded is encoded by using the target reference frame to obtain the encoded video frame, so that the encoded video frame and the reference distance can be sent to the decoding end.
  • the reference frame set includes a video frame corresponding to at least one encoded video frame, and in the reference frame set, a video frame corresponding to an encoded video frame that the decoding end can successfully decode is a reliable frame.
  • the video encoding rule includes: the greater the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame. Therefore, when the network status between the video sending end and the video receiving end is poor and video frames are lost during transmission, that is, when the frame loss rate is greater than 0, the greater the frame loss rate, the more video frames in the video to be encoded are encoded with reference to reliable frames. This reduces the probability that some encoded video frames cannot be correctly decoded because of video frame loss, improves the efficiency of correct decoding of video frames, and reduces the probability of problems such as playback freezes at the decoding end.
  • a video encoding and decoding device provided in the embodiment of the present application can execute the video encoding and decoding method applied to any one of the multiple microservice nodes of the server provided in any embodiment of the application, and has the ability to execute the video encoding and decoding method applied to the client The corresponding functional modules and effects of the video codec method.
  • Fig. 17 is a block diagram showing a video codec device according to an exemplary embodiment, and the video codec device is applied to an encoding end.
  • a video codec device 1700 includes: an acquisition module 1701 , a determination module 1702 , an encoding module 1703 and a sending module 1704 .
  • the acquisition module 1701 is configured to acquire the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set, where the reference frame set includes a video frame corresponding to at least one encoded video frame, and in the reference frame set, a video frame corresponding to an encoded video frame that the decoding end can successfully decode is a reliable frame;
  • the determination module 1702 is configured to determine, according to the frame loss rate and the video encoding rule, the target reference frame corresponding to the video frame to be encoded from the reference frame set, where the video encoding rule includes: the greater the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame; the determination module 1702 is also configured to determine the distance between the video frame to be encoded and the target reference frame in display timing as the reference distance of the video frame to be encoded;
  • the encoding module 1703 is configured to encode the video frame to be encoded by using the target reference frame to obtain an encoded video frame; and the sending module 1704 is configured to send the encoded video frame and the reference distance to the decoding end.
  • the video encoding and decoding device acquires the non-first video frame to be encoded, the transmission frame loss rate of the encoding end and the decoding end, and the reference frame set through the acquisition module.
  • the determination module determines the target reference frame corresponding to the video frame to be encoded from the reference frame set according to the frame loss rate and the video encoding rule.
  • the distance between the video frame to be encoded and the target reference frame in display timing is determined as the reference distance of the video frame to be encoded.
  • the encoding module uses the target reference frame to encode the video frame to be encoded to obtain the encoded video frame, so that the sending module can send the encoded video frame and the reference distance to the decoding end.
  • the reference frame set includes a video frame corresponding to at least one encoded video frame, and in the reference frame set, a video frame corresponding to an encoded video frame that the decoding end can successfully decode is a reliable frame.
  • the video encoding rule includes: the greater the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame. Therefore, when the network status between the video sending end and the video receiving end is poor and video frames are lost during transmission, that is, when the frame loss rate is greater than 0, the greater the frame loss rate, the more video frames in the video to be encoded are encoded with reference to reliable frames. This reduces the probability that some encoded video frames cannot be correctly decoded because of video frame loss, improves the efficiency of correct decoding of video frames, and reduces the probability of problems such as playback freezes at the decoding end.
  • a video codec device provided in the embodiment of the present application can execute the video codec method applied to the microservice management device provided in any embodiment of the present application, and has the corresponding functions for executing the video codec method applied to the microservice management device Function modules and effects.
  • Fig. 18 is a block diagram showing a video codec device according to an exemplary embodiment, and the video codec device is applied to a decoding end.
  • a video codec device 1800 includes: a receiving module 1801 , an acquiring module 1802 and a decoding module 1803 .
  • the receiving module 1801 is configured to receive the encoded video frame and the reference distance sent by the encoding end according to any video codec device provided by the embodiment of the present application; the acquiring module 1802 is configured to acquire a decoded frame set, where the decoded frame set includes at least one decoded video frame, and the number of video frames included in the decoded frame set is greater than or equal to the number of video frames included in the reference frame set at the decoding end; the acquiring module 1802 is also configured to acquire the target reference frame corresponding to the encoded video frame when it is determined, according to the reference distance, that the decoded frame set includes that target reference frame; the decoding module 1803 is configured to decode the encoded video frame by using the target reference frame to obtain a decoded video frame.
  • the video codec device provided in the embodiment of the present application receives, through the receiving module, the coded video frame and the reference distance generated by the coder according to a video codec method provided in the embodiment of the present application.
  • the acquiring module acquires a decoded frame set, and the decoded frame set includes at least one decoded video frame.
  • the target reference frame corresponding to the coded video frame can be obtained from the decoded frame set according to the reference distance. Therefore, the decoding module uses the target reference frame to decode the coded video frame to obtain the decoded video frame.
  • the encoded video frame is generated by the encoding end according to a video encoding and decoding method provided by the embodiment of the present application.
  • when the network status between the video sending end and the video receiving end is poor and video frames are lost during transmission, the greater the frame loss rate, the more encoded video frames are encoded with reference to reliable frames, that is, video frames corresponding to encoded video frames that the decoding end can successfully decode. Therefore, the probability that some encoded video frames cannot be correctly decoded due to video frame loss is reduced, the efficiency of correct decoding of video frames is improved, and the probability of problems such as playback freezes at the decoding end is reduced.
  • Fig. 19 is a block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device provided in this embodiment of the present application includes a processor 1901, a memory 1902, and a computer program stored on the memory 1902 and operable on the processor 1901; when executed by the processor 1901, the computer program implements the video encoding and decoding method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium.
  • the computer-readable storage medium is, for example, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • the computer-readable storage medium may be a non-transitory storage medium.

Abstract

Disclosed are a video coding and decoding method and apparatus. The video coding and decoding method comprises: acquiring a non-first video frame to be coded, a transmission frame loss rate between a coding end and a decoding end, and a reference frame set, wherein the reference frame set comprises a video frame corresponding to at least one coded video frame, and in the reference frame set, a video frame corresponding to a coded video frame, which can be successfully decoded by the decoding end, is a reliable frame; according to the frame loss rate and a video coding rule, determining, from the reference frame set, a target reference frame corresponding to the video frame to be coded, wherein the video coding rule comprises: the greater the frame loss rate, the more video frames there are in a video to be coded that correspond to target reference frames, which are reliable frames; determining the distance between the video frame to be coded and the target reference frame in a display time sequence to be a reference distance of the video frame to be coded; coding, by using the target reference frame, the video frame to be coded, so as to obtain a coded video frame; and sending the coded video frame and the reference distance to the decoding end.

Description

Video encoding and decoding method and device
This application claims priority to Chinese patent application No. 202110667857.9, filed with the China Patent Office on June 16, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, for example, to a video encoding and decoding method and device.
Background
To facilitate video transmission, the video sending end usually encodes the multiple video frames in a video stream before sending it, obtaining an encoded video stream, and then sends the encoded video stream to the video receiving end.
The encoded video stream mainly includes two types of video frames arranged at intervals: intra-coded frames (also called Intra frames or I frames) and inter-predicted frames (also called Inter frames or P frames). An I frame can be decoded independently; that is, when an I frame is decoded at the video receiving end, the decoded video frame can be obtained without referring to other frame data. A P frame cannot be decoded independently: when a P frame is decoded at the video receiving end, it must rely on the decoding of its previous video frame to obtain the decoded video frame. In other words, the correct decoding of a P frame depends on the correct decoding of its previous frame.
However, because a P frame can only be decoded by relying on its previous video frame, and video frames may be lost during transmission when the network status between the video sending end and the video receiving end is poor, some P frames are likely to fail to decode correctly. This reduces the efficiency of correct decoding of video frames and increases the probability of problems such as playback freezes at the video receiving end.
Summary
The present application provides a video encoding and decoding method and device, an electronic device, and a storage medium.
The present application provides a video encoding and decoding method, applied to an encoding end, including:
acquiring a non-first video frame to be encoded, a transmission frame loss rate between the encoding end and the decoding end, and a reference frame set, where the reference frame set includes a video frame corresponding to at least one encoded video frame, and in the reference frame set, a video frame corresponding to an encoded video frame that the decoding end can successfully decode is a reliable frame;
determining, according to the frame loss rate and a video encoding rule, a target reference frame corresponding to the video frame to be encoded from the reference frame set, where the video encoding rule includes: the greater the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame;
determining the distance between the video frame to be encoded and the target reference frame in display timing as the reference distance of the video frame to be encoded;
encoding the video frame to be encoded by using the target reference frame to obtain an encoded video frame;
sending the encoded video frame and the reference distance to the decoding end.
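To make the encoding-end steps concrete, the sketch below shows one possible way to satisfy the stated rule that a greater frame loss rate leads to more frames referencing a reliable frame. The specific mapping is a hypothetical choice, not the rule used in the application: over any run of frames, roughly a fraction equal to the loss rate is pointed at the reliable frame, and the rest fall back to the previous frame. The function names and the use of frame-number difference as the reference distance (frame numbers follow display order) are likewise assumptions.

```python
import math


def uses_reliable_reference(frame_number, loss_rate):
    """Hypothetical rule: about loss_rate of all frames reference the reliable frame."""
    p = min(max(loss_rate, 0.0), 1.0)
    return math.floor(frame_number * p) > math.floor((frame_number - 1) * p)


def choose_reference(frame_number, loss_rate, previous_frame, reliable_frame):
    """Return (target reference frame number, reference distance)."""
    if uses_reliable_reference(frame_number, loss_rate):
        target = reliable_frame
    else:
        target = previous_frame  # assumption: default to the previous frame
    # Reference distance: gap in display order between the frame and its reference.
    return target, frame_number - target


def count_reliable(loss_rate, frames=100):
    """Count how many of the frames 1..frames reference the reliable frame."""
    return sum(uses_reliable_reference(n, loss_rate) for n in range(1, frames + 1))


print(count_reliable(0.0), count_reliable(0.25), count_reliable(1.0))  # 0 25 100
print(choose_reference(30, 1.0, previous_frame=29, reliable_frame=25))  # (25, 5)
```

Only the reference distance needs to accompany the encoded frame, because the decoding end can recover the target frame number from the frame's own number.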
The present application provides a video encoding and decoding method, applied to a decoding end, including:
receiving an encoded video frame and a reference distance sent by an encoding end according to the above video encoding and decoding method;
acquiring a decoded frame set, where the decoded frame set includes at least one decoded video frame, and the number of video frames in the decoded frame set is greater than or equal to the number of video frames in the reference frame set of the decoding end;
when it is determined, according to the reference distance, that the decoded frame set includes the target reference frame corresponding to the encoded video frame, acquiring that target reference frame; and
decoding the encoded video frame by using the target reference frame to obtain a decoded video frame.
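As a minimal illustrative sketch of the decoding-end steps above (not the implementation of the present application), the target reference frame may be located by subtracting the reference distance from the frame number of the received encoded video frame. The function name `locate_target_reference`, the parameter names, and the use of a dictionary keyed by frame number are assumptions made only for illustration.

```python
from typing import Dict, Optional


def locate_target_reference(
    decoded_frames: Dict[int, bytes], frame_no: int, ref_delta: int
) -> Optional[bytes]:
    """Return the decoded frame that lies ref_delta frames before frame_no.

    decoded_frames maps a frame number to its decoded picture (a hypothetical
    layout). None is returned when the target reference frame is not in the
    decoded frame set, in which case the frame cannot be decoded from it.
    """
    target_no = frame_no - ref_delta
    return decoded_frames.get(target_no)


decoded = {1: b"frame-1", 2: b"frame-2"}
print(locate_target_reference(decoded, frame_no=3, ref_delta=1))  # b'frame-2'
print(locate_target_reference(decoded, frame_no=5, ref_delta=1))  # None
```

In this sketch a missing target reference frame is reported as `None`, leaving the caller to decide how to handle it.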
The present application provides a video encoding and decoding apparatus, applied to an encoding end, including:
an acquisition module, configured to acquire a non-first video frame to be encoded, a transmission frame loss rate between the encoding end and a decoding end, and a reference frame set, where the reference frame set includes a video frame corresponding to at least one encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is a reliable frame;
a determination module, configured to determine, from the reference frame set, a target reference frame corresponding to the video frame to be encoded according to the frame loss rate and a video encoding rule, where the video encoding rule includes: the higher the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame; and further configured to determine the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded;
an encoding module, configured to encode the video frame to be encoded by using the target reference frame to obtain an encoded video frame; and
a sending module, configured to send the encoded video frame and the reference distance to the decoding end.
The present application provides a video encoding and decoding apparatus, applied to a decoding end, including:
a receiving module, configured to receive an encoded video frame and a reference distance sent by an encoding end according to the above video encoding and decoding method;
an acquisition module, configured to acquire a decoded frame set, where the decoded frame set includes at least one decoded video frame, and the number of video frames in the decoded frame set is greater than or equal to the number of video frames in the reference frame set of the decoding end;
a determination module, configured to acquire the target reference frame corresponding to the encoded video frame when it is determined, according to the reference distance, that the decoded frame set includes that target reference frame; and
a decoding module, configured to decode the encoded video frame by using the target reference frame to obtain a decoded video frame.
The present application provides an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the above video encoding and decoding method.
The present application provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the above video encoding and decoding method.
Description of Drawings
FIG. 1 is a schematic structural diagram of a video processing system provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of another video processing system provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an IPPP frame structure provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of another IPPP frame structure provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an encoded video frame structure of temporally scalable Scalable Video Coding (SVC) provided by an embodiment of the present application;
FIG. 6 is a flowchart of a video encoding and decoding method provided by an embodiment of the present application;
FIG. 7 is a flowchart of another video encoding and decoding method provided by an embodiment of the present application;
FIG. 8 is a flowchart of yet another video encoding and decoding method provided by an embodiment of the present application;
FIG. 9 is a flowchart of still another video encoding and decoding method provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a reference frame set provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of the principle of an encoding sub-rule provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of the principle of another encoding sub-rule provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of the principle of yet another encoding sub-rule provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of the principle of acquiring a target reference frame provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of another principle of acquiring a target reference frame provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of a video frame reference relationship provided by an embodiment of the present application;
FIG. 17 is a block diagram of a video encoding and decoding apparatus provided by an embodiment of the present application;
FIG. 18 is a block diagram of another video encoding and decoding apparatus provided by an embodiment of the present application;
FIG. 19 is a block diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments are described here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments are merely examples of apparatuses and methods consistent with some aspects of the present application.
Refer to FIG. 1, which shows a schematic structural diagram of a video processing system provided by an embodiment of the present application. The video processing system is the implementation environment of the video encoding and decoding method. As shown in FIG. 1, the video processing system may include an encoding end 101 and at least one decoding end 102; FIG. 1 takes one decoding end 102 as an example. The encoding end 101 and the decoding end 102 may be connected through a wired network or a wireless network. Both the encoding end 101 and the decoding end 102 may be located on electronic devices. The electronic device may be a mobile terminal, such as a mobile phone, a computer, a multimedia player, an e-reader, or a wearable device. The encoding end and the decoding end may implement their functions through the operating system of the electronic device or through a client installed on the electronic device.
Optionally, on the basis of the video processing system shown in FIG. 1, refer to FIG. 2, which shows a schematic structural diagram of another video processing system provided by an embodiment of the present application. As shown in FIG. 2, the encoding end 101 includes a feedback sorting module 1011, an encoding module 1012, a sending module 1013, and a decoding module 1014. The encoding module 1012 is connected to the feedback sorting module 1011 and to the sending module 1013. The decoding end 102 includes a feedback sorting module 1021, an encoding module 1022, a sending module 1023, and a decoding module 1024. The encoding module 1022 is connected to the feedback sorting module 1021 and to the sending module 1023. Modules of the same name in the encoding end 101 and the decoding end 102 have the same functions. The feedback sorting module is configured to organize the decoding feedback information sent by the peer end. The encoding module is configured to generate encoded video frames from the video to be encoded, forming a video stream. The sending module is configured to send encoded video frames to the decoding module of the peer end. The decoding module is configured to decode received encoded video frames to obtain decoded video frames.
In the embodiments of the present application, the video encoding and decoding method may be applied to real-time communication (RTC) scenarios, which may include video call scenarios, live streaming scenarios, and the like. For example, in a live streaming scenario, when an anchor user is streaming live video, the encoding end is located on the anchor terminal used for the live stream. The anchor terminal performs the video encoding method on the video to be encoded to generate a video stream corresponding to a video of a certain definition, and sends the generated video stream to a viewer terminal. The viewer terminal is the terminal of a user who watches the anchor user's live video. The decoding end is located on the viewer terminal and decodes the encoded video frames in the video stream to obtain decoded video frames, thereby obtaining a video of a certain definition.
For ease of understanding, some technical terms used below are explained here.
1. Intra-frame predictive coding and inter-frame predictive coding
When a video frame is encoded, either an intra-frame predictive coding mode or an inter-frame predictive coding mode may be used. When a video frame is encoded in the intra-frame predictive coding mode, an I frame is generated without using any other video frame. When a video frame is encoded in the inter-frame predictive coding mode, the preceding video frame in display order may be used as a reference frame to generate a P frame. Inter-frame predictive coding here is forward predictive coding.
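The difference between the two modes can be sketched with a deliberately simplified toy model, assuming a frame is a flat list of sample values and a P frame stores only the sample-wise difference from its reference. Real codecs add motion compensation, transforms, quantization, and entropy coding, none of which is modeled here; all function names are hypothetical.

```python
from typing import List


def encode_intra(frame: List[int]) -> List[int]:
    """I frame: coded from the frame's own samples only (here, copied as-is)."""
    return list(frame)


def encode_inter(frame: List[int], reference: List[int]) -> List[int]:
    """P frame: code only the sample-wise difference from the reference."""
    return [cur - ref for cur, ref in zip(frame, reference)]


def decode_inter(residual: List[int], reference: List[int]) -> List[int]:
    """Rebuild a P frame; this requires the same reference used for encoding."""
    return [res + ref for res, ref in zip(residual, reference)]


previous = [10, 12, 14, 16]
current = [11, 12, 15, 16]
residual = encode_inter(current, previous)
print(residual)                          # [1, 0, 1, 0]
print(decode_inter(residual, previous))  # [11, 12, 15, 16]
```

The sketch makes the dependency visible: `decode_inter` cannot reproduce the frame unless the decoder holds the same reference frame the encoder used.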
2. Encoded video frame structure in IPPP mode
In real-time video communication, the IPPP frame structure is generally used to improve video compression efficiency, that is, multiple P frames are encoded after an I frame. FIG. 3 shows the frame reference relationship of multiple video frames under the IPPP frame structure. The sequence is: I frame, P frame, P frame ... P frame, I frame, P frame, P frame ... P frame ... I frame, P frame, P frame ... P frame, and so on. In FIG. 3, I denotes an I frame and P denotes a P frame. An arrow between video frames in FIG. 3 points to the reference frame of a video frame. The only reference frame of each P frame is the frame immediately before it.
However, when network conditions are poor (for example, jitter, packet loss, or rate limiting), this frame structure easily causes video stuttering. Referring to FIG. 4, under the frame structure shown in FIG. 3, if frame X is lost, none of the P frames after frame X and before the next I frame, that is, the P-failure frame set marked in FIG. 4, can be decoded correctly. In this case, even if poor network conditions cause the loss of only a few frames, many frames cannot be decoded correctly at the decoding end, and the video played at the decoding end stutters.
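The error propagation described above can be sketched as follows, assuming a frame sequence is represented as a list of "I" and "P" labels indexed from 0 and that every P frame references the frame immediately before it. The function name `undecodable_after_loss` is hypothetical.

```python
from typing import List


def undecodable_after_loss(frame_types: List[str], lost_index: int) -> List[int]:
    """Return the indices that cannot be decoded when frame lost_index is lost.

    Each P frame references the frame immediately before it, so the failure
    spreads forward until the next I frame restarts the chain.
    """
    failed = [lost_index]
    for i in range(lost_index + 1, len(frame_types)):
        if frame_types[i] == "I":
            break
        failed.append(i)
    return failed


sequence = list("IPPPPIPPP")
print(undecodable_after_loss(sequence, 2))  # [2, 3, 4]
```

A single lost frame therefore disables every frame up to the next I frame, which is the P-failure behavior described for FIG. 4.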
3. SVC
SVC is an extension of mainstream video coding standards such as H.264. SVC uses a layered prediction structure and has three types: temporal scalability, spatial scalability, and quality scalability. Taking the temporal scalability type as an example, SVC encoding yields a base layer and multiple enhancement layers of the original video.
Refer to FIG. 5, which shows a schematic diagram of a temporally scalable SVC encoded video frame structure. As shown in FIG. 5, the video frames of the video to be encoded may be divided into a temporal base layer (Layer0) and one enhancement layer (Layer1). The video frames of the base layer may use intra-frame predictive coding or inter-frame predictive coding, producing a periodically repeating encoded frame structure in IPPP mode. That is, in FIG. 5, the encoded video frames corresponding to the base-layer video frames consist of an I frame followed by multiple P frames, then another I frame followed by multiple P frames. The video frames of the enhancement layer (the higher layer) may be encoded by using base-layer video frames as reference frames to obtain encoded video frames. In FIG. 5, p denotes an encoded video frame obtained by encoding an enhancement-layer video frame with a base-layer video frame as its reference frame. An arrow between video frames in FIG. 5 points to the reference frame of a video frame.
Under the SVC frame structure, the loss of an encoded video frame in the enhancement layer does not affect the decoding of encoded video frames in the base layer. For example, if frame X in FIG. 5 is lost, only frame X cannot be decoded correctly. All other encoded video frames can still be decoded correctly, which reduces the probability of playback stuttering at the decoding end caused by encoded video frames that cannot be decoded.
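The effect of the layered structure on loss propagation can be sketched with a small reference map. The layout below is an assumption made for illustration: frames are indexed from 0, even indices form the base layer, odd indices form the enhancement layer, each base-layer P frame references the previous base-layer frame, and each enhancement-layer frame references the base-layer frame just before it. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def build_temporal_svc_refs(frame_count: int) -> Dict[int, Optional[int]]:
    """Map each frame index to the index of its reference frame."""
    refs: Dict[int, Optional[int]] = {}
    for i in range(frame_count):
        if i == 0:
            refs[i] = None       # I frame, no reference
        elif i % 2 == 0:
            refs[i] = i - 2      # base layer: previous base-layer frame
        else:
            refs[i] = i - 1      # enhancement layer: preceding base-layer frame
    return refs


def undecodable(refs: Dict[int, Optional[int]], lost: Set[int]) -> List[int]:
    """Propagate losses forward along the reference map."""
    failed = set(lost)
    for i in sorted(refs):       # references always point backwards
        ref = refs[i]
        if ref is not None and ref in failed:
            failed.add(i)
    return sorted(failed)


refs = build_temporal_svc_refs(8)
print(undecodable(refs, {3}))  # [3]
print(undecodable(refs, {2}))  # [2, 3, 4, 5, 6, 7]
```

Losing an enhancement-layer frame affects only that frame, whereas losing a base-layer frame still propagates, which is why the choice of reference frame for base-layer frames matters.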
Refer to FIG. 6, which shows a flowchart of a video encoding and decoding method provided by an embodiment of the present application. The method is applied to the encoding end shown in FIG. 1 and FIG. 2. As shown in FIG. 6, the method includes:
Step 601: Acquire a non-first video frame to be encoded, a transmission frame loss rate between the encoding end and the decoding end, and a reference frame set, where the reference frame set includes a video frame corresponding to at least one encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is a reliable frame.
In the embodiments of the present application, the video encoding method may be applied to an entire video to be encoded or to one encoding period of the video to be encoded. The video to be encoded may include multiple encoding periods, and each encoding period may include multiple video frames. If the video encoding method is applied to an entire video to be encoded, the first video frame to be encoded is the first frame, in display order, among the video frames to be encoded in that video, and a non-first video frame to be encoded is any other video frame to be encoded in that video. If the video encoding method is applied to one encoding period, the first video frame to be encoded is the first frame, in display order, among the video frames to be encoded in that encoding period, and a non-first video frame to be encoded is any other video frame to be encoded in that encoding period.
The transmission frame loss rate between the encoding end and the decoding end is the ratio, within a set duration, of the number of video frames not received by the decoding end to the number of video frames transmitted by the encoding end to the decoding end; the video frames not received by the decoding end are the video frames lost at the decoding end. An encoded video frame is the video frame obtained by encoding a video frame to be encoded. The video frame corresponding to an encoded video frame is either the pre-encoding video frame corresponding to that encoded video frame or the reconstructed frame obtained by reconstructing the encoded video frame. For example, the encoding end may receive, from the decoding end, the number of encoded video frames that the decoding end received in each set duration, and obtain the number of encoded video frames that the encoding end transmitted in that set duration. The ratio of the number of encoded video frames not received by the decoding end to the number transmitted by the encoding end is taken as the transmission frame loss rate, where the number not received is the difference between the number transmitted by the encoding end and the number received by the decoding end within the set duration.
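The calculation described above can be sketched as follows. The function name `frame_loss_rate` and the handling of the case in which nothing was sent are assumptions for illustration; the text does not specify that case.

```python
def frame_loss_rate(sent: int, received: int) -> float:
    """Loss rate over one set duration: (sent - received) / sent.

    sent     : encoded frames transmitted by the encoding end in the window
    received : encoded frames the decoding end reports as received
    """
    if sent <= 0:
        return 0.0  # assumption: no traffic in the window means no measured loss
    lost = max(sent - received, 0)
    return lost / sent


print(frame_loss_rate(sent=100, received=90))  # 0.1
```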
The reference frame set includes a video frame corresponding to at least one encoded video frame. That is, the reference frame set includes the video frames corresponding to the encoded video frames already obtained by encoding the video to be encoded. Accordingly, after encoding a video frame to be encoded to obtain an encoded video frame, the encoding end may store the video frame corresponding to that encoded video frame, thereby building the reference frame set. For example, when the encoding end acquires the first of the non-first video frames to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame, which is obtained by encoding the first video frame to be encoded. When the encoding end acquires the second of the non-first video frames to be encoded, the acquired reference frame set includes the video frame corresponding to the first encoded video frame and the video frame corresponding to the second encoded video frame, where the second encoded video frame is obtained by encoding the first of the non-first video frames to be encoded. The reference frame set includes reliable frames; a reliable frame is a video frame corresponding to an encoded video frame that the decoding end can successfully decode. An encoded video frame that can be successfully decoded may be an encoded video frame that has already been successfully decoded. Alternatively, it may be an encoded video frame that has been successfully received, for example a successfully received I frame. Alternatively, it may be a successfully received encoded video frame whose reference frame has also been successfully received. A reference frame is the frame that a video frame refers to when it is encoded.
Optionally, the reference frame set, also called the first decoded picture buffer (DPB), may include the video frames respectively corresponding to a first target number of encoded video frames, that is, it may include a first target number of video frames, and the first target number may be greater than 1. The number of video frames that the reference frame set can hold is related to the number of short-term reference frames or long-term reference frames configured at the encoding end. For example, the first target number may be 8, 16, or 32. The encoding end may include a reconstructed frame buffer, which may be used to store the reference frame set.
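One possible way to hold such a reference frame set is sketched below, assuming frames are identified by frame number and that the oldest entry is evicted once the capacity is exceeded. The class and method names, the eviction policy, and the way reliability is recorded are all assumptions for illustration; the text above only states what the set contains and how large it may be.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefFrame:
    frame_no: int
    reliable: bool = False  # set once the decoding end is known to decode it


class ReferenceFrameSet:
    """Bounded store of reconstructed frames, oldest first (hypothetical)."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity
        self._frames: "OrderedDict[int, RefFrame]" = OrderedDict()

    def add(self, frame_no: int) -> None:
        self._frames[frame_no] = RefFrame(frame_no)
        while len(self._frames) > self.capacity:
            self._frames.popitem(last=False)  # assumption: evict the oldest

    def mark_reliable(self, frame_no: int) -> None:
        frame = self._frames.get(frame_no)
        if frame is not None:
            frame.reliable = True

    def latest(self) -> Optional[int]:
        return next(reversed(self._frames), None)

    def latest_reliable(self) -> Optional[int]:
        for frame in reversed(self._frames.values()):
            if frame.reliable:
                return frame.frame_no
        return None


dpb = ReferenceFrameSet(capacity=2)
for n in (1, 2):
    dpb.add(n)
dpb.mark_reliable(2)   # e.g. feedback says frame 2 was decoded
dpb.add(3)             # frame 1 is evicted
print(dpb.latest(), dpb.latest_reliable())  # 3 2
```

A practical design would probably protect the most recent reliable frame from eviction; that refinement is omitted to keep the sketch short.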
In the embodiments of the present application, the video frames in the reference frame set may serve as reference frames for most video frames to be encoded. For example, when the video to be encoded uses the IPPP encoded video frame structure, a video frame in the reference frame set may be the video frame corresponding to any encoded video frame. When the video to be encoded uses temporally scalable SVC encoding, a video frame in the reference frame set can only be a video frame corresponding to an encoded video frame obtained by encoding a base-layer video frame.
Step 602: Determine, from the reference frame set, a target reference frame corresponding to the video frame to be encoded according to the frame loss rate and a video encoding rule, where the video encoding rule includes: the higher the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame.
Optionally, the video frames to be encoded in the video to be encoded may carry frame numbers assigned in display order. The video encoding rule may include the following. When the frame loss rate is less than or equal to a first quantity threshold, the target reference frame of every video frame to be encoded may be the video frame closest to it in display order, that is, the video frame whose frame number is closest to that of the video frame to be encoded. When the frame loss rate is greater than the first quantity threshold, among the video frames to be encoded, the target reference frame of one video frame at every interval of a set number of video frames may be a reliable frame, and the target reference frame of each remaining video frame to be encoded may be the video frame closest to it in display order, that is, the video frame whose frame number is closest to that of the video frame to be encoded. The higher the frame loss rate, the smaller the set number used as the interval.
For example, when the video to be encoded uses the IPPP encoded video frame structure, assume that the video to be encoded includes five video frames in display order, namely a first video frame to a fifth video frame. The first and second video frames have already been encoded, so the reference frame set includes video frame A, corresponding to the encoded video frame obtained from the first video frame, and video frame B, corresponding to the encoded video frame obtained from the second video frame; video frame B is a reliable frame. The first quantity threshold may be 0. When the frame loss rate is 0, the target reference frame of the third video frame is video frame B, the target reference frame of the fourth video frame is the video frame corresponding to the encoded video frame obtained from the third video frame, and the target reference frame of the fifth video frame is the video frame corresponding to the encoded video frame obtained from the fourth video frame.
When the frame loss rate is greater than 0 and the set number is assumed to be 1, the target reference frame of the third video frame is video frame B, the target reference frame of the fourth video frame is the video frame corresponding to the encoded video frame obtained from the third video frame, and the target reference frame of the fifth video frame is video frame B.
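The IPPP example above can be reproduced with a short sketch. It assumes that the reliable frame stays the same while the listed frames are encoded, that the nearest frame is simply the previous frame number, and that the first listed frame starts the interval pattern; these choices, like the function name `choose_targets`, are assumptions for illustration rather than details stated in the text.

```python
from typing import Dict, List


def choose_targets(
    frame_nos: List[int],
    reliable_no: int,
    loss_rate: float,
    threshold: float = 0.0,
    interval: int = 1,
) -> Dict[int, int]:
    """Map each frame number to the frame number of its target reference."""
    targets: Dict[int, int] = {}
    for pos, frame_no in enumerate(frame_nos):
        # One frame references the reliable frame, then `interval` frames
        # reference their nearest neighbour, and the pattern repeats.
        use_reliable = loss_rate > threshold and pos % (interval + 1) == 0
        targets[frame_no] = reliable_no if use_reliable else frame_no - 1
    return targets


# Frames 3 to 5 remain to be encoded; frame 2 (video frame B) is reliable.
print(choose_targets([3, 4, 5], reliable_no=2, loss_rate=0.0))
# {3: 2, 4: 3, 5: 4}
print(choose_targets([3, 4, 5], reliable_no=2, loss_rate=0.1, interval=1))
# {3: 2, 4: 3, 5: 2}
```

With a smaller `interval`, more frames point at the reliable frame, which trades compression efficiency for resilience to loss.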
As another example, when the video to be encoded uses temporally scalable SVC encoding, the video encoding rule applies to all non-first video frames to be encoded in the base layer. Assume that the video to be encoded includes eight video frames in display order, namely a first video frame to an eighth video frame, where the odd-numbered video frames are in the base layer and the even-numbered video frames are in the enhancement layer. The first, second, and third video frames have already been encoded, so the reference frame set includes video frame A, corresponding to the encoded video frame obtained from the first video frame, and video frame C, corresponding to the encoded video frame obtained from the third video frame; video frame C is a reliable frame.
The first quantity threshold may be 0. When the frame loss rate is 0, the target reference frame of the fourth video frame is video frame C, the target reference frame of the fifth video frame is video frame C, the target reference frame of the sixth video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame, the target reference frame of the seventh video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame, and the target reference frame of the eighth video frame is the video frame corresponding to the encoded video frame obtained from the seventh video frame.
When the frame loss rate is greater than 0 and the set number is assumed to be 0, the target reference frame of the fourth video frame is video frame C, the target reference frame of the fifth video frame is video frame C, the target reference frame of the sixth video frame is the video frame corresponding to the encoded video frame obtained from the fifth video frame, the target reference frame of the seventh video frame is video frame C, and the target reference frame of the eighth video frame is the video frame corresponding to the encoded video frame obtained from the seventh video frame.
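The SVC example above can likewise be sketched in code. The sketch assumes, as in the example, that odd frame numbers are in the base layer and even frame numbers are in the enhancement layer, that the rule is applied only to base-layer frames, and that the reliable frame does not change while the listed frames are encoded. The function name `choose_targets_svc` and its parameters are hypothetical.

```python
from typing import Dict, List


def choose_targets_svc(
    frame_nos: List[int],
    reliable_no: int,
    loss_rate: float,
    threshold: float = 0.0,
    interval: int = 0,
) -> Dict[int, int]:
    """Map each frame number to the frame number of its target reference."""
    targets: Dict[int, int] = {}
    base_pos = 0  # position counted over base-layer frames only
    for frame_no in frame_nos:
        if frame_no % 2 == 1:
            # Base layer: either the reliable frame or the previous base frame.
            use_reliable = (
                loss_rate > threshold and base_pos % (interval + 1) == 0
            )
            targets[frame_no] = reliable_no if use_reliable else frame_no - 2
            base_pos += 1
        else:
            # Enhancement layer: the base-layer frame just before it.
            targets[frame_no] = frame_no - 1
    return targets


# Frames 4 to 8 remain to be encoded; frame 3 (video frame C) is reliable.
print(choose_targets_svc([4, 5, 6, 7, 8], reliable_no=3, loss_rate=0.0))
# {4: 3, 5: 3, 6: 5, 7: 5, 8: 7}
print(choose_targets_svc([4, 5, 6, 7, 8], reliable_no=3, loss_rate=0.1))
# {4: 3, 5: 3, 6: 5, 7: 3, 8: 7}
```

Only the seventh frame changes between the two cases, matching the example: under loss it skips the fifth frame and references the reliable frame directly.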
Step 603: Determine the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded.
In this embodiment of the present application, the encoding end may determine the interval, counted in video frames, between the video frame to be encoded and the target reference frame as the reference distance (RefDelta) of the video frame to be encoded. Optionally, when the video frames to be encoded in the video to be encoded carry frame numbers assigned in display order, the encoding end may determine the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame as the reference distance of the video frame to be encoded. For example, if the frame number of the video frame to be encoded is 3 (the third video frame displayed) and the frame number of the target reference frame is 1, the reference distance is 2.
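The frame-number form of the reference distance can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `reference_distance` is hypothetical, and the check that the reference precedes the current frame is an assumption based on the examples in the text.

```python
def reference_distance(frame_number: int, target_ref_number: int) -> int:
    """Return RefDelta as the frame-number difference.

    Assumption: the target reference frame always precedes the frame to be
    encoded in display order, as in every example given in the text.
    """
    if target_ref_number >= frame_number:
        raise ValueError("target reference frame must precede the current frame")
    return frame_number - target_ref_number


# Example from the text: frame number 3 referencing frame number 1.
ref_delta = reference_distance(3, 1)
```

With the values from the example above, `ref_delta` is 2.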
Step 604: Encode the video frame to be encoded using the target reference frame to obtain an encoded video frame.
Optionally, the encoding end may apply predictive encoding to the video frame to be encoded, using the target reference frame, to obtain the encoded video frame. For example, the encoding end may apply inter-frame predictive encoding to the video frame to be encoded, using the target reference frame, to obtain the encoded video frame.
Step 605: Send the encoded video frame and the reference distance to the decoding end.
In this embodiment of the present application, the encoding end sends the encoded video frame and the reference distance to the decoding end over the network connecting the two. After receiving the encoded video frame and the reference distance, the decoding end obtains the target reference frame corresponding to the encoded video frame according to the reference distance, and decodes the encoded video frame using the target reference frame to obtain a decoded video frame.
In summary, the video encoding and decoding method provided by this embodiment of the present application acquires a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set. According to the frame loss rate and the video encoding rule, the target reference frame corresponding to the video frame to be encoded is determined from the reference frame set. The distance in display order between the video frame to be encoded and the target reference frame is determined as the reference distance of the video frame to be encoded. The video frame to be encoded is encoded using the target reference frame to obtain an encoded video frame, and the encoded video frame and the reference distance are sent to the decoding end. The reference frame set includes a video frame corresponding to at least one encoded video frame, and includes, as a reliable frame, a video frame corresponding to an encoded video frame that the decoding end can successfully decode. The video encoding rule includes: the higher the frame loss rate, the more video frames in the video to be encoded have a reliable frame as their target reference frame. Therefore, when a poor network state between the video sending end and the video receiving end causes video frames to be lost in transmission, that is, when the frame loss rate is greater than 0, a higher frame loss rate causes more of the video frames to be encoded to be encoded with a reliable frame. This reduces the probability that some encoded video frames cannot be decoded correctly because of lost video frames, raises the rate of correctly decoded video frames, and in turn reduces the probability of problems such as playback stalls at the decoding end.
Please refer to FIG. 7, which shows a flowchart of a video encoding and decoding method provided by an embodiment of the present application. The method is applied to the decoding end shown in FIG. 1 and FIG. 2. As shown in FIG. 7, the method includes:
Step 701: Receive the encoded video frame and the reference distance sent by the encoding end.
The encoded video frame received by the decoding end is an encoded video frame generated by the encoding end according to a video encoding and decoding method provided by an embodiment of the present application.
Step 702: Acquire a decoded frame set. The decoded frame set includes at least one decoded video frame, and the number of video frames included in the decoded frame set is greater than or equal to the number of video frames included in the reference frame set at the decoding end.
In this embodiment of the present application, the decoded frame set includes at least one decoded video frame. After decoding an encoded video frame received from the encoding end, the decoding end may store the resulting decoded video frame, thereby building the decoded frame set.
Optionally, the decoded frame set is also called the second decoded picture buffer (DPB). It may include a second target number of decoded video frames, and the second target number may be greater than 1. For example, the second target number of video frames included in the decoded frame set is the same as the first target number of video frames included in the reference frame set.
Step 703: When it is determined, according to the reference distance, that the decoded frame set includes the target reference frame corresponding to the encoded video frame, acquire the target reference frame corresponding to the encoded video frame.
Optionally, when the video frames to be encoded in the video to be encoded carry frame numbers assigned in display order, the reference distance of a video frame to be encoded may be the difference between its frame number and the frame number of its target reference frame. According to the reference distance, the decoding end may determine the target frame number, which differs from the frame number of the encoded video frame by that difference. When the decoded frame set includes a decoded video frame with the target frame number, the decoding end acquires that decoded video frame from the decoded frame set and uses it as the target reference frame corresponding to the encoded video frame. When the decoded frame set does not include a decoded video frame with the target frame number, the decoding end cannot obtain the target reference frame corresponding to the encoded video frame. In that case, even though the encoded video frame was received successfully, it cannot be decoded correctly, so the decoding end discards it.
For example, if the frame number of the encoded video frame is 3 and the reference distance is 2, the target reference frame corresponding to the encoded video frame is the video frame with frame number 1 in the decoded frame set. The decoding end traverses the frame numbers of the video frames included in the decoded frame set. If the decoded frame set includes a video frame with frame number 1, that video frame is used as the target reference frame corresponding to the encoded video frame. If the decoded frame set does not include a video frame with frame number 1, the received encoded video frame is discarded.
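The lookup described above can be sketched as follows. This is an illustration under stated assumptions: the decoded frame set is modeled as a dictionary keyed by frame number, frame payloads are placeholder byte strings, and the name `find_target_reference` is hypothetical.

```python
from typing import Dict, Optional


def find_target_reference(decoded_frames: Dict[int, bytes],
                          frame_number: int,
                          ref_delta: int) -> Optional[bytes]:
    """Return the decoded frame used as target reference, or None if absent.

    A None result means the received encoded frame cannot be decoded
    correctly and should be discarded.
    """
    target_number = frame_number - ref_delta  # frame number of the reference
    return decoded_frames.get(target_number)


# Example from the text: encoded frame 3 with reference distance 2.
decoded_set = {1: b"decoded-frame-1"}
reference = find_target_reference(decoded_set, frame_number=3, ref_delta=2)
if reference is None:
    print("discard encoded frame 3")
else:
    print("decode frame 3 using frame 1")
```

A dictionary replaces the traversal of frame numbers mentioned in the text; the outcome is the same, and the lookup does not depend on the size of the decoded frame set.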
Step 704: Decode the encoded video frame using the target reference frame to obtain a decoded video frame.
Optionally, the decoding end may apply predictive decoding to the encoded video frame, using the target reference frame, to obtain the decoded video frame. For example, the decoding end may apply inter-frame predictive decoding to the encoded video frame, using the target reference frame, to obtain the decoded video frame.
In summary, the video encoding and decoding method provided by this embodiment of the present application receives the encoded video frame and the reference distance generated by the encoding end according to a video encoding and decoding method provided by an embodiment of the present application, and acquires a decoded frame set that includes at least one decoded video frame. The target reference frame corresponding to the encoded video frame can then be obtained from the decoded frame set according to the reference distance, and the encoded video frame is decoded using the target reference frame to obtain a decoded video frame. Therefore, when a poor network state between the video sending end and the video receiving end causes video frames to be lost in transmission, a higher frame loss rate means that more encoded video frames were encoded with a reliable frame, that is, a frame corresponding to an encoded video frame that the decoding end can successfully decode. This reduces the probability that some encoded video frames cannot be decoded correctly because of lost video frames, raises the rate of correctly decoded video frames, and in turn reduces the probability of problems such as playback stalls at the decoding end.
In the video encoding and decoding method provided by the embodiments of the present application, the video to be encoded may be encoded in IPPP mode to obtain an IPPP encoded video frame structure, may be encoded with SVC to obtain an SVC encoded video frame structure, or may be encoded in another mode. The embodiments shown in FIG. 8 and FIG. 9 below take temporally scalable SVC encoding of the video to be encoded as an example. In that case the video to be encoded consists of multiple video frames, which may include one base layer and at least one enhancement layer.
Please refer to FIG. 8 and FIG. 9, which show flowcharts of a video encoding and decoding method provided by an embodiment of the present application. The method is applied to the video processing system shown in FIG. 1 and FIG. 2.
As shown in FIG. 8, for the first video frame to be encoded, the method includes:
Step 801: The encoding end acquires the first video frame to be encoded.
In this embodiment of the present application, the video encoding method may be applied to an entire video to be encoded, or to one encoding cycle within the video to be encoded. The video to be encoded may include multiple encoding cycles, and each encoding cycle may include multiple video frames. If the method is applied to an entire video to be encoded, the first video frame to be encoded is the first in display order among the video frames to be encoded in that video, and the non-first video frames to be encoded are all the other video frames to be encoded in that video. If the method is applied to one encoding cycle, the first video frame to be encoded is the first in display order among the video frames to be encoded in that cycle, and the non-first video frames to be encoded are all the other video frames to be encoded in that cycle.
Step 802: The encoding end applies intra-frame predictive encoding to the first video frame to be encoded to obtain the first encoded video frame.
In this embodiment of the present application, the encoding end applies intra-frame predictive encoding to the first video frame to be encoded to obtain the first encoded video frame, which is an I frame.
Step 803: The encoding end adds the video frame corresponding to the first encoded video frame to the reference frame set, and selects that video frame as a reliable frame.
In this embodiment of the present application, the reference frame set includes a video frame corresponding to at least one encoded video frame. That is, the reference frame set includes the video frames corresponding to the encoded video frames already obtained by encoding the video to be encoded. The first video frame is a base-layer video frame. The encoding end adds the video frame corresponding to the first encoded video frame, obtained by encoding the first video frame, to the reference frame set. Because the first encoded video frame is an I frame, and an I frame is guaranteed to decode successfully once the decoding end has received it, the video frame corresponding to the first encoded video frame can be selected as the first reliable frame in the reference frame set.
Step 804: The encoding end sends the first encoded video frame to the decoding end.
Step 805: The decoding end applies intra-frame predictive decoding to the first encoded video frame to obtain the first decoded video frame.
In this embodiment of the present application, after receiving the first encoded video frame sent by the encoding end, the decoding end may apply intra-frame predictive decoding to this first I frame to obtain the first decoded video frame corresponding to the first encoded video frame.
Step 806: The decoding end adds the first decoded video frame to the decoded frame set.
In this embodiment of the present application, the decoded frame set may include at least one decoded video frame. The decoding end may add the first decoded video frame, obtained by decoding the first encoded video frame, to the decoded frame set, so that encoded video frames received later can be decoded based on the video frames in the decoded frame set.
As shown in FIG. 9, for each non-first video frame to be encoded, the method may further include:
Step 901: The encoding end acquires a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and the reference frame set.
The transmission frame loss rate between the encoding end and the decoding end is the ratio, within a set duration, of the number of video frames not received by the decoding end to the number of video frames transmitted from the encoding end to the decoding end. The video frames not received by the decoding end are the video frames lost at the decoding end. An encoded video frame is the video frame obtained by encoding a video frame to be encoded. The video frame corresponding to an encoded video frame is either the pre-encoding video frame from which it was obtained or the reconstructed frame obtained by reconstructing the encoded video frame. For example, the encoding end may receive from the decoding end the number of encoded video frames that the decoding end received in each set duration, and obtain the number of encoded video frames that the encoding end transmitted in that duration. The number of encoded video frames not received by the decoding end is the difference between the number transmitted by the encoding end and the number received by the decoding end within the set duration. The ratio of that difference to the number of encoded video frames transmitted is used as the transmission frame loss rate between the encoding end and the decoding end.
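The loss-rate computation described above can be sketched as follows. The function name `frame_loss_rate` and the handling of edge cases (nothing sent, or a late report that makes the received count exceed the sent count) are assumptions added for the sketch; the text does not address them.

```python
def frame_loss_rate(sent: int, received: int) -> float:
    """Ratio of frames not received to frames sent within one set duration."""
    if sent <= 0:
        return 0.0  # assumption: no frames sent means no measurable loss
    lost = max(sent - received, 0)  # assumption: clamp if reports arrive late
    return lost / sent


# Hypothetical window: 100 encoded frames sent, 90 reported as received.
rate = frame_loss_rate(sent=100, received=90)
```

For this hypothetical window, 10 of 100 frames are missing, so `rate` is 0.1.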
The reference frame set includes a video frame corresponding to at least one encoded video frame. That is, the reference frame set includes the video frames corresponding to the encoded video frames already obtained by encoding the video to be encoded. Accordingly, after encoding a video frame to be encoded, the encoding end may store the video frame corresponding to the resulting encoded video frame, thereby building the reference frame set. For example, when the encoding end acquires the first of the non-first video frames to be encoded, the reference frame set includes the video frame corresponding to the first encoded video frame, which was obtained by encoding the first video frame to be encoded. When the encoding end acquires the second of the non-first video frames to be encoded, the reference frame set includes the video frame corresponding to the first encoded video frame and the video frame corresponding to the second encoded video frame, which was obtained by encoding the first of the non-first video frames to be encoded. The reference frame set includes a reliable frame, which is a video frame corresponding to an encoded video frame that the decoding end can successfully decode. A successfully decodable encoded video frame may be an encoded video frame that has already been decoded successfully. Alternatively, it may be an encoded video frame that has been received successfully, for example a successfully received I frame. Alternatively, it may be an encoded video frame that has been received successfully and whose reference frame has also been received successfully. A reference frame is the frame that a video frame refers to when it is encoded.
Optionally, the reference frame set is also called the first DPB. It may include the video frames corresponding to a first target number of encoded video frames, that is, a first target number of video frames, and the first target number may be greater than 1. The number of video frames that the reference frame set can hold is related to the number of short-term reference frames or long-term reference frames configured at the encoding end. For example, the first target number may be 8, 16, or 32. The encoding end may include a reconstructed frame buffer, which may be used to store the reference frame set. For example, as shown in FIG. 10, the reference frame set DPB may include 16 video frames, that is, the first target number is 16. Assuming that the frame number of the current non-first video frame to be encoded is 29, the frame numbers of the video frames that the reference frame set may include are 1, 3, ..., 19, 21, 23, 25, and 27.
In this embodiment of the present application, the video frames in the reference frame set may serve as reference frames for most of the video frames to be encoded. For example, when the video to be encoded uses temporally scalable SVC encoding, the reference frame set may include only video frames corresponding to encoded video frames obtained by encoding base-layer video frames.
Step 902: The encoding end determines, according to the frame loss rate and the video encoding rule, the target reference frame corresponding to the video frame to be encoded from the reference frame set.
The video encoding rule includes: the higher the frame loss rate, the more video frames in the video to be encoded have a reliable frame as their target reference frame. Optionally, there may be multiple frame loss rate intervals, and the video encoding rule may include encoding sub-rules in one-to-one correspondence with the different frame loss rate intervals. When target reference frames are determined for the video frames of the same video to be encoded under different encoding sub-rules, the number of video frames whose target reference frame is a reliable frame differs. The process by which the encoding end determines the target reference frame according to the frame loss rate and the video encoding rule may then include: determining the target encoding sub-rule corresponding to the target frame loss rate interval to which the frame loss rate belongs, and determining the target reference frame corresponding to the video frame to be encoded according to the target encoding sub-rule.
For example, among the video frames to be encoded in the video to be encoded, the target reference frame of the video frames taken at every set number of frames may be a reliable frame. The target reference frame of each remaining video frame to be encoded may be the video frame closest to it in display order, that is, the video frame whose frame number is closest to its own. The set number differs between the encoding sub-rules for different frame loss rate intervals: the higher the frame loss rate of the interval, the smaller the set number.
In this embodiment of the present application, the video encoding rule may include a first encoding sub-rule, a second encoding sub-rule, and a third encoding sub-rule. The first encoding sub-rule is also called the unreliable reference rule, the second is also called the partially reliable reference rule, and the third is also called the fully reliable reference rule. For a base-layer video frame to be encoded, the first encoding sub-rule uses the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as its target reference frame. The second encoding sub-rule uses a reliable frame as the target reference frame of the video frames to be encoded taken at every set number of frames among all video frames to be encoded. The third encoding sub-rule uses a reliable frame as the target reference frame of every frame to be encoded. Optionally, for the other video frames to be encoded (including those in the enhancement layer), every encoding sub-rule uses the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as its target reference frame.
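Choosing a sub-rule from the frame loss rate can be sketched as follows. The text states only that each frame loss rate interval maps to one sub-rule and that a loss rate of 0 uses nearest-frame references. The boundary `high_loss = 0.1` between the second and third sub-rules is a hypothetical value chosen for illustration, as are the names `SubRule` and `select_sub_rule`.

```python
from enum import Enum


class SubRule(Enum):
    UNRELIABLE = 1          # first sub-rule: always reference the nearest frame
    PARTIALLY_RELIABLE = 2  # second sub-rule: periodically reference the reliable frame
    FULLY_RELIABLE = 3      # third sub-rule: every base-layer frame references the reliable frame


def select_sub_rule(loss_rate: float, high_loss: float = 0.1) -> SubRule:
    """Map a frame loss rate to a sub-rule.

    `high_loss` is a hypothetical interval boundary, not a value from the text.
    """
    if loss_rate <= 0.0:
        return SubRule.UNRELIABLE
    if loss_rate <= high_loss:
        return SubRule.PARTIALLY_RELIABLE
    return SubRule.FULLY_RELIABLE
```

A real system would likely tune the boundaries, and could use more than three intervals, based on measured network behavior.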
For example, FIG. 11 to FIG. 13 are described below under the following assumptions: the currently acquired non-first video frame to be encoded has frame number 29; the video to be encoded also includes non-first video frames to be encoded with frame numbers 30, 31, 32, 33, and so on; the reference frame set can hold 16 video frames and includes the video frames with frame numbers 1, 3, ..., 19, 21, 23, 25, and 27; and the video frame with frame number 21 is the reliable frame.
Please refer to FIG. 11, which is a schematic diagram of the principle of the first encoding sub-rule provided by an embodiment of the present application. For a base-layer video frame to be encoded, the first encoding sub-rule uses the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as its target reference frame. As shown in FIG. 11, the target reference frame of the video frame to be encoded with frame number 29 is the video frame with frame number 27 in the reference frame set. The target reference frames of the video frames to be encoded with frame numbers 30, 31, 32, and 33 are the video frames with frame numbers 29, 29, 31, and 31, respectively.
Please refer to FIG. 12, which is a schematic diagram of the principle of the second encoding sub-rule provided by an embodiment of the present application. The second encoding sub-rule uses the reliable frame as the target reference frame of every other video frame to be encoded among all video frames to be encoded, that is, the set number for the interval is 1. As shown in FIG. 12, the target reference frames of the video frames to be encoded with frame numbers 29 and 33 are both the reliable frame. The target reference frames of the video frames to be encoded with frame numbers 30, 31, and 32 are the video frames with frame numbers 29, 29, and 31, respectively.
Please refer to FIG. 13, which is a schematic diagram of the principle of the third encoding sub-rule provided by an embodiment of the present application. The third encoding sub-rule uses the reliable frame as the target reference frame of every frame to be encoded. As shown in FIG. 13, the target reference frames of the video frames to be encoded with frame numbers 29, 31, and 33 are all the reliable frame. The target reference frames of the video frames to be encoded with frame numbers 30 and 32 are the video frames with frame numbers 29 and 31, respectively. In FIG. 11 to FIG. 13, each arrow starts at a video frame to be encoded and points to that frame's target reference frame.
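The three sub-rules can be sketched with one function that reproduces the assignments described for FIG. 11 to FIG. 13. This is a simplified illustration, not the applicant's implementation. It assumes that odd frame numbers are base-layer frames, that only base-layer frames enter the reference frame set, and that the reliable frame stays fixed at frame 21. The name `assign_references` and the `interval` parameter are hypothetical: `None` stands for the first sub-rule, 1 for the second, and 0 for the third.

```python
from typing import Dict, Iterable, Optional, Set


def assign_references(frames: Iterable[int],
                      reference_set: Set[int],
                      reliable: int,
                      interval: Optional[int]) -> Dict[int, int]:
    """Map each frame number to the frame number of its target reference.

    interval=None: never force the reliable frame (first sub-rule).
    interval=k: one base-layer frame in every k + 1 references the reliable
    frame (k=1 gives the second sub-rule, k=0 the third).
    """
    available = set(reference_set)  # copy so the caller's set is untouched
    refs: Dict[int, int] = {}
    base_count = 0
    for n in frames:
        nearest = max(r for r in available if r < n)
        is_base = n % 2 == 1  # assumption: odd frames form the base layer
        use_reliable = (is_base and interval is not None
                        and base_count % (interval + 1) == 0)
        refs[n] = reliable if use_reliable else nearest
        if is_base:
            base_count += 1
            available.add(n)  # encoded base-layer frames become references
    return refs


ref_set = set(range(1, 28, 2))  # frame numbers 1, 3, ..., 27
frames = range(29, 34)          # frame numbers 29 to 33
fig11 = assign_references(frames, ref_set, reliable=21, interval=None)
fig12 = assign_references(frames, ref_set, reliable=21, interval=1)
fig13 = assign_references(frames, ref_set, reliable=21, interval=0)
```

Under these assumptions `fig11` maps 29, 30, 31, 32, 33 to 27, 29, 29, 31, 31; `fig12` maps them to 21, 29, 29, 31, 21; and `fig13` maps them to 21, 29, 21, 31, 21. These match the assignments described for the three figures.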
由于待编码视频中,编码端编码处理时采用的目标参考帧的帧序号,与待编码的视频帧的帧序号最接近的待编码的视频帧越多,解码端播放解码后的视频帧时,得到的视频质量越高。因此,采用第一编码子规则、第二编码子规则、第三编码子规则得到视频的质量由高到低。且由于视频的质量越高,对编码端与解码端之间网络传输性能的要求越高,因此在编码端和解码端之间网络状态较差(例如,弱网)时,采用第一编码子规则、第二编码子规则、第三编码子规则对应的用于实时通讯下的视频的流畅性能由低到高。Because in the video to be encoded, the frame number of the target reference frame used by the encoding end during encoding processing, the more video frames to be encoded that are closest to the frame number of the video frame to be encoded, when the decoding end plays the decoded video frame, The higher the quality of the resulting video. Therefore, the quality of the video obtained by using the first encoding sub-rule, the second encoding sub-rule, and the third encoding sub-rule is from high to low. And because the higher the quality of the video, the higher the requirements for network transmission performance between the encoding end and the decoding end, so when the network status between the encoding end and the decoding end is poor (for example, a weak network), the first encoding subclass is used. The rules, the second coding sub-rule, and the third coding sub-rule correspond to the smooth performance of video under real-time communication from low to high.
Step 903: The encoding end determines the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded.
In this embodiment of the present application, the encoding end may determine the number of video frames between the video frame to be encoded and the target reference frame as the reference distance (RefDelta) of the video frame to be encoded. Optionally, when the video frames to be encoded in the video to be encoded have frame numbers assigned in display order, the encoding end may determine the difference between the frame number of the video frame to be encoded and the frame number of the target reference frame as the reference distance. For example, assume the frame number of the video frame to be encoded is 3, that is, it is the third video frame displayed, and the frame number of the target reference frame is 1. The reference distance is then 2.
Step 904: The encoding end encodes the video frame to be encoded using the target reference frame to obtain an encoded video frame.
Optionally, the encoding end may apply predictive coding to the video frame to be encoded using the target reference frame to obtain the encoded video frame. For example, the encoding end may apply inter-frame predictive coding to the video frame to be encoded using the target reference frame to obtain the encoded video frame.
Step 905: When the encoded video frame was obtained by encoding a video frame at the base layer, the encoding end adds the video frame corresponding to the encoded video frame to the reference frame set to obtain a new reference frame set.
Optionally, when the video to be encoded uses temporally scalable SVC encoding, the video frames in the reference frame set are video frames at the base layer. After encoding the video frame to be encoded to obtain the encoded video frame, the encoding end may determine whether the encoded video frame corresponds to a video frame to be encoded at the base layer. If the encoded video frame was not obtained by encoding a base-layer video frame, the video frame corresponding to the encoded video frame is not added to the reference frame set. If the encoded video frame was obtained by encoding a base-layer video frame, the video frame corresponding to the encoded video frame is added to the reference frame set to obtain a new reference frame set. Afterwards, when the encoding end encodes another non-first video frame to be encoded, it can obtain this new reference frame set. Based on the frame loss rate and the video encoding rule, the encoding end determines the target reference frame of the video frame to be encoded from the new reference frame set, so that the target reference frame can then be used to encode the video frame to be encoded.
For example, the encoded video frame may carry a layer identifier. The layer identifier indicates whether the video frame to be encoded that corresponds to the encoded video frame is at the base layer or at the enhancement layer. After obtaining the encoded video frame, the encoding end may add the video frame corresponding to the encoded video frame to the reference frame set when the layer identifier indicates the base layer. When the layer identifier indicates the enhancement layer, the video frame obtained by decoding the encoded video frame is not added to the reference frame set.
In this embodiment of the present application, the maximum number of video frames in the reference frame set may be a first target number. When the encoded video frame was obtained by encoding a base-layer video frame, the encoding end may compare the number of video frames currently in the reference frame set with the first target number. If the current number is less than the first target number, the encoding end may directly add the video frame corresponding to the encoded video frame to the reference frame set to obtain a new reference frame set. If the current number is equal to the first target number, the encoding end may delete the video frame with the smallest frame number from the reference frame set and then add the video frame corresponding to the encoded video frame to obtain a new reference frame set.
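The capacity-limited update of the reference frame set described above can be sketched as follows. The function name `add_to_reference_set` and the use of plain frame numbers in place of frame data are assumptions made for illustration, and the sketch assumes `max_size` (the first target number) is at least 1.

```python
def add_to_reference_set(reference_set, frame_no, max_size):
    """Return a new reference set that includes frame_no.

    If the set already holds max_size frames, the frame with the
    smallest frame number is removed first. Frames are represented by
    their frame numbers only (an assumption of this sketch).
    """
    updated = list(reference_set)
    if len(updated) >= max_size:
        updated.remove(min(updated))  # evict the smallest frame number
    updated.append(frame_no)
    return updated


print(add_to_reference_set([25, 27], 29, max_size=3))
print(add_to_reference_set([25, 27, 29], 31, max_size=3))
```

Returning a new list rather than mutating the input mirrors the text's wording of "obtaining a new reference frame set"; an in-place update would work equally well.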
In each case, the reference frame set obtained in step 901 is the new reference frame set that the encoding end obtained through step 905 when it executed the video encoding and decoding method for the previous non-first video frame to be encoded.
Step 906: The encoding end sends the encoded video frame and the reference distance to the decoding end.
Step 907: The decoding end obtains a decoded frame set. The decoded frame set includes at least one decoded video frame, and the number of video frames in the decoded frame set is greater than or equal to the number of video frames in the reference frame set of the decoding end.
In this embodiment of the present application, after receiving the encoded video frame and the reference distance sent by the encoding end, the decoding end may obtain the decoded frame set. The decoded frame set is also called the second DPB (decoded picture buffer). It may include a second target number of decoded video frames, and the second target number may be greater than 1. For example, the second target number of video frames in the decoded frame set is the same as the first target number of video frames in the reference frame set.
The decoded frame set may include at least one decoded video frame. The decoding end may decode each encoded video frame received from the encoding end and, after obtaining the decoded video frame, store the decoded video frames in the order in which they were received to obtain the decoded frame set.
Step 908: When the decoding end determines, according to the reference distance, that the decoded frame set includes the target reference frame of the encoded video frame, it obtains that target reference frame.
Optionally, when the video frames to be encoded in the video to be encoded have frame numbers assigned in display order, the reference distance of a video frame to be encoded may be the difference between its frame number and the frame number of the target reference frame. Based on the reference distance, the decoding end may determine the target frame number, which differs from the frame number of the encoded video frame by that difference. If the decoded frame set includes the decoded video frame with the target frame number, the decoding end obtains that decoded video frame from the decoded frame set and uses it as the target reference frame of the encoded video frame. If the decoded frame set does not include the decoded video frame with the target frame number, the decoding end cannot obtain that frame from the decoded frame set, that is, it cannot obtain the target reference frame of the encoded video frame. In that case, even though the decoding end received the encoded video frame successfully, it cannot decode the frame correctly because the target reference frame is unavailable. The decoding end therefore discards the encoded video frame when it determines that the decoded frame set does not include the decoded video frame with the target frame number.
For example, as shown in FIG. 14, assume the frame number of the currently received encoded video frame is 29, the frame numbers of the video frames in the decoded frame set are 1, 3...19, 23, 25, and 27, and the reference distance is 8. The target reference frame of the encoded video frame is then the video frame with frame number 21. The decoding end traverses the frame numbers of the video frames in the decoded frame set. Because the decoded frame set does not include a video frame with frame number 21, the received encoded video frame with frame number 29 is discarded.
As shown in FIG. 15, assume the frame number of the currently received encoded video frame is 29, the frame numbers of the video frames in the decoded frame set are 1, 3...19, 21, 23, and 27, and the reference distance is 8. The target reference frame of the encoded video frame is then the video frame with frame number 21. The decoding end traverses the frame numbers of the video frames in the decoded frame set. Because the decoded frame set includes a video frame with frame number 21, that video frame is used as the target reference frame of the encoded video frame.
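The decoder-side lookup in the FIG. 14 and FIG. 15 examples can be sketched as below. The names `find_target_reference`, `fig14`, and `fig15` are hypothetical, and a dictionary keyed by frame number stands in for the decoded frame set, with placeholder strings instead of real frame data.

```python
def find_target_reference(decoded_frames, frame_no, ref_delta):
    """Look up the target reference frame of an encoded video frame.

    decoded_frames maps frame number -> decoded frame. The target frame
    number is the received frame number minus the reference distance.
    Returns None when the reference is missing, in which case the
    received encoded video frame would be discarded.
    """
    target_no = frame_no - ref_delta
    return decoded_frames.get(target_no)


# Decoded frame sets from the two examples (odd frame numbers 1..19 plus extras).
fig14 = {n: f"frame-{n}" for n in list(range(1, 20, 2)) + [23, 25, 27]}
fig15 = {n: f"frame-{n}" for n in list(range(1, 20, 2)) + [21, 23, 27]}

print(find_target_reference(fig14, 29, 8))  # reference missing: discard frame 29
print(find_target_reference(fig15, 29, 8))  # reference found: decode frame 29
```

A dictionary replaces the traversal described in the text with a direct lookup; the outcome is the same.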
Step 909: The decoding end decodes the encoded video frame using the target reference frame to obtain a decoded video frame.
Optionally, the decoding end may apply predictive decoding to the encoded video frame using the target reference frame to obtain the decoded video frame. For example, the decoding end may apply inter-frame predictive decoding to the encoded video frame using the target reference frame to obtain the decoded video frame.
Step 910: When the decoded video frame is a video frame at the base layer, the decoding end adds the decoded video frame to the decoded frame set to obtain a new decoded frame set.
Optionally, when the video to be encoded uses temporally scalable SVC encoding, the decoded frame set mirrors the reference frame set: its video frames are the decoded video frames that correspond to base-layer video frames to be encoded. After decoding a received encoded video frame, the decoding end determines whether the encoded video frame corresponds to a video frame to be encoded at the base layer. If it does not, the decoded video frame is not stored in the decoded frame set and may simply be displayed. If it does, the decoded video frame is stored in the decoded frame set to obtain a new decoded frame set, and the decoded video frame is displayed. Afterwards, when the decoding end receives another encoded video frame from the encoding end, it can obtain this new decoded frame set. Based on the reference distance of the newly received encoded video frame, when the decoding end determines that the new decoded frame set includes the target reference frame of that encoded video frame, it obtains the target reference frame so that it can then be used to decode the encoded video frame.
For example, the encoded video frame received by the decoding end may carry a layer identifier. The layer identifier indicates whether the video frame to be encoded that corresponds to the encoded video frame is at the base layer or at the enhancement layer. After decoding the received encoded video frame, when the layer identifier indicates the base layer, the decoding end may store the decoded video frame in the decoded frame set and display it. When the layer identifier indicates the enhancement layer, the decoding end displays the decoded video frame without storing it.
In this embodiment of the present application, the maximum number of video frames in the decoded frame set may be a second target number. When the encoded video frame was obtained by encoding a base-layer video frame, the decoding end may compare the number of video frames currently in the decoded frame set with the second target number. If the current number is less than the second target number, the decoding end may directly add the decoded video frame to the decoded frame set to obtain a new decoded frame set. If the current number is equal to the second target number, the decoding end may delete the video frame with the smallest frame number from the decoded frame set and then add the decoded video frame to obtain a new decoded frame set.
In each case, the decoded frame set obtained in step 907 is the new decoded frame set that the decoding end obtained through step 910 when it executed the video encoding and decoding method for the previously received non-first video frame.
Step 911: The decoding end sends decoding feedback information to the encoding end. The decoding feedback information includes a frame number and a loss flag.
The loss flag reflects whether the decoding end successfully received the encoded video frame corresponding to the frame number and decoded it.
Optionally, the loss flag may take a not-lost state or a lost state. The not-lost state indicates that the decoding end successfully received the encoded video frame and successfully decoded it. In other words, a loss flag in the not-lost state reflects that the decoding end has decoded the encoded video frame corresponding to the frame number. The lost state may indicate that the decoding end successfully received the encoded video frame but did not successfully decode it. Alternatively, the lost state may indicate that the decoding end did not successfully receive the encoded video frame.
For example, when the decoding end determines that the decoded frame set includes the target reference frame of a received encoded video frame, it may set the loss flag of that encoded video frame to the not-lost state. When the decoding end determines that the decoded frame set does not include the target reference frame of a received encoded video frame, it may set the loss flag of that encoded video frame to the lost state. When the decoding end determines that the encoded video frame corresponding to a frame number has not been received, it may set the loss flag of that encoded video frame to the lost state.
Frame numbers are assigned to the video frames in the video to be encoded according to display order. Therefore, a video frame to be encoded, the encoded video frame obtained by encoding it, and the decoded video frame obtained by decoding that encoded video frame all share the same frame number.
Step 912: The encoding end updates the reliable frame in the new reference frame set based on the decoding feedback information.
In this embodiment of the present application, after receiving the decoding feedback information sent by the decoding end, the encoding end may update the reliable frame in the new reference frame set obtained in step 905 based on the decoding feedback information.
Optionally, the process by which the encoding end updates the reliable frame in the new reference frame set based on the decoding feedback information may include: from all the video frames in the new reference frame set, the encoding end selects as the reliable frame the video frame that is at the base layer, whose loss flag is in the not-lost state, whose target reference frame is a reliable frame, and whose frame number is the largest.
For example, the decoding end may send decoding feedback information to the encoding end each time it completes step 909, that is, each time it decodes an encoded video frame to obtain a decoded video frame. Alternatively, the decoding end may send decoding feedback information once after completing step 909 several times, that is, after decoding several encoded video frames. Accordingly, the decoding feedback information received by the encoding end each time may include only one frame number and its loss flag, or it may include multiple frame numbers and their loss flags. Alternatively, the decoding end may send decoding feedback information to the encoding end once every set time interval. The decoding feedback information received by the encoding end may then include the frame numbers of the encoded video frames transmitted between the encoding end and the decoding end within that interval, together with the loss flag of each frame number. For example, suppose that within one set time interval the encoding end sends encoded video frames with frame numbers X1, X2, and X3 to the decoding end. The decoding feedback information that the decoding end sends after the interval then includes frame number X1 and its loss flag, frame number X2 and its loss flag, and frame number X3 and its loss flag.
When the decoding feedback information includes multiple frame numbers and their loss flags, the frame numbers may be arranged in monotonically increasing order according to the display order of the corresponding video frames, that is, in ascending order of frame number.
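One way to assemble such a feedback message is sketched below. The function name `build_feedback`, the tuple layout `(frame_no, lost)`, and the two input sets are all assumptions for illustration; the source does not specify a message format.

```python
def build_feedback(sent_frame_nos, received, reference_available):
    """Build decoding feedback as (frame_no, lost) pairs in ascending order.

    sent_frame_nos: frame numbers covered by this feedback message.
    received: set of frame numbers that actually arrived.
    reference_available: set of received frame numbers whose target
        reference frame was found in the decoded frame set.
    A frame is flagged lost if it never arrived, or if it arrived but
    its target reference frame was missing.
    """
    feedback = []
    for frame_no in sorted(sent_frame_nos):
        lost = frame_no not in received or frame_no not in reference_available
        feedback.append((frame_no, lost))
    return feedback


# Frame 2 never arrived; frame 3 arrived but its reference was missing.
print(build_feedback([3, 1, 2], received={1, 3}, reference_available={1}))
```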
The encoding end may perform reliable frame judgment processing on each frame number in the decoding feedback information in ascending order of frame number, until the processing has been completed for all frame numbers in the decoding feedback information. In this way, among all the video frames in the new reference frame set, the video frame that is at the base layer, whose loss flag is in the not-lost state, whose target reference frame is a reliable frame, and whose frame number is the largest is taken as the reliable frame. The reliable frame judgment processing includes: determining whether the loss flag of the frame number is in the not-lost state, determining whether the frame number is greater than the frame number of the current reliable frame in the new reference frame set, determining whether the video frame with that frame number is at the base layer, and determining whether the target reference frame of that frame number is a reliable frame. If all four conditions hold, the video frame with that frame number in the new reference frame set is taken as the reliable frame, that is, it becomes the new reliable frame.
For example, assume the video frame with frame number 21 in the current reference frame set is the reliable frame, and the decoding feedback information sent by the decoding end includes frame numbers 22, 23, 24, and 25. The reference relationships among frame numbers 21, 22, 23, 24, and 25 are shown in FIG. 16. As shown in FIG. 16, the video frames with frame numbers 21, 23, and 25 are at the base layer, and the video frames with frame numbers 22 and 24 are at the enhancement layer. The arrows in FIG. 16 indicate reference relationships: the target reference frame of the video frames with frame numbers 22 and 23 has frame number 21, and the target reference frame of the video frames with frame numbers 24 and 25 has frame number 23.
In one example, assume the loss flags of all frame numbers in the decoding feedback information are in the not-lost state. After receiving the decoding feedback information, the encoding end first performs reliable frame judgment processing for frame number 22. The video frame with frame number 22 is at the enhancement layer, so it cannot be taken as the reliable frame. The encoding end then performs the processing for frame number 23. Its loss flag is in the not-lost state, frame number 23 is greater than frame number 21 of the current reliable frame in the new reference frame set, the video frame with frame number 23 is at the base layer, and its target reference frame 21 is a reliable frame. The video frame with frame number 23 in the new reference frame set is therefore updated to be the reliable frame. Next, the encoding end performs the processing for frame number 24. The video frame with frame number 24 is at the enhancement layer, so it cannot be taken as the reliable frame. Finally, the encoding end performs the processing for frame number 25. Its loss flag is in the not-lost state, frame number 25 is greater than frame number 23 of the current reliable frame, the video frame with frame number 25 is at the base layer, and its target reference frame 23 is a reliable frame. The video frame with frame number 25 in the new reference frame set is therefore updated to be the reliable frame. As a result, based on the decoding feedback information received this time, the encoding end updates the reliable frame in the new reference frame set to the video frame with frame number 25.
In another example, assume that in the decoding feedback information the loss flags for frame numbers 22, 24 and 25 all indicate the not-lost state, and the loss flag for frame number 23 indicates the lost state. After receiving the decoding feedback information, the encoding end may first perform reliable frame determination for frame number 22. The video frame with frame number 22 is in the enhancement layer, so the video frame with frame number 22 in the new reference frame set cannot serve as a reliable frame. Reliable frame determination is then performed for frame number 23. Its loss flag indicates the lost state, so the video frame with frame number 23 in the new reference frame set cannot serve as a reliable frame. Next, reliable frame determination is performed for frame number 24. The video frame with frame number 24 is in the enhancement layer, so it cannot serve as a reliable frame. Next, reliable frame determination is performed for frame number 25. Although its loss flag indicates the not-lost state, its reference frame 23 is lost, so the video frame with frame number 25 in the new reference frame set cannot serve as a reliable frame. Finally, based on the decoding feedback information received this time, the encoding end determines that the reliable frame in the new reference frame set is still the video frame with frame number 21.
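The two feedback examples above can be reproduced with a short sketch. It is an illustration only, not the applicant's implementation: the `RefFrame` record, the `update_reliable` function and the `known_reliable` set are hypothetical names, and the sketch assumes that "the target reference frame is a reliable frame" can be checked against the set of frames that have already been confirmed as reliable.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class RefFrame:
    number: int                 # frame number assigned in display order
    base_layer: bool            # True if the frame is in the base layer
    ref_number: Optional[int]   # frame number of its target reference frame


def update_reliable(frame_info: Dict[int, RefFrame],
                    feedback: Dict[int, bool],
                    reliable: int,
                    known_reliable: Set[int]) -> int:
    """Return the frame number of the reliable frame after one feedback message.

    feedback maps frame number -> True if the loss flag is in the lost state.
    known_reliable (an assumption of this sketch) holds every frame number
    that has been confirmed reliable so far; it is updated in place.
    """
    for number in sorted(feedback):
        frame = frame_info.get(number)
        if frame is None or feedback[number]:
            continue            # unknown frame, or lost
        if not frame.base_layer or number <= reliable:
            continue            # enhancement layer, or not newer than current
        if frame.ref_number in known_reliable:
            known_reliable.add(number)
            reliable = number   # promote the newest qualifying frame
    return reliable


# Hypothetical layering: odd frames in the base layer, even frames in the
# enhancement layer, each base-layer frame referencing the previous one.
info = {
    21: RefFrame(21, True, 19),
    22: RefFrame(22, False, 21),
    23: RefFrame(23, True, 21),
    24: RefFrame(24, False, 23),
    25: RefFrame(25, True, 23),
}
all_received = {22: False, 23: False, 24: False, 25: False}
frame_23_lost = {22: False, 23: True, 24: False, 25: False}
print(update_reliable(info, all_received, 21, {21}))
print(update_reliable(info, frame_23_lost, 21, {21}))
```

With every frame received the reliable frame advances to frame 25; with frame 23 lost it stays at frame 21, matching the two examples.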
In the embodiments of the present application, when the decoding end sends decoding feedback information, the encoding end may derive the transmission frame loss rate between the encoding end and the decoding end from that information. Optionally, the process in step 901 by which the encoding end obtains the transmission frame loss rate between the encoding end and the decoding end includes: the encoding end determines the transmission frame loss rate according to the loss flags received within a set time period.
For example, the encoding end may collect the decoding feedback information received from the decoding end within the most recent set time period t before the current moment at the encoding end. It counts the total number $N_{ack}$ of frame numbers included in that feedback, that is, the total number of video frames reported by the decoding feedback information, and the number $N_{loss}$ of loss flags in the lost state, that is, the number of video frames marked as lost. The encoding end takes the ratio of $N_{loss}$ to $N_{ack}$ as the transmission frame loss rate between the encoding end and the decoding end: $P_{loss} = N_{loss} / N_{ack}$.
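As a minimal sketch of this calculation, assuming each feedback entry is stored as a hypothetical `(receive_time, frame_number, lost)` tuple and that a window without feedback is treated as a loss rate of 0 (a choice the text does not specify):

```python
from typing import Iterable, Tuple


def frame_loss_rate(feedback: Iterable[Tuple[float, int, bool]],
                    now: float,
                    window: float) -> float:
    """P_loss = N_loss / N_ack over feedback received in [now - window, now]."""
    recent = [lost for (received_at, _number, lost) in feedback
              if now - window <= received_at <= now]
    if not recent:
        return 0.0  # assumption: no feedback in the window means no known loss
    n_ack = len(recent)                         # frames reported by the decoder
    n_loss = sum(1 for lost in recent if lost)  # frames flagged as lost
    return n_loss / n_ack


entries = [
    (1.0, 5, True),     # too old, outside the window
    (9.0, 22, False),
    (9.2, 23, True),
    (9.4, 24, False),
    (9.6, 25, False),
]
print(frame_loss_rate(entries, now=10.0, window=2.0))
```

Here four frames fall inside the window and one of them is lost, so the rate is 0.25.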
In the embodiments of the present application, the encoding end may include a sending buffer used to store the encoded video frames to be sent and their reference distances. Before the encoding end sends the encoded video frame and the reference distance to the decoding end, the video encoding and decoding method further includes: the encoding end writes the encoded video frame into the sending buffer. The process by which the encoding end sends the encoded video frame and the reference distance to the decoding end may then include: when the occupancy of the sending buffer is greater than a data volume threshold and the encoded video frame is a target encoded video frame, sending the encoded video frame to the decoding end, where a target encoded video frame is one obtained by encoding a video frame in the base layer; or, when the occupancy of the sending buffer is less than or equal to the data volume threshold, sending the encoded video frame to the decoding end.
Optionally, the encoded video frames obtained by encoding the video frames to be encoded may first be stored in the sending buffer (SendBuffer). The encoding end compares the occupancy of the sending buffer with the data volume threshold to gauge the current network transmission capability between the encoding end and the decoding end. When the occupancy is greater than the data volume threshold, too many encoded video frames are waiting in the sending buffer to be sent to the decoding end: the network transmission capability between the two ends is insufficient, so the encoded video frames stored in the buffer cannot be sent out in time. The encoding end may then delete the relatively less important encoded video frames stored in the sending buffer, so that the more important encoded video frames reach the decoding end in time over the limited network transmission capacity. When the occupancy is less than or equal to the data volume threshold, the sending buffer is not holding too many pending encoded video frames and the network transmission capability is sufficient. The encoding end may then send all encoded video frames stored in the sending buffer to the decoding end, to preserve the quality of the transmitted video.
For example, when the occupancy of the sending buffer is greater than the data volume threshold, the encoding end may send only the target encoded video frames, obtained by encoding video frames in the base layer, to the decoding end, so that the decoding end still receives encoded video frames that can be decoded continuously given the available network transmission capability. When the occupancy of the sending buffer is less than or equal to the data volume threshold, the encoding end may send the encoded video frames obtained from both the base layer and the enhancement layer to the decoding end.
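The occupancy check described above might look like the following sketch. The `EncodedFrame` record, the `frames_to_send` function and the byte-based occupancy measure are assumptions made for illustration; the text only specifies a comparison against a data volume threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EncodedFrame:
    number: int         # frame number in display order
    base_layer: bool    # True if encoded from a base-layer video frame
    size: int           # encoded size in bytes (assumed occupancy unit)
    ref_distance: int   # reference distance sent along with the frame


def frames_to_send(buffer: List[EncodedFrame], threshold: int) -> List[EncodedFrame]:
    """Select which buffered frames to transmit.

    Above the threshold only base-layer (target) frames are sent;
    at or below it every buffered frame is sent.
    """
    occupancy = sum(frame.size for frame in buffer)
    if occupancy > threshold:
        return [frame for frame in buffer if frame.base_layer]
    return list(buffer)


send_buffer = [
    EncodedFrame(22, False, 100, 1),
    EncodedFrame(23, True, 100, 2),
    EncodedFrame(24, False, 100, 1),
    EncodedFrame(25, True, 100, 2),
]
print([f.number for f in frames_to_send(send_buffer, threshold=300)])
print([f.number for f in frames_to_send(send_buffer, threshold=400)])
```

With 400 bytes buffered, a threshold of 300 keeps only base-layer frames 23 and 25, while a threshold of 400 lets all four frames through.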
The encoding end may also judge the current network transmission capability between the encoding end and the decoding end by checking whether the total time span of the encoded video frames accumulated in the sending buffer is greater than a time threshold $T_{drop}$. When the total time span is greater than the time threshold, the current network transmission capability between the encoding end and the decoding end is insufficient. In that case the encoding end sends an encoded video frame to the decoding end only when it determines that the frame is a target encoded video frame, that is, one obtained by encoding a video frame in the base layer. When the total time span is less than or equal to the time threshold, the current network transmission capability is sufficient, and the encoding end sends all encoded video frames to the decoding end.
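The time-based variant can be sketched in the same way. This sketch assumes that the "total time span" is the difference between the newest and oldest timestamps of the buffered frames, and it represents each buffered frame as a hypothetical `(timestamp, frame_number, base_layer)` tuple; neither detail is fixed by the text.

```python
from typing import List, Tuple

BufferedFrame = Tuple[float, int, bool]  # (timestamp in seconds, frame number, base layer?)


def select_by_time_span(buffer: List[BufferedFrame], t_drop: float) -> List[int]:
    """Return the frame numbers to transmit, using the time threshold t_drop."""
    if not buffer:
        return []
    timestamps = [timestamp for (timestamp, _number, _base) in buffer]
    span = max(timestamps) - min(timestamps)  # assumed definition of the time span
    if span > t_drop:
        # Insufficient capacity: send only base-layer (target) frames.
        return [number for (_ts, number, base) in buffer if base]
    return [number for (_ts, number, _base) in buffer]


pending = [(0.00, 22, False), (0.04, 23, True), (0.08, 24, False), (0.12, 25, True)]
print(select_by_time_span(pending, t_drop=0.10))
print(select_by_time_span(pending, t_drop=0.20))
```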
In the embodiments of the present application, the encoding end may determine the number of encoded video frames to send according to the current network transmission capability between the encoding end and the decoding end. When the network transmission capability is poor, fewer encoded video frames are transmitted between the two ends while the decoding end can still decode continuously. This lowers the probability of frame loss caused by insufficient network transmission capability, which in turn lowers the probability that some encoded video frames cannot be decoded correctly because of lost frames, improves the efficiency of correct decoding, and reduces the probability of playback stuttering and similar problems at the decoding end.
The order of the steps of the video encoding and decoding method provided by the embodiments of the present application may be adjusted as appropriate, and steps may be added or removed as the situation requires. For example, step 905 may be placed before any of steps 907 to 912, as long as, after obtaining the encoded video frame, the encoding end updates the reference frame set based on the video frame corresponding to that encoded video frame, so that the reference frame set used for the next video frame to be encoded is the updated reference frame set. The following are apparatus embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the method embodiments.
In summary, the video encoding and decoding method provided by the embodiments of the present application obtains a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set. According to the frame loss rate and a video coding rule, the target reference frame corresponding to the video frame to be encoded is determined from the reference frame set. The distance in display order between the video frame to be encoded and the target reference frame is determined as the reference distance of the video frame to be encoded. The video frame to be encoded is encoded using the target reference frame to obtain an encoded video frame, and the encoded video frame and the reference distance are sent to the decoding end. The reference frame set includes the video frame corresponding to at least one encoded video frame, and the video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is the reliable frame. The video coding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is the reliable frame. Therefore, when the network state between the video sending end and the video receiving end is poor and video frames are lost during transmission, that is, when the frame loss rate is greater than 0, more of the video frames to be encoded are encoded with the reliable frame as the frame loss rate grows. This lowers the probability that some encoded video frames cannot be decoded correctly because of lost frames, improves the efficiency of correct decoding, and in turn reduces the probability of playback stuttering and similar problems at the decoding end.
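The selection step summarized above can be illustrated with a sketch for base-layer frames. It follows the three coding sub-rules recited in the claims, but the interval boundaries (`low`, `high`), the spacing `interval`, the use of `frame_no % interval` to decide which frames fall on the spacing, and the measurement of reference distance as a difference of frame numbers are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def pick_sub_rule(p_loss: float, low: float = 0.05, high: float = 0.20) -> int:
    """Map a frame loss rate to one of three sub-rules (thresholds are hypothetical)."""
    if p_loss <= low:
        return 1
    if p_loss <= high:
        return 2
    return 3


def choose_reference(frame_no: int,
                     ref_set: Iterable[int],
                     reliable_no: int,
                     p_loss: float,
                     interval: int = 4) -> Tuple[int, int]:
    """Return (target reference frame number, reference distance)."""
    # Closest earlier frame in the reference frame set.
    nearest = max(n for n in ref_set if n < frame_no)
    rule = pick_sub_rule(p_loss)
    if rule == 1:
        target = nearest                      # first sub-rule: closest frame
    elif rule == 2:
        # Second sub-rule: the reliable frame at every `interval`-th frame.
        target = reliable_no if frame_no % interval == 0 else nearest
    else:
        target = reliable_no                  # third sub-rule: always reliable
    return target, frame_no - target


refs = [17, 19, 21, 23]
print(choose_reference(25, refs, reliable_no=21, p_loss=0.0))
print(choose_reference(24, refs, reliable_no=21, p_loss=0.1))
print(choose_reference(25, refs, reliable_no=21, p_loss=0.5))
```

The higher the loss rate, the more often the returned target is the reliable frame 21 rather than the closest frame 23, at the cost of a larger reference distance.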
A video encoding and decoding apparatus provided by the embodiments of the present application can execute the video encoding and decoding method, provided by any embodiment of the present application, applied to any one of a plurality of microservice nodes of a server, and has the functional modules and effects corresponding to executing the video encoding and decoding method applied to the client.
Fig. 17 is a flowchart of a video encoding and decoding apparatus according to an exemplary embodiment, the apparatus being applied to an encoding end. As shown in Fig. 17, the video encoding and decoding apparatus 1700 includes: an acquisition module 1701, a determination module 1702, an encoding module 1703 and a sending module 1704.
The acquisition module 1701 is configured to acquire a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set, where the reference frame set includes the video frame corresponding to at least one encoded video frame, and the video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is the reliable frame. The determination module 1702 is configured to determine, from the reference frame set and according to the frame loss rate and a video coding rule, the target reference frame corresponding to the video frame to be encoded, where the video coding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is the reliable frame; it is further configured to determine the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded. The encoding module 1703 is configured to encode the video frame to be encoded using the target reference frame to obtain an encoded video frame. The sending module 1704 is configured to send the encoded video frame and the reference distance to the decoding end.
In summary, in the video encoding and decoding apparatus provided by the embodiments of the present application, the acquisition module acquires a non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and a reference frame set. The determination module determines, from the reference frame set and according to the frame loss rate and the video coding rule, the target reference frame corresponding to the video frame to be encoded, and determines the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded. The encoding module encodes the video frame to be encoded using the target reference frame to obtain an encoded video frame, and the sending module sends the encoded video frame and the reference distance to the decoding end. The reference frame set includes the video frame corresponding to at least one encoded video frame, and the video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is the reliable frame. The video coding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is the reliable frame. Therefore, when the network state between the video sending end and the video receiving end is poor and video frames are lost during transmission, that is, when the frame loss rate is greater than 0, more of the video frames to be encoded are encoded with the reliable frame as the frame loss rate grows. This lowers the probability that some encoded video frames cannot be decoded correctly because of lost frames, improves the efficiency of correct decoding, and in turn reduces the probability of playback stuttering and similar problems at the decoding end.
A video encoding and decoding apparatus provided by the embodiments of the present application can execute the video encoding and decoding method, provided by any embodiment of the present application, applied to a microservice management device, and has the functional modules and effects corresponding to executing the video encoding and decoding method applied to the microservice management device.
Fig. 18 is a flowchart of a video encoding and decoding apparatus according to an exemplary embodiment, the apparatus being applied to a decoding end. As shown in Fig. 18, the video encoding and decoding apparatus 1800 includes: a receiving module 1801, an acquisition module 1802 and a decoding module 1803.
The receiving module 1801 is configured to receive the encoded video frame and the reference distance sent by the encoding end by means of any of the video encoding and decoding apparatuses provided by the embodiments of the present application. The acquisition module 1802 is configured to acquire a decoded frame set, where the decoded frame set includes at least one decoded video frame and the number of video frames included in the decoded frame set is greater than or equal to the number of video frames included in the reference frame set of the decoding end; it is further configured to acquire the target reference frame corresponding to the encoded video frame when it is determined, according to the reference distance, that the decoded frame set includes that target reference frame. The decoding module 1803 is configured to decode the encoded video frame using the target reference frame to obtain a decoded video frame.
In the video encoding and decoding apparatus provided by the embodiments of the present application, the receiving module receives the encoded video frame and the reference distance generated by the encoding end according to a video encoding and decoding method provided by the embodiments of the present application. The acquisition module acquires a decoded frame set that includes at least one decoded video frame, so that the target reference frame corresponding to the encoded video frame can be obtained from the decoded frame set according to the reference distance. The decoding module then decodes the encoded video frame using the target reference frame to obtain a decoded video frame. Because the encoded video frame is generated by the encoding end according to a video encoding and decoding method provided by the embodiments of the present application, when the network state between the video sending end and the video receiving end is poor and video frames are lost during transmission, more of the encoded video frames are encoded with the reliable frame, which corresponds to an encoded video frame the decoding end can successfully decode, as the frame loss rate grows. This lowers the probability that some encoded video frames cannot be decoded correctly because of lost frames, improves the efficiency of correct decoding, and in turn reduces the probability of playback stuttering and similar problems at the decoding end.
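On the decoding side, locating the target reference frame from the reference distance can be sketched as below. The sketch assumes the decoded frame set is keyed by frame number and that the reference distance is a difference of frame numbers; the function name and the string stand-ins for decoded pictures are hypothetical.

```python
from typing import Dict, Optional


def find_reference(decoded_set: Dict[int, str],
                   frame_no: int,
                   ref_distance: int) -> Optional[str]:
    """Look up the target reference frame for an incoming encoded frame.

    Returns None when the referenced frame is not in the decoded frame set,
    in which case the incoming frame cannot be decoded from it.
    """
    return decoded_set.get(frame_no - ref_distance)


decoded = {21: "picture-21", 23: "picture-23"}  # stand-ins for decoded pictures
print(find_reference(decoded, frame_no=25, ref_distance=2))
print(find_reference(decoded, frame_no=25, ref_distance=3))
```

The first lookup resolves to frame 23, which is in the set; the second points at frame 22, which is not, so no reference is returned.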
Fig. 19 is a block diagram of an electronic device provided by an embodiment of the present application. The electronic device includes a processor 1901, a memory 1902, and a computer program stored in the memory 1902 and executable on the processor 1901; when executed by the processor 1901, the computer program implements the video encoding and decoding method of any of the above embodiments.
The embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the processes of the above video encoding and decoding method embodiments and achieves the same technical effects; to avoid repetition, they are not described again here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. The computer-readable storage medium may be a non-transitory storage medium.

Claims (18)

  1. A video encoding and decoding method, applied to an encoding end, comprising:
    acquiring a non-first video frame to be encoded, a transmission frame loss rate between the encoding end and a decoding end, and a reference frame set, wherein the reference frame set includes a video frame corresponding to at least one encoded video frame, and the video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is a reliable frame;
    determining, from the reference frame set and according to the frame loss rate and a video coding rule, a target reference frame corresponding to the video frame to be encoded, wherein the video coding rule includes: the larger the frame loss rate, the larger the number of video frames in the video to be encoded whose target reference frame is the reliable frame;
    determining the distance in display order between the video frame to be encoded and the target reference frame as a reference distance of the video frame to be encoded;
    encoding the video frame to be encoded using the target reference frame to obtain an encoded video frame;
    sending the encoded video frame and the reference distance to the decoding end.
  2. The method according to claim 1, wherein the video to be encoded consists of a plurality of video frames, the plurality of video frames include one base layer and at least one enhancement layer, the plurality of video frames have frame numbers assigned in display order, and the video frames included in the reference frame set are video frames in the base layer;
    the method further comprises:
    when the encoded video frame is obtained by encoding a video frame in the base layer, adding the video frame corresponding to the encoded video frame to the reference frame set to obtain a new reference frame set;
    receiving decoding feedback information sent by the decoding end, wherein the decoding feedback information includes a frame number and a loss flag, and the loss flag indicates whether the decoding end has successfully received and decoded the encoded video frame corresponding to the frame number;
    updating the reliable frame in the new reference frame set based on the decoding feedback information.
  3. The method according to claim 2, wherein updating the reliable frame in the new reference frame set based on the decoding feedback information comprises:
    among all video frames included in the new reference frame set, selecting as the reliable frame the video frame whose loss flag is in the not-lost state, whose frame number is the largest, whose target reference frame is a reliable frame, and which is in the base layer, wherein a loss flag in the not-lost state indicates that the decoding end has decoded the encoded video frame corresponding to the frame number.
  4. The method according to claim 2, wherein adding the video frame corresponding to the encoded video frame to the reference frame set to obtain a new reference frame set comprises:
    when the number of video frames currently included in the reference frame set is less than a first target number, adding the video frame corresponding to the encoded video frame to the reference frame set to obtain the new reference frame set, wherein the first target number is the maximum number of video frames the reference frame set can include;
    when the number of video frames currently included in the reference frame set equals the first target number, deleting the video frame with the smallest frame number among all video frames included in the reference frame set, and adding the video frame corresponding to the encoded video frame to the reference frame set to obtain the new reference frame set.
  5. The method according to claim 1, further comprising, before acquiring the non-first video frame to be encoded, the transmission frame loss rate between the encoding end and the decoding end, and the reference frame set:
    acquiring a first video frame to be encoded;
    encoding the first video frame to be encoded using intra-frame prediction to obtain a first encoded video frame;
    adding the video frame corresponding to the first encoded video frame to the reference frame set, and selecting the video frame corresponding to the first encoded video frame as the reliable frame;
    sending the first encoded video frame to the decoding end.
  6. The method according to claim 2, wherein acquiring the transmission frame loss rate between the encoding end and the decoding end comprises:
    determining the transmission frame loss rate according to the loss flags received within a set time period.
  7. The method according to claim 2, wherein there are a plurality of frame loss rate intervals, the video coding rule includes coding sub-rules in one-to-one correspondence with the plurality of different frame loss rate intervals, and when target reference frames are determined for each video frame of the same video to be encoded according to different coding sub-rules, the number of video frames whose target reference frame is the reliable frame differs;
    determining, from the reference frame set and according to the frame loss rate and the video coding rule, the target reference frame corresponding to the video frame to be encoded comprises:
    determining a corresponding target coding sub-rule according to the target frame loss rate interval to which the frame loss rate belongs;
    determining the target reference frame corresponding to the video frame to be encoded according to the target coding sub-rule.
  8. The method according to claim 7, wherein the video coding rule includes a first coding sub-rule, a second coding sub-rule and a third coding sub-rule,
    for a video frame to be encoded that is in the base layer, the first coding sub-rule is used to take the video frame in the reference frame set whose frame number is closest to that of the video frame to be encoded as the target reference frame corresponding to the video frame to be encoded;
    the second coding sub-rule is used to take the reliable frame as the target reference frame corresponding to the video frames to be encoded that occur at every set number of frames among all video frames to be encoded;
    the third coding sub-rule is used to take the reliable frame as the target reference frame corresponding to each frame to be encoded.
  9. The method according to claim 2, further comprising, before sending the encoded video frame and the reference distance to the decoding end:
    writing the encoded video frame into a sending buffer;
    wherein sending the encoded video frame and the reference distance to the decoding end comprises:
    when the occupancy of the sending buffer is greater than a data volume threshold and the encoded video frame is a target encoded video frame, sending the encoded video frame to the decoding end, wherein the target encoded video frame is obtained by encoding a video frame in the base layer; or,
    when the occupancy of the sending buffer is less than or equal to the data volume threshold, sending the encoded video frame to the decoding end.
  10. A video encoding and decoding method, applied to a decoding end, comprising:
    receiving an encoded video frame and a reference distance sent by an encoding end according to the video encoding and decoding method of any one of claims 1 to 9;
    acquiring a decoded frame set, wherein the decoded frame set comprises at least one decoded video frame, and the number of video frames in the decoded frame set is greater than or equal to the number of video frames in the reference frame set of the decoding end;
    in a case where it is determined, according to the reference distance, that the decoded frame set includes the target reference frame corresponding to the encoded video frame, acquiring the target reference frame corresponding to the encoded video frame;
    decoding the encoded video frame by using the target reference frame to obtain a decoded video frame.
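The lookup step of claim 10 can be sketched as below. This is a minimal illustration under two assumptions that are not stated in the claim itself: that the reference distance equals the difference between frame sequence numbers, and that the target reference frame precedes the received frame. The function name `find_target_reference` and the dictionary representation of the decoded frame set are hypothetical.

```python
from typing import Dict, Optional


def find_target_reference(decoded_frames: Dict[int, bytes],
                          seq: int,
                          ref_distance: int) -> Optional[bytes]:
    """Look up the target reference frame of the received frame.

    decoded_frames maps a frame sequence number to its decoded picture.
    Assumption: the reference frame's sequence number is seq - ref_distance.
    Returns None when the decoded frame set does not hold that frame, in
    which case the received frame cannot be decoded from this set.
    """
    target_seq = seq - ref_distance
    return decoded_frames.get(target_seq)
```

A decoder built this way would decode the received frame only when the lookup returns a frame. What it does otherwise is outside the scope of claim 10.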
  11. The method according to claim 10, further comprising:
    sending decoding feedback information to the encoding end, wherein the decoding feedback information comprises a frame sequence number and a loss flag, and the loss flag indicates whether the decoding end has successfully received and decoded the encoded video frame corresponding to the frame sequence number.
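A feedback message with the two fields named in claim 11 might look like the sketch below. The `DecodeFeedback` type and the `reliable_frames` helper are hypothetical. Using the feedback to mark frames of the encoder's reference frame set as reliable is an assumption made here for illustration, based on the definition of a reliable frame as one the decoding end can successfully decode.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class DecodeFeedback:
    """Hypothetical feedback message: frame sequence number plus loss flag."""
    seq: int    # frame sequence number the feedback refers to
    lost: bool  # True if the frame was not received and decoded successfully


def reliable_frames(reference_set: Iterable[int],
                    feedback: Iterable[DecodeFeedback]) -> Set[int]:
    """Return the sequence numbers in reference_set confirmed as decoded.

    Assumption: a frame counts as reliable only after feedback with
    lost=False has arrived for it; frames without feedback are excluded.
    """
    confirmed = {f.seq for f in feedback if not f.lost}
    return {seq for seq in reference_set if seq in confirmed}
```

Treating frames without feedback as unreliable is the conservative choice; an implementation could equally apply a timeout or another policy.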
  12. The method according to claim 10, wherein the video to be encoded consists of a plurality of video frames, the plurality of video frames comprise one base layer and at least one enhancement layer, each of the plurality of video frames has a frame sequence number assigned according to display order, and the video frames in the decoded frame set are video frames in the base layer;
    the method further comprises:
    in a case where the decoded video frame is a video frame in the base layer, adding the decoded video frame to the decoded frame set to obtain a new decoded frame set.
  13. The method according to claim 12, wherein the adding of the decoded video frame to the decoded frame set to obtain a new decoded frame set comprises:
    in a case where the number of video frames currently in the decoded frame set is less than a second target number, adding the decoded video frame to the decoded frame set to obtain a new decoded frame set, wherein the second target number is the maximum number of video frames that the decoded frame set can contain;
    in a case where the number of video frames currently in the decoded frame set equals the second target number, deleting the video frame with the smallest frame sequence number among all video frames in the decoded frame set, and adding the decoded video frame to the decoded frame set to obtain a new decoded frame set.
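The update rule of claims 12 and 13 amounts to a bounded set that evicts the frame with the smallest sequence number when full. The following is a minimal sketch; the class name `DecodedFrameSet` and its attributes are hypothetical, and overwriting an entry whose sequence number is already present is a choice made here, not something the claims address.

```python
from typing import Dict


class DecodedFrameSet:
    """Bounded store of decoded base-layer frames, keyed by sequence number."""

    def __init__(self, max_frames: int) -> None:
        if max_frames < 1:
            raise ValueError("max_frames must be at least 1")
        self.max_frames = max_frames        # the "second target number"
        self.frames: Dict[int, bytes] = {}  # sequence number -> decoded frame

    def add(self, seq: int, frame: bytes, is_base_layer: bool) -> None:
        """Add a decoded frame following the two branches of claim 13."""
        if not is_base_layer:
            # Claim 12: only base-layer frames enter the decoded frame set.
            return
        if seq not in self.frames and len(self.frames) >= self.max_frames:
            # Set is full: delete the frame with the smallest sequence number.
            del self.frames[min(self.frames)]
        self.frames[seq] = frame
```

With `max_frames=2`, adding base-layer frames 1, 3 and 5 in turn leaves frames 3 and 5 in the set.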
  14. The method according to claim 10, further comprising:
    receiving a first encoded video frame, wherein the first encoded video frame is obtained by encoding the first video frame to be encoded using intra-frame prediction;
    decoding the first encoded video frame using intra-frame prediction to obtain a first decoded video frame;
    adding the first decoded video frame to the decoded frame set.
  15. A video encoding and decoding apparatus, applied to an encoding end, comprising:
    an acquisition module, configured to acquire a non-first video frame to be encoded, a transmission frame loss rate between the encoding end and a decoding end, and a reference frame set, wherein the reference frame set comprises a video frame corresponding to at least one encoded video frame, and a video frame in the reference frame set that corresponds to an encoded video frame the decoding end can successfully decode is a reliable frame;
    a determination module, configured to determine, from the reference frame set, a target reference frame corresponding to the video frame to be encoded according to the frame loss rate and a video encoding rule, wherein the video encoding rule comprises: the higher the frame loss rate, the greater the number of video frames in the video to be encoded whose target reference frame is a reliable frame; and further configured to determine the distance in display order between the video frame to be encoded and the target reference frame as the reference distance of the video frame to be encoded;
    an encoding module, configured to encode the video frame to be encoded by using the target reference frame to obtain an encoded video frame;
    a sending module, configured to send the encoded video frame and the reference distance to the decoding end.
  16. A video encoding and decoding apparatus, applied to a decoding end, comprising:
    a receiving module, configured to receive an encoded video frame and a reference distance sent by an encoding end according to the video encoding and decoding method of any one of claims 1 to 9;
    an acquisition module, configured to acquire a decoded frame set, wherein the decoded frame set comprises at least one decoded video frame, and the number of video frames in the decoded frame set is greater than or equal to the number of video frames in the reference frame set of the decoding end;
    a determination module, configured to acquire the target reference frame corresponding to the encoded video frame in a case where it is determined, according to the reference distance, that the decoded frame set includes the target reference frame corresponding to the encoded video frame;
    a decoding module, configured to decode the encoded video frame by using the target reference frame to obtain a decoded video frame.
  17. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the video encoding and decoding method according to any one of claims 1 to 9, or implements the video encoding and decoding method according to any one of claims 10 to 14.
  18. A computer-readable storage medium, configured to store a computer program, wherein the computer program, when executed by a processor, implements the video encoding and decoding method according to any one of claims 1 to 9, or implements the video encoding and decoding method according to any one of claims 10 to 14.
PCT/CN2022/097097 2021-06-16 2022-06-06 Video coding and decoding method and apparatus WO2022262602A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110667857.9 2021-06-16
CN202110667857.9A CN113573063A (en) 2021-06-16 2021-06-16 Video coding and decoding method and device

Publications (1)

Publication Number Publication Date
WO2022262602A1 (en) 2022-12-22

Family

ID=78162118

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/097097 WO2022262602A1 (en) 2021-06-16 2022-06-06 Video coding and decoding method and apparatus

Country Status (2)

Country Link
CN (1) CN113573063A (en)
WO (1) WO2022262602A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573063A (en) * 2021-06-16 2021-10-29 百果园技术(新加坡)有限公司 Video coding and decoding method and device
WO2023143331A1 (en) * 2022-01-25 2023-08-03 阿里巴巴(中国)有限公司 Facial video encoding method, facial video decoding method, and apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6111915A (en) * 1997-03-24 2000-08-29 Oki Electric Industry Co., Ltd. Picture decoder
US20120219067A1 (en) * 2011-02-24 2012-08-30 Andrei Jefremov Transmitting A Video Signal
CN106817585A (en) * 2015-12-02 2017-06-09 掌赢信息科技(上海)有限公司 A kind of method for video coding of utilization long term reference frame, electronic equipment and system
CN107734332A (en) * 2016-07-06 2018-02-23 上海兆言网络科技有限公司 Reference frame management method and apparatus for video communication
CN110392284A (en) * 2019-07-29 2019-10-29 腾讯科技(深圳)有限公司 Video coding, video data handling procedure, device, computer equipment and storage medium
CN110691212A (en) * 2018-07-04 2020-01-14 阿里巴巴集团控股有限公司 Method and system for coding and decoding data
CN113573063A (en) * 2021-06-16 2021-10-29 百果园技术(新加坡)有限公司 Video coding and decoding method and device

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3323057B2 (en) * 1996-04-10 2002-09-09 沖電気工業株式会社 Encoding device, decoding device, and transmission system
JP4010270B2 (en) * 2003-04-01 2007-11-21 日本ビクター株式会社 Image coding and transmission device
JP4659838B2 (en) * 2005-01-10 2011-03-30 株式会社エヌ・ティ・ティ・ドコモ Device for predictively coding a sequence of frames
CN101237587A (en) * 2007-02-02 2008-08-06 中兴通讯股份有限公司 A video sequence coding method and its error control system
CN101360243A (en) * 2008-09-24 2009-02-04 腾讯科技(深圳)有限公司 Video communication system and method based on feedback reference frame
CN102014286B (en) * 2010-12-21 2012-10-31 广东威创视讯科技股份有限公司 Video coding and decoding method and device
US20170094294A1 (en) * 2015-09-28 2017-03-30 Cybrook Inc. Video encoding and decoding with back channel message management
CN107113441B (en) * 2016-12-30 2019-07-26 深圳市大疆创新科技有限公司 Image processing method, device, unmanned vehicle and receiving end
CN110166776B (en) * 2018-02-11 2023-08-04 腾讯科技(深圳)有限公司 Video encoding method, device and storage medium
CN111713107A (en) * 2019-06-28 2020-09-25 深圳市大疆创新科技有限公司 Image processing method and device, unmanned aerial vehicle and receiving end
CN112532908B (en) * 2019-09-19 2022-07-19 华为技术有限公司 Video image transmission method, sending equipment, video call method and equipment
CN112929747B (en) * 2021-01-18 2023-03-31 北京洛塔信息技术有限公司 Video coding method, device and equipment based on network feedback and storage medium

Also Published As

Publication number Publication date
CN113573063A (en) 2021-10-29

Similar Documents

Publication Publication Date Title
WO2022262602A1 (en) Video coding and decoding method and apparatus
KR102058759B1 (en) Signaling of state information for a decoded picture buffer and reference picture lists
US20220224360A1 (en) Method and Device for Transmitting a Data Stream with Selectable Ratio of Error Correction Packets to Data Packets
US10652580B2 (en) Video data processing method and apparatus
WO2016131223A1 (en) Frame loss method for video frame and video sending apparatus
KR101944565B1 (en) Reducing latency in video encoding and decoding
US20090103635A1 (en) System and method of unequal error protection with hybrid arq/fec for video streaming over wireless local area networks
CN104969560A (en) Determining available media data for network streaming
CN110392284B (en) Video encoding method, video data processing method, video encoding apparatus, video data processing apparatus, computer device, and storage medium
US8189492B2 (en) Error recovery in an audio-video multipoint control component
CN107566918A (en) A kind of low delay under video distribution scene takes the neutrel extraction of root
US20090052531A1 (en) Video coding
US20100125768A1 (en) Error resilience in video communication by retransmission of packets of designated reference frames
US8411743B2 (en) Encoding/decoding system using feedback
WO2023142716A1 (en) Encoding method and apparatus, real-time communication method and apparatus, device, and storage medium
WO2021052500A1 (en) Video image transmission method, sending device, and video call method and device
CN111093083A (en) Data transmission method and device
CN112866746A (en) Multi-path streaming cloud game control method, device, equipment and storage medium
US20120106632A1 (en) Method and apparatus for error resilient long term referencing block refresh
US11265583B2 (en) Long-term reference for error recovery in video conferencing system
JP2005033556A (en) Data transmitter, data transmitting method, data receiver, data receiving method
WO2023071469A1 (en) Video processing method, electronic device and storage medium
CN111279694A (en) GDR code stream encoding method, terminal device and machine readable storage medium
CN111654724B (en) Low-bit-rate coding transmission method of video conference system
CN109889917A (en) A kind of video transmission method based on caching coding

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22824082

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE