WO2022152137A1 - Video encoding method, apparatus, device, and storage medium based on network feedback - Google Patents

Video encoding method, apparatus, device, and storage medium based on network feedback

Publication number
WO2022152137A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
encoded
image
network
information
Application number
PCT/CN2022/071471
Inventor
张凯明
Original Assignee
百果园技术(新加坡)有限公司
张凯明
Application filed by 百果园技术(新加坡)有限公司, 张凯明
Publication of WO2022152137A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218: Reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/440281: Reformatting operations of video signals by altering the temporal resolution, e.g. by frame skipping
    • H04N21/442: Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44227: Monitoring of local network, e.g. connection or bandwidth variations; Detecting new devices in the local network
    • H04N21/45: Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462: Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4621: Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone

Definitions

  • The embodiments of the present application relate to the field of computer technologies, and in particular to a video encoding method, apparatus, device, and storage medium based on network feedback.
  • Video frames are generally encoded using a multi-reference-frame strategy.
  • When encoding the current frame, the encoder traverses the reference frame list in turn and selects the frame most similar to the current frame as the reference frame.
  • When a frame is lost in transmission, this easily causes continuous frame loss within the image group at the decoding end, which degrades video transmission quality.
  • Embodiments of the present application provide a video encoding method, apparatus, device, and storage medium based on network feedback, so as to improve video transmission quality.
  • In a first aspect, an embodiment of the present application provides a video encoding method based on network feedback, including:
  • determining, according to the decoding feedback information provided by the decoding end, a referable image frame (one that can be used as a reference) among the encoded image frames;
  • An embodiment of the present application provides a video encoding apparatus based on network feedback, including a network detection module, a decoding feedback module, a picture group division module, and a reference frame determination module, wherein:
  • the network detection module is configured to determine the current network state information according to the real-time bit rate information provided by the network transport layer;
  • the decoding feedback module is configured to determine, according to the decoding feedback information provided by the decoding end, a referable image frame among the encoded image frames;
  • the picture group division module is configured to determine the image group size based on the network state information and to divide the frame sequence to be encoded into a plurality of tiny image groups according to the image group size, the image group size being positively correlated with the network state information;
  • the encoding execution module is configured to determine the to-be-encoded reference frame of the to-be-encoded image frame from the referable image frames according to the position of the to-be-encoded image frame in the tiny image group, and to encode the to-be-encoded image frame based on the to-be-encoded reference frame.
  • An embodiment of the present application provides a video encoding device based on network feedback, including a memory and one or more processors;
  • the memory is configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the video encoding method based on network feedback as described in the first aspect.
  • Embodiments of the present application provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the video encoding method based on network feedback as described in the first aspect.
  • An embodiment of the present application provides a program for video encoding based on network feedback.
  • When the program is executed, operations related to the video encoding method based on network feedback as described in the first aspect can be implemented.
  • The image group size is adjusted in real time according to the network state information, and the frame sequence to be encoded is divided into a plurality of tiny image groups based on that size.
  • When an image frame is to be encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the frame in its tiny image group, and the frame is encoded based on that reference frame.
  • Network feedback is thus used to dynamically adjust both the size of the tiny image groups and the selection of reference frames during encoding, which achieves a dynamic balance between video quality and fluency and effectively improves video transmission quality.
  • FIG. 1 is a flowchart of a video encoding method based on network feedback provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of an application scenario based on a video call provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of another video encoding method based on network feedback provided by an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a video encoding apparatus based on network feedback provided by an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a video encoding device based on network feedback provided by an embodiment of the present application.
  • FIG. 1 shows a flowchart of a video encoding method based on network feedback provided by an embodiment of the present application.
  • The video encoding method based on network feedback provided by an embodiment of the present application may be performed by a video encoding apparatus based on network feedback.
  • The video encoding apparatus based on network feedback can be implemented in hardware and/or software and integrated in the video encoding device based on network feedback.
  • The following description takes as an example a video encoding device based on network feedback performing the video encoding method based on network feedback.
  • The video encoding method based on network feedback can be applied to a video encoding device. With reference to FIG. 1, the method includes:
  • S101 Determine current network status information according to the real-time bit rate information provided by the network transport layer.
  • The video encoding device based on network feedback provided in this embodiment (hereinafter, the video encoding device) serves as the encoding end. After encoding a to-be-encoded image frame, it sends the resulting encoded image frame to the decoding end over the network. The decoding end decodes the encoded image frames into decoded image frames and plays them in sequence, realizing transmission and playback of the video.
  • The encoding end and the decoding end provided in this embodiment may be computer devices with a real-time video call function, such as smartphones.
  • The mobile terminal that captures video image information acts as the encoding end, encoding and sending that information; the mobile terminal that receives and plays the video image information acts as the decoding end, decoding and playing what it receives.
  • The network transport layer of the encoding end is used to obtain the current real-time bit rate information, from which the current network state information is determined. The higher the real-time bit rate reflected by the real-time bit rate information, the better the network state reflected by the corresponding network state information.
  • S102 According to the decoding feedback information provided by the decoding end, determine a referable image frame among the encoded image frames.
  • After receiving an encoded image frame and successfully decoding it, the decoding end returns decoding feedback information to the video encoding device provided in this embodiment.
  • The successfully decoded encoded image frame is identified from the decoding feedback information returned by the decoding end and is determined as a referable image frame.
  • S103 Determine the image group size based on the network state information, and divide the frame sequence to be encoded into multiple tiny image groups according to the image group size, where the image group size is positively correlated with the network state information.
  • The image group size corresponding to the current network state information is determined according to the correspondence between network state information and image group size.
  • The image group size is positively correlated with the network state information: the better the network state reflected by the network state information, the larger the corresponding image group size, and the worse the network state, the smaller the image group size.
  • The frame sequence to be encoded is divided into a plurality of tiny image groups according to the image group size determined above, so that the size of each tiny image group corresponds to that image group size.
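The positive correlation and the division step described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the step table `SIZE_STEPS`, its thresholds, and both function names are hypothetical, and a fixed size is applied to the whole sequence, whereas the text adjusts the size in real time.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical (minimum average kbps, image group size) steps, sorted ascending.
# The source only requires that a better network state maps to a larger size.
SIZE_STEPS: Tuple[Tuple[float, int], ...] = ((0.0, 2), (200.0, 5), (800.0, 15))


def group_size_for(avg_kbps: float) -> int:
    """Return the size of the highest step whose threshold is reached."""
    size = SIZE_STEPS[0][1]
    for min_kbps, step_size in SIZE_STEPS:
        if avg_kbps >= min_kbps:
            size = step_size
    return size


def split_into_tiny_groups(frames: Sequence[T], group_size: int) -> List[List[T]]:
    """Cut the frame sequence into consecutive groups of at most group_size frames."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [list(frames[i:i + group_size]) for i in range(0, len(frames), group_size)]


if __name__ == "__main__":
    frames = list(range(7))
    print(split_into_tiny_groups(frames, group_size_for(100.0)))
```

A step table is used here only because it makes the monotonic relation easy to see; any non-decreasing mapping would satisfy the positive correlation stated in S103.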
  • S104 Determine the to-be-encoded reference frame of the to-be-encoded image frame from the referable image frames according to the position of the to-be-encoded image frame in the tiny image group, and encode the to-be-encoded image frame based on the to-be-encoded reference frame.
  • The position, within its tiny image group, of the image frame that currently needs to be encoded is determined, and the to-be-encoded reference frame is determined according to that position.
  • Different positions may use different methods of determining the to-be-encoded reference frame. For example, when the to-be-encoded image frame is the first frame of a tiny image group, the nearest referable image frame among those confirmed as successfully decoded in the preceding tiny image groups may be selected as the to-be-encoded reference frame.
  • The to-be-encoded image frame is encoded based on the to-be-encoded reference frame to obtain an encoded image frame, which can then be sent to the decoding end over the network for decoding and playback.
  • FIG. 2 is a schematic diagram of an application scenario based on a video call provided by an embodiment of the present application.
  • The video call application scenario involves an encoding end, a decoding end, and a server.
  • A video call connection is established between the encoding end and the decoding end through the server.
  • The encoding end starts its video acquisition module, collects image information, and generates the corresponding frame sequence to be encoded in real time.
  • The encoding end divides the frame sequence to be encoded in real time into tiny image groups of varying sizes, determines the to-be-encoded reference frame according to the position of each to-be-encoded image frame in its tiny image group, and encodes the frame based on that reference frame to obtain an encoded image frame, which is sent to the decoding end as a code stream through the server.
  • The decoding end decodes the received encoded image frame and, once decoding succeeds, plays it and feeds the corresponding decoding feedback information back to the encoding end.
  • The encoding end uses the feedback of the network state information to dynamically adjust the selection of reference frames during encoding.
  • When the network state is good, a longer tiny image group is established, so that the frames inside it depend on one another sequentially, which improves video quality.
  • When the network state is poor, a short tiny image group is established. Once a frame is lost, the end of the group is reached quickly, a referable image frame is found again, and a new sequential dependency begins, which avoids long runs of consecutive lost frames.
  • As described above, the to-be-encoded frame sequence is divided into a plurality of tiny image groups, and referable image frames are determined according to the decoding feedback information returned by the decoding end. When an image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the frame in its tiny image group, and the frame is encoded based on that reference frame.
  • Network feedback is used to dynamically adjust the size of the tiny image groups and the selection of reference frames during encoding, achieving a dynamic balance between video quality and fluency and effectively improving video transmission quality.
  • FIG. 3 shows a flowchart of another video coding method based on network feedback provided by the embodiment of the present application.
  • This method is a more specific form of the video encoding method based on network feedback described above.
  • The video encoding method based on network feedback includes:
  • S201 Obtain, in real time, the real-time bit rate information provided by the network transport layer, calculate average bit rate information based on the real-time bit rate information, and determine the current network state information based on the average bit rate information.
  • The real-time bit rate information provided by the network transport layer of the video encoding device is acquired in real time, and the average bit rate information is calculated from it. For example, the real-time bit rate information corresponding to the most recent frames (for example, 10 frames) is determined, and the mean of these values is used as the average bit rate information. As the real-time bit rate information is acquired and updated, the average bit rate information is updated synchronously.
  • The average bit rate information calculated in real time is used as the current network state information. The higher the average bit rate information, the better the network state reflected by the network state information.
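The running average described in S201 can be sketched with a fixed-length window. This is an illustrative sketch only: the class name `BitrateAverager`, the kbps unit, and the default window of 10 samples (taken from the 10-frame example above) are assumptions.

```python
from collections import deque


class BitrateAverager:
    """Keeps the most recent `window` real-time bit rate samples and their mean."""

    def __init__(self, window: int = 10) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque(maxlen=...) silently discards the oldest sample when full.
        self._samples = deque(maxlen=window)

    def update(self, bitrate_kbps: float) -> float:
        """Record one real-time sample and return the updated average."""
        self._samples.append(bitrate_kbps)
        return sum(self._samples) / len(self._samples)


if __name__ == "__main__":
    averager = BitrateAverager(window=3)
    for sample in (300.0, 600.0, 900.0, 1200.0):
        print(averager.update(sample))
```

Each call to `update` both ingests a new sample and refreshes the average, which mirrors the synchronous update described in the text.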
  • The real-time bit rate information is obtained continuously, and the average bit rate information is calculated based on the following formula:
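The formula is not present in this text. Going by the surrounding description (the mean of the most recent real-time bit rate values), it is presumably of the following form, where $N$ (the number of recent samples, 10 in the example) and $Br(i)$ (the $i$-th most recent real-time bit rate) are symbols assumed here rather than taken from the original:

```latex
Br_{av} = \frac{1}{N} \sum_{i=1}^{N} Br(i)
```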
  • Network bandwidth, packet loss rate, and similar measures may also be used as network state information to reflect the network state.
  • S202 Receive decoding feedback information returned by the decoding end, where the decoding feedback information is generated by the decoding end based on the successfully decoded encoded image frames.
  • When the decoding end receives an encoded image frame and successfully decodes it to obtain a decoded image frame, it generates decoding feedback information pointing to that encoded image frame and returns it to the video encoding device.
  • After successfully decoding the encoded image frame, the decoding end returns an acknowledgement message (ACK message) to the video encoding device as the decoding feedback information. The ACK message indicates the corresponding encoded image frame.
  • S203 Determine the encoded image frame corresponding to the decoding feedback information, and determine that encoded image frame as a referable image frame.
  • The video encoding device provided in this embodiment establishes and maintains a reference frame list of a set length.
  • The reference frame list is configured to record the image frames determined to be available for reference.
  • The successfully decoded encoded image frame is identified from the decoding feedback information, determined as a referable image frame, and added to the reference frame list.
  • If the video encoding device receives decoding feedback information indicating that a certain frame has been successfully decoded, that frame can be referenced and can serve as a reference frame in subsequent encoding.
  • If the encoding end does not receive decoding feedback information for a frame, the frame was lost during transmission or could not be decoded because of decoding hardware limitations, and it cannot be used as a reference.
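A fixed-length reference frame list driven by ACK feedback might look like the sketch below. It is an assumption-laden illustration: frames are identified by integer frame numbers, the default length of 8 is arbitrary, and the class and method names are hypothetical.

```python
from collections import deque
from typing import Optional


class ReferenceFrameList:
    """Fixed-length record of frame numbers acknowledged (ACKed) by the decoding end."""

    def __init__(self, max_len: int = 8) -> None:
        # When the list is full, the oldest acknowledged frame is dropped.
        self._acked = deque(maxlen=max_len)

    def on_ack(self, frame_no: int) -> None:
        """Called when decoding feedback confirms frame_no was decoded."""
        if frame_no not in self._acked:
            self._acked.append(frame_no)

    def is_referable(self, frame_no: int) -> bool:
        """A frame without an ACK (lost or undecodable) is never referable."""
        return frame_no in self._acked

    def nearest_before(self, frame_no: int) -> Optional[int]:
        """Closest acknowledged frame that precedes frame_no, if any."""
        earlier = [n for n in self._acked if n < frame_no]
        return max(earlier) if earlier else None


if __name__ == "__main__":
    refs = ReferenceFrameList(max_len=3)
    for acked in (1, 2, 4):
        refs.on_ack(acked)
    print(refs.is_referable(3), refs.nearest_before(5))
```

Frames that never receive an ACK simply never enter the list, which matches the rule that unacknowledged frames cannot be used as a reference.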
  • S204 Determine the window sliding speed corresponding to the network state information based on the correspondence between the network state information and the window sliding speed.
  • This embodiment determines the window sliding speed corresponding to the current network state information based on the correspondence between network state information and window sliding speed.
  • The correspondence between the network state information and the window sliding speed is determined by the following calculation formula:
  • where SS represents the window sliding speed,
  • Br_av represents the average bit rate information,
  • and the two remaining coefficients in the formula are constants determined from experimental data.
  • The size of the tiny image group is set by sliding the window, so that small tiny image groups can be used to avoid long runs of consecutive lost frames.
  • S205 Determine the size of the image group according to the window sliding speed, the limit value constraint of the image group, and the frame loss rate information, where the image group size is positively correlated with the network state information.
  • Once the window sliding speed has been determined, the image group size can be determined.
  • The ratio of the number of dropped frames to the total number of frames in the most recent set time period (10 seconds, for example) is used as the frame loss rate information, as shown in the following formula:
  • where FL_av represents the frame loss rate information,
  • D(j) represents the number of frames lost in the j-th second,
  • and fps represents the set number of frames to be encoded in 1 second.
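The formula is not present in this text. From the definitions above (dropped frames divided by total frames over the most recent period), it can be reconstructed as follows, assuming the 10-second window used in the example:

```latex
FL_{av} = \frac{\sum_{j=1}^{10} D(j)}{10 \cdot fps}
```

For instance, with $fps = 30$ and 3 frames lost in each of the 10 seconds, $FL_{av} = 30 / 300 = 0.1$.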
  • Determining the image group size according to the window sliding speed, the image group limit value constraint, and the frame loss rate information, as provided by this embodiment, specifically includes steps S2051-S2053:
  • S2051 Based on the frame loss rate information being lower than the set frame loss rate range, determine the image group size to be the set out-of-range upper limit value.
  • The out-of-range upper limit value may be set to a specific numerical value or may be left unlimited.
  • If no specific value is set for the out-of-range upper limit, then when the network loses no frames the size of the tiny image group is unlimited. This is equivalent to generating only one tiny image group from the start of the video call to its end, so that the image frames in it can always be encoded in an order-dependent manner to obtain higher video quality.
  • S2052 Based on the frame loss rate information being within the set frame loss rate range, determine the image group size according to the window sliding speed, the image group limit value constraint, and the frame loss rate information.
  • When the frame loss rate information is within the set frame loss rate range, the image group size is determined according to the window sliding speed, the image group limit value constraint, and the frame loss rate information, and the minimum image group size is set to the in-range lower limit value, which is greater than or equal to the out-of-range lower limit value.
  • The real-time image group size is determined according to the preset correspondence between the window sliding speed, the image group limit value constraint, the frame loss rate information, and the image group size.
  • The image group limit value constraint constrains the upper and lower limits of the sliding window corresponding to the image group size. The window sliding speed represents the rate of change of the sliding window, and that rate of change is proportional to the image group size.
  • The frame loss rate information is inversely proportional to the image group size.
  • When the frame loss rate information is within the set frame loss rate range and the image group size is determined as above, it is further provided that, when the real-time bit rate information is lower than a set low bit rate value, the maximum image group size is the set low bit rate upper limit value. For example, if the low bit rate value is set to 200 kb/s and the low bit rate upper limit to 5, then whenever the real-time bit rate information is below 200 kb/s the image group size of a tiny image group is at most 5, which avoids generating long tiny image groups under poor network conditions.
  • The minimum image group size is further set to the in-range lower limit value, which is greater than or equal to the out-of-range lower limit value.
  • The out-of-range lower limit value is the image group size determined in step S2053 below, when the frame loss rate information is higher than the set frame loss rate range.
  • For example, if the out-of-range lower limit value is set to 1, the in-range lower limit value can be set to 2. That is, when the frame loss rate information is within the set frame loss rate range, every image group size obtained is greater than or equal to 2.
  • S2053 Based on the frame loss rate information being higher than the set frame loss rate range, determine the image group size directly to be the set out-of-range lower limit value (for example, 1).
  • In this case, the to-be-encoded reference frame of the current to-be-encoded image frame is a successfully decoded frame taken from the reference frame list.
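The three cases of steps S2051 to S2053 can be summarized as a piecewise rule. The formula below is a reconstruction from the symbol definitions that follow, not a copy of the original. Whether the boundaries $Th_1$ and $Th_2$ are inclusive is an assumption, and the value 1 in the last case is the example out-of-range lower limit given in the text:

```latex
Gopsize =
\begin{cases}
\infty, & FL_{av} < Th_1 \\
f(M, FL_{av}, SS), & Th_1 \le FL_{av} \le Th_2 \\
1, & FL_{av} > Th_2
\end{cases}
```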
  • where Gopsize represents the image group size,
  • ∞ represents no restriction on the image group size,
  • M represents the image group limit value constraint, which includes the upper and lower limits of the sliding window,
  • FL_av represents the frame loss rate information,
  • SS represents the window sliding speed,
  • Th_1 and Th_2 represent the lower limit value and the upper limit value of the set frame loss rate range, respectively,
  • and f(M, FL_av, SS) represents the preset correspondence, or calculation function, between the window sliding speed, the image group limit value constraint, the frame loss rate information, and the image group size.
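The decision described by these symbols can be sketched in code. The source does not define f(M, FL_av, SS), so `f_hypothetical` below is an invented stand-in that merely grows with SS, shrinks with FL_av, and is clamped to the window limits M. The threshold values, the default limits, and all function names are likewise assumptions made for illustration.

```python
import math
from typing import Optional, Sequence, Union


def frame_loss_rate(dropped_per_second: Sequence[int], fps: int) -> float:
    """FL_av: dropped frames divided by total frames over the window."""
    total = len(dropped_per_second) * fps
    if total <= 0:
        raise ValueError("need at least one second of data and a positive fps")
    return sum(dropped_per_second) / total


def f_hypothetical(m_low: int, m_high: int, fl_av: float, ss: float) -> int:
    """Invented stand-in for f(M, FL_av, SS); NOT the formula from the source."""
    raw = ss * (1.0 - fl_av)  # larger SS -> larger size, more loss -> smaller size
    return int(max(m_low, min(m_high, round(raw))))


def choose_group_size(
    fl_av: float,
    ss: float,
    th1: float,
    th2: float,
    m_low: int = 2,          # in-range lower limit (example value from the text)
    m_high: int = 30,        # assumed upper window limit
    avg_kbps: Optional[float] = None,
    low_kbps: float = 200.0,  # low bit rate value from the text's example
    low_cap: int = 5,         # low bit rate upper limit from the text's example
) -> Union[int, float]:
    if fl_av < th1:
        return math.inf       # S2051: no restriction on the group size
    if fl_av > th2:
        return 1              # S2053: out-of-range lower limit
    size = f_hypothetical(m_low, m_high, fl_av, ss)  # S2052
    if avg_kbps is not None and avg_kbps < low_kbps:
        size = max(m_low, min(size, low_cap))        # cap under low bit rate
    return size


if __name__ == "__main__":
    fl = frame_loss_rate([3] * 10, fps=30)
    print(fl, choose_group_size(fl, ss=20.0, th1=0.01, th2=0.2))
```

Returning `math.inf` for the unrestricted case keeps the three branches in one return type family; a caller could equally treat it as "never close the current tiny image group".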
  • S207 Determine whether the to-be-encoded image frame is the first frame of its tiny image group. If yes, go to step S208; otherwise go to step S209.
  • Each to-be-encoded image frame of each tiny image group is encoded in sequence.
  • The image frames in a tiny image group are of two types, I frames and P frames.
  • An I frame is a frame encoded without inter-frame prediction.
  • A P frame is a frame encoded using inter-frame prediction from a temporally earlier encoded frame.
  • The I frames are determined by a preset rule (for example, one image frame in every 20 or every 50 frames of the frame sequence to be encoded is designated an I frame).
  • It is judged whether the to-be-encoded image frame is the first frame of the tiny image group. If so, jump to step S208; otherwise (the to-be-encoded image frame is then inside the tiny image group), jump to step S209.
  • S208 Determine the nearest acknowledged (ACK) frame among the referable image frames, and use it as the to-be-encoded reference frame of the to-be-encoded image frame.
  • When the to-be-encoded image frame is the first frame of a tiny image group, the referable image frame in the reference frame list that is closest to the current to-be-encoded image frame is taken as the to-be-encoded reference frame.
  • Encoding against this reference frame ensures that, when the first to-be-encoded image frame of a tiny image group is encoded, the frame it references is an encoded image frame that has been successfully decoded. After the to-be-encoded reference frame is determined, jump to step S210.
  • S209 Use the encoded image frame immediately preceding the to-be-encoded image frame as the to-be-encoded reference frame of the to-be-encoded image frame.
  • The encoded image frame preceding the current to-be-encoded image frame is determined and used as the to-be-encoded reference frame of the to-be-encoded image frame.
  • Determining the reference frame in an order-dependent manner within the tiny image group effectively improves video quality.
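Steps S207 to S209 amount to a position-dependent choice of reference frame, sketched below. Frames are identified by integer frame numbers, and returning `None` when no acknowledged frame exists is an assumption of this sketch (the caller might then fall back to intra coding); the function name is hypothetical.

```python
from typing import Iterable, Optional


def pick_reference(frame_no: int, group_start: int, acked: Iterable[int]) -> Optional[int]:
    """Choose the reference frame for frame_no.

    group_start is the number of the first frame of the tiny image group that
    contains frame_no; acked holds frame numbers confirmed by the decoding end.
    """
    if frame_no == group_start:
        # S208: first frame of the group, use the nearest acknowledged frame.
        earlier = [n for n in acked if n < frame_no]
        return max(earlier) if earlier else None
    # S209: inside the group, depend on the immediately preceding frame.
    return frame_no - 1


if __name__ == "__main__":
    acked_frames = [5, 7, 8]
    print(pick_reference(10, 10, acked_frames))  # first frame of a group
    print(pick_reference(11, 10, acked_frames))  # frame inside the group
```

Only the first frame of each group consults the acknowledged list, so a lost frame can break the dependency chain for at most the remainder of one tiny image group.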
  • S210 Encode the to-be-encoded image frame based on the to-be-encoded reference frame.
  • As described above, the frame sequence to be encoded is divided into a plurality of tiny image groups, and referable image frames are determined according to the decoding feedback information returned by the decoding end. When an image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the frame in its tiny image group, and the frame is encoded based on that reference frame.
  • Network feedback is used to dynamically adjust the size of the tiny image groups and the selection of reference frames during encoding, achieving a dynamic balance between video quality and fluency and effectively improving video transmission quality.
  • When the network state is good, the video freezes less and users pay more attention to video quality, so a longer tiny image group is set to improve quality through long-range sequential dependence.
  • When the network state is poor, users are more tolerant of video quality and the smoothness of the video call becomes the key concern. A short tiny image group is therefore set so that the encoder responds quickly when frame loss occurs and finds a referable image frame again, avoiding the freezes caused by continuous frame loss. In this way, a dynamic balance of video quality and fluency is achieved in the real-time video call scenario.
  • FIG. 4 is a schematic structural diagram of a video encoding apparatus based on network feedback provided by an embodiment of the present application.
  • The video encoding apparatus based on network feedback includes a network detection module 41, a decoding feedback module 42, a picture group division module 43 and a reference frame determination module 44.
  • The network detection module 41 is configured to determine the current network state information according to the real-time bitrate information provided by the network transport layer;
  • the decoding feedback module 42 is configured to determine, among the encoded image frames and according to the decoding feedback information provided by the decoding end, the referable image frames that can be used as references;
  • the picture group division module 43 is configured to determine the picture group size based on the network state information, and to divide the to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size,
  • where the picture group size is positively correlated with the network state information;
  • the encoding execution module is configured to determine, from the referable image frames and according to the position of the to-be-encoded image frame in the micro picture group, the to-be-encoded reference frame of the to-be-encoded image frame, and to encode the to-be-encoded image frame based on the to-be-encoded reference frame.
  • The picture group size is adjusted in real time according to the network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame.
  • Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality.
  • FIG. 5 is a schematic structural diagram of a video encoding device based on network feedback provided by an embodiment of the present application.
  • The video encoding device based on network feedback includes an input device 53, an output device 54, a memory 52, and one or more processors 51. The memory 52 is configured to store one or more programs; when the one or more programs are executed by the one or more processors 51, the one or more processors 51 implement the video encoding method based on network feedback provided in the above embodiments.
  • The video encoding apparatus, device and computer based on network feedback provided above can be configured to execute the video encoding method based on network feedback provided in any of the above embodiments, and have the corresponding functions and beneficial effects.
  • Embodiments of the present application further provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to execute the video encoding method based on network feedback provided by the foregoing embodiments.
  • For the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the video encoding method based on network feedback described above, and can also perform related operations in the video encoding method based on network feedback provided by any embodiment of the present application.
  • The embodiments of the present application further provide a program for video encoding based on network feedback.
  • When the program is executed, operations related to the video encoding method based on network feedback provided by any embodiment of the present application can be implemented.
  • The video encoding apparatus, device, storage medium and program based on network feedback provided in the above embodiments can perform the video encoding method based on network feedback provided in any embodiment of the present application; for technical details not described in detail in the above embodiments, reference may be made to the video encoding method based on network feedback provided by any embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

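Embodiments of the present application provide a video encoding method, apparatus, device, and storage medium based on network feedback. In the technical solution provided, the picture group size is adjusted in real time according to network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame. Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality.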

Description

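Video encoding method, apparatus, device, and storage medium based on network feedback
This application claims priority to Chinese patent application No. 202110065168.0, filed with the China Patent Office on January 18, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a video encoding method, apparatus, device, and storage medium based on network feedback.
Background
With the development of 4G and 5G technologies, the information carriers of the mobile Internet are shifting from text and pictures to video. Various video applications have emerged and have been widely welcomed by consumers. However, the sharp increase in content data volume puts considerable pressure on network bandwidth.
In the related art, video frames are generally encoded with a multi-reference-frame strategy: when encoding the current frame, the encoder traverses the reference frame list in order and selects the frame most similar to the current frame as the reference frame. However, when frames are lost in transmission, consecutive frame loss within a picture group easily occurs at the decoding end, which degrades video transmission quality.
Summary
The embodiments of the present application provide a video encoding method, apparatus, device, and storage medium based on network feedback, so as to improve video transmission quality.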
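In a first aspect, an embodiment of the present application provides a video encoding method based on network feedback, including:
determining current network state information according to real-time bitrate information provided by a network transport layer;
determining, among encoded image frames and according to decoding feedback information provided by a decoding end, referable image frames that can be used as references;
determining a picture group size based on the network state information, and dividing a to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information;
determining, from the referable image frames and according to the position of a to-be-encoded image frame in the micro picture group, a to-be-encoded reference frame of the to-be-encoded image frame, and encoding the to-be-encoded image frame based on the to-be-encoded reference frame.
In a second aspect, an embodiment of the present application provides a video encoding apparatus based on network feedback, including a network detection module, a decoding feedback module, a picture group division module, and a reference frame determination module, wherein:
the network detection module is configured to determine current network state information according to real-time bitrate information provided by a network transport layer;
the decoding feedback module is configured to determine, among encoded image frames and according to decoding feedback information provided by a decoding end, referable image frames that can be used as references;
the picture group division module is configured to determine a picture group size based on the network state information, and to divide a to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information;
the encoding execution module is configured to determine, from the referable image frames and according to the position of a to-be-encoded image frame in the micro picture group, a to-be-encoded reference frame of the to-be-encoded image frame, and to encode the to-be-encoded image frame based on the to-be-encoded reference frame.
In a third aspect, an embodiment of the present application provides a video encoding device based on network feedback, including a memory and one or more processors;
the memory is configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the video encoding method based on network feedback according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to execute the video encoding method based on network feedback according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a program for video encoding based on network feedback; when the program is executed, operations related to the video encoding method based on network feedback according to the first aspect can be implemented.
In the embodiments of the present application, the picture group size is adjusted in real time according to the network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame. Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality.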
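Brief Description of the Drawings
FIG. 1 is a flowchart of a video encoding method based on network feedback provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a video-call application scenario provided by an embodiment of the present application;
FIG. 3 is a flowchart of another video encoding method based on network feedback provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a video encoding apparatus based on network feedback provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a video encoding device based on network feedback provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, specific embodiments of the present application are described in further detail below with reference to the drawings. It should be understood that the specific embodiments described here merely explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire content. Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as sequential processing, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. The processing may be terminated when its operations are completed, but may also have additional steps not included in the drawings. The processing may correspond to a method, function, procedure, subroutine, subprogram, and so on.
FIG. 1 shows a flowchart of a video encoding method based on network feedback provided by an embodiment of the present application. The method may be performed by a video encoding apparatus based on network feedback, which may be implemented in hardware and/or software and integrated in a video encoding device based on network feedback.
The following description takes the case where the video encoding apparatus based on network feedback performs the method as an example. The method can be applied to a video encoding device. Referring to FIG. 1, the video encoding method based on network feedback includes: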
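S101: Determine the current network state information according to the real-time bitrate information provided by the network transport layer.
The video encoding apparatus based on network feedback provided in this embodiment (hereinafter, the video encoding apparatus) acts as the encoding end. After encoding a to-be-encoded image frame to obtain an encoded image frame, it sends the encoded image frame over the network to the decoding end. The decoding end decodes the encoded image frame to obtain the corresponding decoded image frame and plays the decoded image frames in order, so that the video is transmitted and played.
The encoding end and the decoding end provided in this embodiment may be computer devices with real-time video call capability, such as smartphones. For example, during a real-time video call between mobile terminals, the mobile terminal that captures and provides the video picture information acts as the encoding end, encoding and sending the captured video picture information; the mobile terminal that receives and plays the video picture information acts as the decoding end, decoding and playing the received video picture information. The video picture information is transmitted between the encoding end and the decoding end via a server.
For example, while to-be-encoded image frames are being encoded, the network transport layer of the encoding end is used to obtain the current real-time bitrate information, and the current network state information is determined from it. It can be understood that the higher the real-time bitrate reflected by the real-time bitrate information, the better the network state reflected by the corresponding network state information.
S102: Determine, among the encoded image frames and according to the decoding feedback information provided by the decoding end, the referable image frames that can be used as references.
After receiving an encoded image frame and successfully decoding it, the decoding end returns decoding feedback information to the video encoding apparatus provided in this embodiment.
For example, the encoded image frames that were successfully decoded can be determined from the decoding feedback information returned by the decoding end, and these successfully decoded encoded image frames are determined as the referable image frames that can be used as references.
S103: Determine the picture group size based on the network state information, and divide the to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information.
For example, the picture group size corresponding to the current network state information is determined from the correspondence between network state information and picture group size. The picture group size is positively correlated with the network state information: the better the network state reflected by the network state information, the larger the corresponding picture group size; the worse the network state, the smaller the picture group size.
Using a sliding window, the to-be-encoded frame sequence is divided according to the picture group size determined above into a plurality of micro picture groups, and the size of each micro picture group corresponds to the picture group size determined above.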
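The division in S103 can be sketched as follows. This is a minimal Python illustration rather than the encoder's actual logic: `split_into_micro_groups` and the `size_at` callback are hypothetical names, and the sketch assumes that the picture group size is sampled once, at the moment each micro picture group is opened.

```python
from typing import Callable, List


def split_into_micro_groups(num_frames: int,
                            size_at: Callable[[int], int]) -> List[range]:
    """Split frames 0..num_frames-1 into consecutive micro picture groups.

    size_at(start) is a hypothetical callback returning the picture group
    size in effect when the group beginning at frame `start` is opened.
    """
    groups = []
    start = 0
    while start < num_frames:
        size = max(1, size_at(start))          # a group holds at least one frame
        end = min(num_frames, start + size)    # the last group may be shorter
        groups.append(range(start, end))
        start = end
    return groups
```

With a size of 4 while the network is good and 1 once it degrades at frame 8, ten frames are split into groups covering frames 0-3, 4-7, 8, and 9.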
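S104: Determine, from the referable image frames and according to the position of the to-be-encoded image frame in the micro picture group, the to-be-encoded reference frame of the to-be-encoded image frame, and encode the to-be-encoded image frame based on the to-be-encoded reference frame.
For example, during encoding, the position of the to-be-encoded image frame currently being encoded within its micro picture group is determined, and the to-be-encoded reference frame used as the reference is determined according to that position.
Depending on the position of the to-be-encoded image frame in the micro picture group, different reference frame determination methods can be used to determine the to-be-encoded reference frame. For example, when the to-be-encoded image frame is the first frame of a micro picture group, the nearest referable image frame can be selected, from the referable image frames in the preceding micro picture groups that have been confirmed as successfully decoded, as the to-be-encoded reference frame for this encoding.
In one embodiment, after the to-be-encoded reference frame is determined, the to-be-encoded image frame is encoded based on it to obtain an encoded image frame, which can then be sent over the network to the decoding end for decoding and playback.
FIG. 2 is a schematic diagram of a video-call application scenario provided by an embodiment of the present application. As shown in FIG. 2, the scenario involves an encoding end, a decoding end, and a server. The encoding end and the decoding end establish a video call connection through the server. After the decoding end receives the video call request sent by the encoding end, the encoding end starts its video capture module, captures image information, and generates the corresponding to-be-encoded frame sequence in real time.
Based on the video encoding method based on network feedback provided by the embodiments of the present application, the encoding end splits the to-be-encoded frame sequence in real time into micro picture groups of different sizes, determines the to-be-encoded reference frame according to the position of the to-be-encoded image frame in its micro picture group, encodes the frame to obtain an encoded image frame, and sends it to the decoding end through the server in the form of a bitstream.
The decoding end decodes the received encoded image frames, plays them after successful decoding, and feeds the corresponding decoding feedback information back to the encoding end. The encoding end uses the feedback on network state information to dynamically adjust the reference frame selection during encoding. When the network state is good, a longer micro picture group is built so that the frames inside it can depend on one another sequentially, which improves video quality. When the network state is poor, a shorter micro picture group is built so that, once a frame is lost, the end of the micro picture group is reached quickly, a new referable image frame is found, and a new sequential dependence begins, which avoids long runs of consecutive frame loss.
As described above, the picture group size is adjusted in real time according to the network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame. Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality.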
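On the basis of the above embodiment, FIG. 3 shows a flowchart of another video encoding method based on network feedback provided by an embodiment of the present application, which is a more specific version of the method described above. Referring to FIG. 3, the video encoding method based on network feedback includes:
S201: Obtain, in real time, the real-time bitrate information provided by the network transport layer, calculate average bitrate information based on the real-time bitrate information, and determine the current network state information based on the average bitrate information.
In one embodiment, the real-time bitrate information provided by the network transport layer of the video encoding apparatus is obtained in real time, and the average bitrate information is calculated from it. For example, the real-time bitrate information corresponding to the most recent several frames (for example, 10 frames) is determined, and the mean of these real-time bitrate values is used as the average bitrate information. As the real-time bitrate information is obtained and updated in real time, the average bitrate information is updated accordingly.
This embodiment uses the average bitrate information calculated in real time as the current network state information. It can be understood that the higher the average bitrate information, the better the network state reflected by the network state information.
For example, the real-time bitrate information is obtained continuously, and the average bitrate information is calculated with the following formula: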
$$Br_{av} = \frac{1}{10}\sum_{i=1}^{10} Br(i)$$
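The above formula takes the average bitrate over 10 frames as an example and can be adapted as needed, where Br_av denotes the average bitrate information and Br(i) denotes the real-time bitrate of the i-th frame.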
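The rolling average in S201 can be sketched as follows. This is a minimal Python illustration: the class name `AverageBitrate` and the kb/s unit are assumptions, and averaging over fewer samples until the window fills is a choice of this sketch rather than something the text specifies.

```python
from collections import deque


class AverageBitrate:
    """Rolling mean of the most recent per-frame bitrate samples."""

    def __init__(self, window: int = 10) -> None:
        # window=10 mirrors the 10-frame example and can be adjusted.
        self.samples = deque(maxlen=window)

    def update(self, bitrate_kbps: float) -> float:
        """Record one real-time bitrate sample and return the current average."""
        self.samples.append(bitrate_kbps)   # the oldest sample drops out automatically
        return sum(self.samples) / len(self.samples)
```

Calling `update` once per encoded frame keeps the average in step with the real-time bitrate, as the step requires.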
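In other embodiments, network bandwidth, packet loss rate, and the like can also be used as the network state information to reflect the network state.
S202: Receive the decoding feedback information returned by the decoding end, the decoding feedback information being generated by the decoding end based on a successfully decoded encoded image frame.
In one embodiment, when the decoding end receives an encoded image frame and successfully decodes it to obtain a decoded image frame, it generates decoding feedback information that points to that encoded image frame and returns it to the video encoding apparatus.
For example, after successfully decoding an encoded image frame, the decoding end returns an acknowledgment message (ACK message) to the video encoding apparatus as the decoding feedback information. On receiving it, the video encoding apparatus knows exactly which encoded image frame was successfully decoded.
S203: Determine the encoded image frame corresponding to the decoding feedback information, and determine that encoded image frame as a referable image frame that can be used as a reference.
The video encoding apparatus provided in this embodiment creates and maintains a reference frame list of a set length. The list is configured to record the referable image frames; when a to-be-encoded image frame is encoded, a suitable referable image frame can be selected from the list as the reference frame.
In one embodiment, the successfully decoded encoded image frame is determined from the decoding feedback information, determined as a referable image frame that can be used as a reference, and added to the reference frame list.
It can be understood that if the video encoding apparatus receives decoding feedback information indicating that a frame has been successfully decoded, that frame can be referenced and can serve as a reference frame in subsequent encoding. If the encoder does not receive decoding feedback information for a frame, that frame was lost in transmission or has not yet been decoded successfully because of decoding hardware limits, and it cannot be used as a reference.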
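The bookkeeping in S202 and S203 can be sketched as follows. This is a minimal Python illustration in which frames are identified by integer indices; the class name, the method names, and the default list length of 16 are all hypothetical, since the text only says the list has "a set length".

```python
from collections import deque
from typing import Optional


class ReferenceFrameList:
    """Fixed-length record of encoded frames the decoding end has acknowledged."""

    def __init__(self, max_len: int = 16) -> None:
        # max_len is an assumed value; the oldest entry is evicted when full.
        self._acked = deque(maxlen=max_len)

    def on_ack(self, frame_index: int) -> None:
        """Handle one decoding feedback (ACK) message for an encoded frame."""
        if frame_index not in self._acked:
            self._acked.append(frame_index)

    def is_referable(self, frame_index: int) -> bool:
        """A frame without an ACK is treated as lost or not yet decoded."""
        return frame_index in self._acked

    def nearest_before(self, current_index: int) -> Optional[int]:
        """Closest acknowledged frame preceding current_index, or None."""
        candidates = [i for i in self._acked if i < current_index]
        return max(candidates) if candidates else None
```

Frames that never receive an ACK simply never enter the list, which matches the rule that unacknowledged frames cannot be used as references.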
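S204: Determine the window sliding speed corresponding to the network state information, based on the correspondence between network state information and window sliding speed.
In one embodiment, the window sliding speed for the current network state information is determined from the correspondence between network state information and window sliding speed, which is given by the following formula:
SS = α * Br_av
where SS denotes the window sliding speed, Br_av denotes the average bitrate information, and α and β denote experimental constant coefficients (determined from experimental data).
It should be explained that the embodiments of the present application set the size of the micro picture group by means of a sliding window: in a good network state, the micro picture group is enlarged to improve video quality, and in a poor network state, it is shrunk to avoid long runs of consecutive frame loss.
In addition, network conditions fluctuate. Once a frame is lost in a good network state, consecutive frame loss occurs inside a long micro picture group, while in a poor network a short-lived network peak can produce a long micro picture group. The picture group size of the micro picture group is therefore limited by setting a window sliding speed. When the network state is good, a higher window sliding speed is set so that the micro picture group can respond quickly once a frame is lost, which avoids consecutive frame loss. When the network state is poor, a lower window sliding speed is set, which reduces the cases where a brief improvement in network state produces a long micro picture group and leaves a latent risk of frame loss.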
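S205: Determine the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information, the picture group size being positively correlated with the network state information.
The frame loss rate information fed back by the network is obtained in real time and compared with a set frame loss rate range to judge whether it is within, above, or below the range. The way the picture group size is determined is then chosen according to where the frame loss rate information falls relative to the range.
For example, this embodiment uses the ratio of the number of lost frames to the total number of frames within the most recent set period (10 seconds, for example) as the frame loss rate information, as shown in the following formula: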
$$FL_{av} = \frac{\sum_{j=1}^{10} D(j)}{10 \times fps}$$
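where FL_av denotes the frame loss rate information, D(j) denotes the number of frames lost in the j-th second, and fps denotes the set number of frames to be encoded per second.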
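The frame loss rate above can be computed as in the following minimal Python sketch. The function name and argument names are hypothetical; the list holds one lost-frame count per second of the observation period, so a 10-second period corresponds to a list of 10 entries.

```python
from typing import Sequence


def frame_loss_rate(dropped_per_second: Sequence[int], fps: int) -> float:
    """Share of frames lost over the period: sum of D(j) over (seconds * fps)."""
    if fps <= 0 or not dropped_per_second:
        raise ValueError("fps must be positive and at least one second is needed")
    total_frames = len(dropped_per_second) * fps   # frames scheduled in the period
    return sum(dropped_per_second) / total_frames
```

For instance, 9 lost frames over 10 seconds at 30 frames per second give 9 / 300, a frame loss rate of 3%.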
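In one embodiment, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information specifically includes steps S2051 to S2053:
S2051: When the frame loss rate information is below the set frame loss rate range, determine the picture group size as the set out-of-range upper limit.
In one embodiment, when the frame loss rate information is below the set frame loss rate range (that is, less than the set lower frame loss rate threshold), the picture group size is determined as the set out-of-range upper limit. The out-of-range upper limit may be set to a specific value, or its value may be left unrestricted.
It should be explained that leaving the out-of-range upper limit unrestricted means that the size of the micro picture group is not limited while the network loses no frames. This is equivalent to producing a single micro picture group from the start of the video call to its end, so that the image frames in the group are always encoded in a sequentially dependent manner and higher video quality is obtained.
S2052: When the frame loss rate information is within the set frame loss rate range, determine the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information.
In one embodiment, when the frame loss rate information is within the set frame loss rate range, the picture group size is determined according to the window sliding speed, the picture group limit constraint, and the frame loss rate information, and the minimum picture group size is set to the in-range lower limit, which is greater than or equal to the out-of-range lower limit.
When the frame loss rate information is within the set frame loss rate range, the real-time picture group size is determined from the preset correspondence that maps the window sliding speed, the picture group limit constraint, and the frame loss rate information to the picture group size.
The picture group limit constraint is configured to bound the upper and lower window limits of the sliding window corresponding to the picture group size. The window sliding speed represents the rate of change of the sliding window, which is directly proportional to the picture group size; the frame loss rate information in this embodiment is inversely proportional to the picture group size.
In one embodiment, when the picture group size is determined in this way with the frame loss rate information within the set range, it is further set that when the real-time bitrate information is lower than a set low bitrate value, the maximum picture group size is a set low-bitrate upper limit. For example, with the set low bitrate value at 200 kb/s and the low-bitrate upper limit at 5, the picture group size of a micro picture group is capped at 5 whenever the real-time bitrate information is below 200 kb/s, which reduces the occurrence of long micro picture groups in a poor network state.
At the same time, the minimum picture group size is further set to the in-range lower limit, which is greater than or equal to the out-of-range lower limit. The out-of-range lower limit is the picture group size determined in step S2053 below when the frame loss rate information is above the set frame loss rate range. For example, this embodiment sets the out-of-range lower limit to 1 and may set the in-range lower limit to 2, so that every picture group size obtained while the frame loss rate information is within the set range is greater than or equal to 2.
S2053: When the frame loss rate information is above the set frame loss rate range, determine the picture group size as the set out-of-range lower limit.
In one embodiment, when the frame loss rate information is above the set frame loss rate range, the picture group size is directly determined as the set out-of-range lower limit (for example, the picture group size is set to 1). In one embodiment, when the frame loss rate information is above the set frame loss rate range, in order to guarantee video transmission quality and reduce frame loss, the to-be-encoded reference frame used by the current to-be-encoded image frame is determined as a referable image frame in the reference frame list that has been confirmed as successfully decoded.
The determination of the picture group size in steps S2051 to S2053 can be performed with the following formula: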
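$$Gopsize = \begin{cases} \infty, & FL_{av} < Th_1 \\ f(M, FL_{av}, SS), & Th_1 \le FL_{av} \le Th_2 \\ 1, & FL_{av} > Th_2 \end{cases}$$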
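where Gopsize denotes the picture group size, ∞ means that the picture group size is not limited, M denotes the picture group limit constraint, which includes the upper and lower window limits of the sliding window, FL_av denotes the frame loss rate information, SS denotes the window sliding speed, Th_1 and Th_2 denote the lower and upper limits of the set frame loss rate range, and f(M, FL_av, SS) denotes the preset correspondence, or calculation function, that maps the window sliding speed, the picture group limit constraint, and the frame loss rate information to the picture group size.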
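Steps S2051 to S2053 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the text does not define f, so the form used here (window sliding speed divided by frame loss rate, clamped to the window limits) is a hypothetical stand-in that only respects the stated proportionalities. The thresholds, the window upper limit, and α are made-up values; 200 kb/s, 5, 2, and 1 come from the examples in the text.

```python
import math
from dataclasses import dataclass


@dataclass
class GopConfig:
    th1: float = 0.01                # lower end of the frame loss rate range (assumed)
    th2: float = 0.10                # upper end of the frame loss rate range (assumed)
    window_min: int = 2              # in-range lower limit (example from the text)
    window_max: int = 30             # window upper limit in M (assumed)
    out_of_range_min: int = 1        # out-of-range lower limit (example from the text)
    low_bitrate_kbps: float = 200.0  # set low bitrate value (example from the text)
    low_bitrate_cap: int = 5         # low-bitrate upper limit (example from the text)
    alpha: float = 0.001             # coefficient in SS = alpha * Br_av (assumed)


def gop_size(loss_rate: float, avg_bitrate_kbps: float,
             realtime_bitrate_kbps: float, cfg: GopConfig) -> float:
    """Return the picture group size; math.inf means the size is not limited."""
    # S2051: below the range. The extra zero check only guards the division below.
    if loss_rate < cfg.th1 or loss_rate <= 0.0:
        return math.inf
    # S2053: above the range.
    if loss_rate > cfg.th2:
        return cfg.out_of_range_min
    # S2052: inside the range, using a hypothetical f(M, FL_av, SS).
    speed = cfg.alpha * avg_bitrate_kbps                  # window sliding speed SS
    raw = speed / loss_rate                               # grows with SS, shrinks with loss
    size = max(cfg.window_min, min(cfg.window_max, round(raw)))
    if realtime_bitrate_kbps < cfg.low_bitrate_kbps:
        size = min(size, cfg.low_bitrate_cap)             # cap long groups on weak links
    return size
```

With the assumed defaults, a 5% frame loss rate at an average of 1000 kb/s yields a size of 20, and the same loss rate with a real-time bitrate of 150 kb/s is capped at 5.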
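S206: Divide the to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size.
S207: Judge whether the to-be-encoded image frame is the first frame of the micro picture group. If so, jump to step S208; otherwise, jump to step S209.
In one embodiment, after the to-be-encoded frame sequence is divided into a plurality of micro picture groups with different picture group sizes, the to-be-encoded image frames of each micro picture group are encoded in turn. When a to-be-encoded image frame is encoded, it is first judged whether it is an I frame. If so, no reference frame is specified for it and it is encoded directly; otherwise, its position in the corresponding micro picture group is further judged.
The image frames in a micro picture group are either I frames or P frames. An I frame is an encoded frame that does not use inter-frame prediction, and a P frame is an encoded frame that uses a temporally preceding encoded frame for inter-frame prediction. I frames are determined by a preset rule (for example, designating every 20th or every 50th image frame of the to-be-encoded frame sequence as an I frame).
In one embodiment, whether the to-be-encoded image frame is the first frame of the micro picture group is judged from its position in the corresponding micro picture group. If so, jump to step S208; otherwise (the to-be-encoded image frame is then inside the micro picture group), jump to step S209.
S208: Determine the nearest acknowledgment (ACK) frame among the referable image frames, and use that ACK frame as the to-be-encoded reference frame of the to-be-encoded image frame.
In one embodiment, if the to-be-encoded image frame is the first frame of the micro picture group, the referable image frame in the reference frame list that is closest to the current to-be-encoded image frame is determined as the to-be-encoded reference frame. The to-be-encoded image frame is encoded with this frame as its reference, which ensures that the reference used when encoding the first to-be-encoded image frame of a micro picture group is always an encoded image frame that has been successfully decoded. After the to-be-encoded reference frame of the to-be-encoded image frame is determined, jump to step S210.
S209: Use the encoded image frame immediately preceding the to-be-encoded image frame as the to-be-encoded reference frame of the to-be-encoded image frame.
When the current to-be-encoded image frame lies inside a micro picture group, the encoded image frame immediately preceding it is determined and used as its to-be-encoded reference frame. Determining the reference frame in this sequentially dependent manner inside the micro picture group effectively improves video quality.
S210: Encode the to-be-encoded image frame based on the to-be-encoded reference frame.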
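The reference frame selection in steps S207 to S209 can be sketched as follows. This is a minimal Python illustration with frames identified by integer indices; `choose_reference` and its parameters are hypothetical names. The text does not say what happens when no acknowledged frame exists yet, so raising an error here is an assumption of this sketch (a caller could, for example, fall back to encoding an I frame).

```python
from typing import Iterable, Optional


def choose_reference(frame_index: int, gop_start: int, is_i_frame: bool,
                     acked_frames: Iterable[int]) -> Optional[int]:
    """Return the index of the reference frame, or None for an I frame."""
    if is_i_frame:
        return None                     # I frames use no inter-frame prediction
    if frame_index == gop_start:
        # S208: first frame of a micro picture group, so use the nearest
        # earlier frame that the decoding end has acknowledged.
        earlier = [i for i in acked_frames if i < frame_index]
        if not earlier:
            raise LookupError("no acknowledged frame available as reference")
        return max(earlier)
    # S209: inside the group, depend on the immediately preceding encoded frame.
    return frame_index - 1
```

For a micro picture group starting at frame 8 with frames 2, 5, and 6 acknowledged, frame 8 references frame 6 and frame 9 references frame 8.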
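As described above, the picture group size is adjusted in real time according to the network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame. Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality. During a real-time video call in a good network environment, the video freezes less and users pay more attention to video quality, so a longer micro picture group is set to improve quality through long-range sequential dependence. In a poor network environment, users tolerate lower video quality and the smoothness of the call becomes the key, so a shorter micro picture group is set to respond quickly when a frame is lost and to find a new referable image frame, avoiding the freezes caused by consecutive frame loss. In this way, video quality and fluency are dynamically balanced in real-time video call scenarios.
FIG. 4 is a schematic structural diagram of a video encoding apparatus based on network feedback provided by an embodiment of the present application. Referring to FIG. 4, the video encoding apparatus based on network feedback includes a network detection module 41, a decoding feedback module 42, a picture group division module 43, and a reference frame determination module 44.
The network detection module 41 is configured to determine the current network state information according to the real-time bitrate information provided by the network transport layer. The decoding feedback module 42 is configured to determine, among the encoded image frames and according to the decoding feedback information provided by the decoding end, the referable image frames that can be used as references. The picture group division module 43 is configured to determine the picture group size based on the network state information, and to divide the to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information. The encoding execution module is configured to determine, from the referable image frames and according to the position of the to-be-encoded image frame in the micro picture group, the to-be-encoded reference frame of the to-be-encoded image frame, and to encode the to-be-encoded image frame based on the to-be-encoded reference frame.
As described above, the picture group size is adjusted in real time according to the network state information, and the to-be-encoded frame sequence is divided into a plurality of micro picture groups based on that size. At the same time, the referable image frames that can be used as references are determined according to the decoding feedback information returned by the decoding end. When a to-be-encoded image frame is encoded, a suitable to-be-encoded reference frame is selected from the referable image frames according to the position of the to-be-encoded image frame in its micro picture group, and the frame is encoded based on that reference frame. Network feedback is thus used to dynamically adjust both the size of the micro picture groups and the selection of reference frames during encoding, which balances video quality against fluency and effectively improves video transmission quality.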
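An embodiment of the present application further provides a video encoding device based on network feedback, which can integrate the video encoding apparatus based on network feedback provided by the embodiments of the present application. FIG. 5 is a schematic structural diagram of a video encoding device based on network feedback provided by an embodiment of the present application. Referring to FIG. 5, the video encoding device based on network feedback includes an input device 53, an output device 54, a memory 52, and one or more processors 51. The memory 52 is configured to store one or more programs; when the one or more programs are executed by the one or more processors 51, the one or more processors 51 implement the video encoding method based on network feedback provided in the above embodiments. The video encoding apparatus, device, and computer based on network feedback provided above can be configured to execute the video encoding method based on network feedback provided in any of the above embodiments, and have the corresponding functions and beneficial effects.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to execute the video encoding method based on network feedback provided in the above embodiments. Of course, for the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the video encoding method based on network feedback described above, and can also perform related operations in the video encoding method based on network feedback provided by any embodiment of the present application.
An embodiment of the present application further provides a program for video encoding based on network feedback; when the program is executed, operations related to the video encoding method based on network feedback provided by any embodiment of the present application can be implemented. The video encoding apparatus, device, storage medium, and program based on network feedback provided in the above embodiments can perform the video encoding method based on network feedback provided by any embodiment of the present application; for technical details not described in detail in the above embodiments, reference may be made to the video encoding method based on network feedback provided by any embodiment of the present application.
The above are only preferred embodiments of the present application and the technical principles applied. The present application is not limited to the specific embodiments described here; the various obvious changes, readjustments, and substitutions that those skilled in the art can make do not depart from the scope of protection of the present application. Therefore, although the present application has been described in some detail through the above embodiments, it is not limited to them and may include further equivalent embodiments without departing from its concept; the scope of the present application is determined by the scope of the claims.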

Claims (11)

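  1. A video encoding method based on network feedback, applied to a video encoding device, comprising:
    determining current network state information according to real-time bitrate information provided by a network transport layer;
    determining, among encoded image frames and according to decoding feedback information provided by a decoding end, referable image frames that can be used as references;
    determining a picture group size based on the network state information, and dividing a to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information;
    determining, from the referable image frames and according to the position of a to-be-encoded image frame in the micro picture group, a to-be-encoded reference frame of the to-be-encoded image frame, and encoding the to-be-encoded image frame based on the to-be-encoded reference frame.
  2. The video encoding method based on network feedback according to claim 1, wherein determining the current network state information according to the real-time bitrate information provided by the network transport layer comprises:
    obtaining, in real time, the real-time bitrate information provided by the network transport layer;
    calculating average bitrate information based on the real-time bitrate information, and determining the current network state information based on the average bitrate information.
  3. The video encoding method based on network feedback according to any one of claims 1 to 2, wherein determining, among the encoded image frames and according to the decoding feedback information provided by the decoding end, the referable image frames that can be used as references comprises:
    receiving the decoding feedback information returned by the decoding end, the decoding feedback information being generated by the decoding end based on a successfully decoded encoded image frame;
    determining the encoded image frame corresponding to the decoding feedback information, and determining that encoded image frame as a referable image frame that can be used as a reference.
  4. The video encoding method based on network feedback according to any one of claims 1 to 3, wherein determining the picture group size based on the network state information comprises:
    determining a window sliding speed corresponding to the network state information, based on a correspondence between network state information and window sliding speed;
    determining the picture group size according to the window sliding speed, a picture group limit constraint, and frame loss rate information.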
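  5. The video encoding method based on network feedback according to claim 4, wherein determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information comprises:
    when the frame loss rate information is below a set frame loss rate range, determining the picture group size as a set out-of-range upper limit;
    when the frame loss rate information is within the set frame loss rate range, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information;
    when the frame loss rate information is above the set frame loss rate range, determining the picture group size as a set out-of-range lower limit.
  6. The video encoding method based on network feedback according to claim 5, wherein, when the frame loss rate information is within the set frame loss rate range, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information comprises:
    when the frame loss rate information is within the set frame loss rate range, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information, and setting the minimum picture group size to an in-range lower limit, the in-range lower limit being greater than or equal to the out-of-range lower limit.
  7. The video encoding method based on network feedback according to claim 5, wherein, when the frame loss rate information is within the set frame loss rate range, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information comprises:
    when the frame loss rate information is within the set frame loss rate range, determining the picture group size according to the window sliding speed, the picture group limit constraint, and the frame loss rate information, and setting the maximum picture group size to a set low-bitrate upper limit when the real-time bitrate information is lower than a set low bitrate value.
  8. The video encoding method based on network feedback according to any one of claims 1 to 7, wherein determining, from the referable image frames and according to the position of the to-be-encoded image frame in the micro picture group, the reference frame of the to-be-encoded image frame comprises:
    judging whether the to-be-encoded image frame is the first frame of the micro picture group;
    in response to a positive judgment, determining the nearest acknowledgment (ACK) frame among the referable image frames, and using that ACK frame as the to-be-encoded reference frame of the to-be-encoded image frame;
    in response to a negative judgment, using the encoded image frame immediately preceding the to-be-encoded image frame as the to-be-encoded reference frame of the to-be-encoded image frame.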
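  9. A video encoding apparatus based on network feedback, comprising a network detection module, a decoding feedback module, a picture group division module, and a reference frame determination module, wherein:
    the network detection module is configured to determine current network state information according to real-time bitrate information provided by a network transport layer;
    the decoding feedback module is configured to determine, among encoded image frames and according to decoding feedback information provided by a decoding end, referable image frames that can be used as references;
    the picture group division module is configured to determine a picture group size based on the network state information, and to divide a to-be-encoded frame sequence into a plurality of micro picture groups according to the picture group size, the picture group size being positively correlated with the network state information;
    the encoding execution module is configured to determine, from the referable image frames and according to the position of a to-be-encoded image frame in the micro picture group, a to-be-encoded reference frame of the to-be-encoded image frame, and to encode the to-be-encoded image frame based on the to-be-encoded reference frame.
  10. A video encoding device based on network feedback, comprising a memory and one or more processors, wherein:
    the memory is configured to store one or more programs;
    when the one or more programs are executed by the one or more processors, the one or more processors implement the video encoding method based on network feedback according to any one of claims 1 to 8.
  11. A storage medium containing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to execute the video encoding method based on network feedback according to any one of claims 1 to 8.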
PCT/CN2022/071471 2021-01-18 2022-01-11 Video encoding method, apparatus, device, and storage medium based on network feedback WO2022152137A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110065168.0A CN112929747B (zh) 2021-01-18 2021-01-18 Video encoding method, apparatus, device, and storage medium based on network feedback
CN202110065168.0 2021-01-18

Publications (1)

Publication Number Publication Date
WO2022152137A1 true WO2022152137A1 (zh) 2022-07-21

Family

ID=76163397

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071471 WO2022152137A1 (zh) 2021-01-18 2022-01-11 Video encoding method, apparatus, device, and storage medium based on network feedback

Country Status (2)

Country Link
CN (1) CN112929747B (zh)
WO (1) WO2022152137A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024120214A1 (zh) * 2022-12-08 2024-06-13 广州市百果园网络科技有限公司 Encoding control method, apparatus, device, storage medium, and product

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
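CN112929747B (zh) * 2021-01-18 2023-03-31 北京洛塔信息技术有限公司 Video encoding method, apparatus, device, and storage medium based on network feedback
CN113573063B (zh) * 2021-06-16 2024-06-14 百果园技术(新加坡)有限公司 Video encoding and decoding method and apparatus
CN116264622A (zh) * 2021-12-15 2023-06-16 腾讯科技(深圳)有限公司 Video encoding method, apparatus, electronic device, and storage medium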

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1168055A (zh) * 1996-04-19 1997-12-17 冲电气工业株式会社 Image encoder, image decoder, and image transmission system
US6621868B1 (en) * 1998-03-02 2003-09-16 Nippon Telegraph And Telephone Corporation Video communication system and video communication method
CN101360243A (zh) * 2008-09-24 2009-02-04 腾讯科技(深圳)有限公司 Video communication system and method based on feedback reference frames
US20100124275A1 (en) * 2008-11-18 2010-05-20 National Taiwan University System and method for dynamically encoding multimedia streams
CN104780367A (zh) * 2015-04-13 2015-07-15 浙江宇视科技有限公司 Method and apparatus for dynamically adjusting GOP length
CN106973066A (zh) * 2017-05-10 2017-07-21 福建星网智慧科技股份有限公司 Method and system for transmitting H.264-encoded video data in real-time communication
CN110113610A (zh) * 2019-04-23 2019-08-09 西安万像电子科技有限公司 Data transmission method and apparatus
CN112929747A (zh) * 2021-01-18 2021-06-08 北京洛塔信息技术有限公司 Video encoding method, apparatus, device, and storage medium based on network feedback

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
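CN102404572B (zh) * 2011-11-22 2013-11-27 西交利物浦大学 Video encoding and decoding system and method based on systematic RS codes under delay constraints
EP3376766B1 (en) * 2017-03-14 2019-01-30 Axis AB Method and encoder system for determining gop length for encoding video
CN110392284B (zh) * 2019-07-29 2022-02-01 腾讯科技(深圳)有限公司 Video encoding and video data processing methods, apparatus, computer device, and storage medium
CN110708569B (zh) * 2019-09-12 2021-08-13 北京达佳互联信息技术有限公司 Video processing method, apparatus, electronic device, and storage medium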

Also Published As

Publication number Publication date
CN112929747B (zh) 2023-03-31
CN112929747A (zh) 2021-06-08

Similar Documents

Publication Publication Date Title
WO2022152137A1 (zh) Video encoding method, apparatus, device, and storage medium based on network feedback
US9918085B2 (en) Media coding for loss recovery with remotely predicted data units
KR102058759B1 (ko) Signaling of state information for a decoded picture buffer and reference picture lists
CN102883152A (zh) Adaptive media streaming
US11694316B2 (en) Method and apparatus for determining experience quality of VR multimedia
CN114666225B (zh) Bandwidth adjustment method, data transmission method, device, and computer storage medium
CN110248192B (zh) Encoder switching, decoder switching, and screen sharing methods, and screen sharing system
CN1240092A (zh) Video coding
US11356739B2 (en) Video playback method, terminal apparatus, and storage medium
US8249070B2 (en) Methods and apparatuses for performing scene adaptive rate control
US10701124B1 (en) Handling timestamp inaccuracies for streaming network protocols
CN113573101A (zh) Video encoding method, apparatus, device, and storage medium
EP1187460A2 (en) Image transmitting method and apparatus and image receiving method and apparatus
US20200202872A1 (en) Combined forward and backward extrapolation of lost network data
CN112866746A (zh) Multi-stream cloud game control method, apparatus, device, and storage medium
US6667698B2 (en) Distributed compression and transmission method and system
CN115868161A (zh) Reinforcement learning based rate control
US8681860B2 (en) Moving picture compression apparatus and method of controlling operation of same
CN107493478B (zh) Method and device for setting encoding frame rate
US20070110168A1 (en) Method for generating high quality, low delay video streaming
US20050195901A1 (en) Video compression method optimized for low-power decompression platforms
WO2023142665A1 (zh) Image processing method, apparatus, computer device, storage medium, and program product
CN112004087B (zh) Rate control optimization method using two frames as the control unit, and storage medium
CN112004082B (zh) Optimization method for rate control using two frames as the control unit
US20220279190A1 (en) Transmission apparatus, reception apparatus, transmission method, reception method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22739016

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22739016

Country of ref document: EP

Kind code of ref document: A1