WO2023279978A1 - Video coding method and apparatus, device, and storage medium

Video coding method and apparatus, device, and storage medium

Info

Publication number
WO2023279978A1
WO2023279978A1 PCT/CN2022/100805 CN2022100805W WO2023279978A1 WO 2023279978 A1 WO2023279978 A1 WO 2023279978A1 CN 2022100805 W CN2022100805 W CN 2022100805W WO 2023279978 A1 WO2023279978 A1 WO 2023279978A1
Authority
WO
WIPO (PCT)
Prior art keywords
code rate
target
resolution
change information
current video
Prior art date
Application number
PCT/CN2022/100805
Other languages
English (en)
Chinese (zh)
Inventor
张凯明
要瑞宵
Original Assignee
百果园技术(新加坡)有限公司
张凯明
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司 and 张凯明
Publication of WO2023279978A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234363Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234381Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2402Monitoring of the downstream path of the transmission network, e.g. bandwidth available
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440263Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44209Monitoring of downstream path of the transmission network originating from a server, e.g. bandwidth variations of a wireless network

Definitions

  • The embodiments of the present application relate to the field of computer technologies, and in particular to a video coding method, apparatus, device, and storage medium.
  • instant messaging technology has penetrated into all aspects of people's daily life and work.
  • video technology has created more diversified instant messaging scenarios, such as video calls, remote conferences, online live broadcasts, online education, and so on.
  • the encoder must adjust the video encoding strategy in real time based on the network quality, so that the bit rate consumed by video encoding conforms to the network bandwidth, so as to avoid playback freezes and playback failures caused by poor network quality.
  • The regulation of video coding strategies falls into two main categories: encoder-internal bit rate control and encoder-external frame rate/resolution control.
  • Encoder-internal bit rate control refers to adjusting the bit rate within a certain range at a fixed frame rate/resolution; when the network quality fluctuates greatly, encoder-external frame rate/resolution control is usually performed to smooth the bit rate consumed by video encoding.
  • Embodiments of the present application provide a video encoding method, apparatus, device, and storage medium. The technical solutions are as follows:
  • the embodiment of the present application provides a video encoding method, the method comprising:
  • acquiring code rate change information corresponding to current network conditions, where the code rate change information is used to indicate fluctuations of the current network conditions relative to historical network conditions;
  • an embodiment of the present application provides a video encoding device, the device comprising:
  • a code rate information acquisition module configured to obtain code rate change information corresponding to current network conditions, where the code rate change information is used to indicate fluctuations of the current network conditions relative to historical network conditions;
  • a parameter information acquisition module configured to acquire parameter change information corresponding to the current video content, where the parameter change information is used to indicate the content complexity of the current video content;
  • a resolution determination module configured to determine a target resolution based on the code rate change information and the parameter change information;
  • a video coding module configured to use the target resolution to encode the current video frame.
  • an embodiment of the present application provides a computer device, the computer device includes a processor and a memory, a computer program is stored in the memory, and the computer program is loaded and executed by the processor to implement the above-mentioned video encoding method.
  • an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the foregoing video encoding method is implemented.
  • an embodiment of the present application provides a computer program product, which, when the computer program product is run on a computer device, causes the computer device to execute the above video encoding method.
  • The resolution used for encoding the current video frame is determined by combining the fluctuation of the network quality with the complexity of the video content, so that the resolution can be adjusted more accurately. Because human eyes are more sensitive to changes in picture quality in simple scenes, this application takes the complexity of the video content into account when adjusting the resolution and considers the sensitivity of human eyes to video content of different complexity, which makes resolution control more accurate and comprehensive. Moreover, because both the fluctuation of network quality and the complexity of the video content are considered, frequent resolution switching is effectively avoided when the network quality fluctuates too frequently. This helps to reduce the breathing effect caused by frequent fluctuations in network quality and improves the user's video viewing and calling experience.
  • FIG. 1 is a schematic diagram of an instant messaging scenario provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of a video coding method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of key block matching provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of a video encoding method provided by another embodiment of the present application.
  • FIG. 5 is a block diagram of a video encoding device provided by an embodiment of the present application.
  • FIG. 6 is a block diagram of a video encoding device provided by another embodiment of the present application.
  • Encoder-internal bit rate control is applicable to scenarios with a fixed frame rate/resolution.
  • When the network quality fluctuates greatly, encoder-external frame rate/resolution control is usually performed.
  • Frame rate control and resolution control are introduced and explained separately below.
  • Frame rate control: the client maintains a frame-level bit rate statistics queue that caches the bit rate actually consumed by recently encoded frames, and from it calculates the frame-level average consumed bit rate of the encoded frames. Before the current frame enters the encoder, this average is compared with the target frame-level bit rate estimated from the actual bandwidth to judge whether enough bit rate remains to encode the current frame. If the remaining bit rate is insufficient, the current frame is discarded directly, and the client waits for the next captured frame.
  • the frame-level average bitrate consumption calculation formula of encoded frames is as follows:
  • R_av is the frame-level average consumed code rate of the encoded frames
  • R_k is the actual consumed code rate of the k-th frame
  • n is the sequence number of the current frame
  • TargteBits is the target frame-level code rate
  • FPS is the set frame rate
  • is a threshold coefficient; if the discard flag is 1, it means that the current frame needs to be discarded; if the discard flag is 0, it means that the current frame does not need to be discarded.
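  • The frame-dropping decision described above can be sketched as follows. This is a minimal illustration, not the formula from the application: the class name `FrameDropController`, the window size, and the exact comparison rule (average consumed bits greater than the threshold coefficient times the target frame-level bits) are assumptions made for the example.

```python
from collections import deque


class FrameDropController:
    """Hypothetical sketch of frame-rate control by frame dropping."""

    def __init__(self, window: int = 30, alpha: float = 1.0):
        # Frame-level bit rate statistics queue for recently encoded frames.
        self.recent_bits = deque(maxlen=window)
        # Threshold coefficient (assumed to scale the target frame-level bits).
        self.alpha = alpha

    def record_encoded(self, actual_bits: float) -> None:
        # Cache the bits actually consumed by a frame that was encoded.
        self.recent_bits.append(actual_bits)

    def average_bits(self) -> float:
        # Frame-level average consumed bits over the cached frames.
        if not self.recent_bits:
            return 0.0
        return sum(self.recent_bits) / len(self.recent_bits)

    def discard_flag(self, target_bits: float) -> int:
        # 1: discard the current frame; 0: encode it.
        # Assumed rule: drop when the recent average exceeds the scaled target.
        return 1 if self.average_bits() > self.alpha * target_bits else 0


controller = FrameDropController(window=3, alpha=1.0)
for bits in (100.0, 200.0, 300.0):
    controller.record_encoded(bits)
print(controller.discard_flag(target_bits=150.0))
print(controller.discard_flag(target_bits=250.0))
```

  • A bounded `deque` is used so that only the most recent frames influence the average; the source does not state how long the queue is.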
  • Resolution control: the code table solution. For videos of different resolutions, a series of bit rates is set in ascending order; at each bit rate the video is encoded and decoded to obtain the reconstructed video, and the PSNR (Peak Signal to Noise Ratio) is calculated. A larger PSNR value means better video quality. From a large amount of experimental data, the most suitable bit rate range for a given resolution can be obtained. For example, 270P (standard definition) video has the best video quality when the bit rate is between 320 kbps (kilobits per second) and 500 kbps.
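  • As a reminder of how the quality metric is computed, the sketch below uses the standard PSNR definition, 10·log10(MAX² / MSE), on flat lists of 8-bit sample values. The function name `psnr` and the list-based input are choices made for this example; the application does not specify an implementation.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and reconstructed samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


print(psnr([10, 20, 30], [11, 21, 31]))
```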
  • a code table solution can be formed by integrating the appropriate bit rate ranges of various resolutions.
  • the code table scheme looks like this:
  • Here, level represents the different resolution levels, such as 180P (smooth), 270P (standard definition), and 360P (high definition); level mr refers to the default resolution level, for example a default resolution of 270P; uplevel and domnlevel represent, respectively, the upper and lower bounds of the most suitable bit rate range for level mr.
  • When the target frame-level bit rate exceeds these bounds, the resolution must be switched according to the code table scheme. For example, assuming that level mr is 270P, level mr -1 is 180P, and domnlevel is 350 kbps, then when TargteBits falls from 400 kbps to 300 kbps, the resolution needs to be switched from 270P to 180P.
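  • The code table lookup can be sketched as below. The table values are illustrative placeholders, except that the 350 kbps lower bound for 270P follows the example in the text; the names `Level`, `CODE_TABLE`, and `switch_level`, and the one-level-at-a-time switching rule, are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    lower_kbps: float  # lower bound of the suitable bit rate range
    upper_kbps: float  # upper bound of the suitable bit rate range


# Hypothetical code table, ordered from lowest to highest resolution.
CODE_TABLE = [
    Level("180P", 0.0, 350.0),
    Level("270P", 350.0, 500.0),
    Level("360P", 500.0, float("inf")),
]


def switch_level(index: int, target_kbps: float, table=CODE_TABLE) -> int:
    """Return the index of the level to use for the given target bit rate."""
    level = table[index]
    if target_kbps < level.lower_kbps and index > 0:
        return index - 1  # below the lower bound: step down one level
    if target_kbps > level.upper_kbps and index < len(table) - 1:
        return index + 1  # above the upper bound: step up one level
    return index  # within bounds: keep the current level


current = 1  # 270P
current = switch_level(current, 400.0)  # stays at 270P
current = switch_level(current, 300.0)  # drops to 180P
print(CODE_TABLE[current].name)
```

  • Because the decision depends only on the instantaneous target bit rate, a target that oscillates around a bound makes this scheme switch back and forth.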
  • Frame rate control focuses more on ensuring video quality, obtaining bit rate stability mainly at the expense of video smoothness.
  • This type of adjustment causes obvious video stuttering, and in severe cases the video freezes completely, which is very unfriendly to the user's subjective experience.
  • Frame rate control is too aggressive and is suitable only for extreme situations.
  • Resolution control focuses more on ensuring the fluency of the video, and adjusts the video resolution in real time based on the network quality to achieve the effect of balancing the video bit rate.
  • the resolution control method is relatively gentler, more friendly to the user's subjective experience, and has a wider range of application scenarios.
  • However, with resolution control, when the network quality fluctuates too frequently, resolution switching also becomes too frequent, which produces a noticeable breathing effect.
  • In addition, the human eye is more sensitive to changes in picture quality in simple scenes. In such scenes, resolution control needs to be more cautious to avoid a sharp drop or rise in picture quality caused by excessive resolution adjustment, which would affect the user's viewing and call experience.
  • an embodiment of the present application provides a video encoding method, which can more efficiently and flexibly determine the resolution used when encoding the current video frame during the video encoding process.
  • the technical solutions provided by the present application will be described in conjunction with several embodiments.
  • The "instant messaging" mentioned in the following embodiments refers to instant messaging combined with video, such as video calls, teleconferencing, online live broadcast, online education, and so on.
  • The embodiment of the present application does not limit the way of carrying out "instant messaging". Users can carry out instant messaging through applications, webpages, applets, etc. that have instant messaging functions.
  • the application program with the instant messaging function includes an instant messaging application program, a video playing application program, a live broadcast application program, a game application program, an educational application program, etc., which are not limited in this embodiment of the present application.
  • The embodiments of this application introduce video encoding and video decoding in the context of instant messaging only for the convenience of explaining the encoding and decoding process; this does not constitute a limitation on the technical solution of this application.
  • The video encoding method provided by the embodiments of the present application can also be applied in non-instant-messaging scenarios, for example when users share captured video to personal social platforms or webpages, all of which fall within the scope of protection of the present application.
  • FIG. 1 shows a schematic diagram of an instant messaging scenario provided by an embodiment of the present application.
  • The instant messaging scenario includes a video encoding end 110 and a video decoding end 120.
  • the video encoding end 110 refers to the generation end of video content in the instant messaging scene, which is used to generate video content, encode the generated video content, and send the encoded video content to the video decoding end 120, etc.
  • the video decoding end 120 refers to the receiving end of the video content in the instant messaging scene, and is used for decoding the received video content, playing the decoded video content, and so on.
  • the video encoding end 110 and the video decoding end 120 communicate with each other through a network.
  • The video encoding end 110 and the video decoding end 120 may communicate directly, or communicate indirectly through a server (not shown in FIG. 1); for example, the video encoding end 110 sends the encoded video content to the server, and the server then sends the encoded video content to the video decoding end 120.
  • the embodiment of the present application does not limit the device type of the video encoding end 110 and/or the video decoding end 120.
  • The video encoding end 110 and/or the video decoding end 120 is any of the following devices: computer equipment, terminal equipment, servers, wearable devices, Bluetooth devices, in-vehicle devices, etc.
  • The video encoding end 110 and the video decoding end 120 are of the same device type, or are of different device types.
  • The video encoding end 110 and the video decoding end 120 can be implemented as the same device, that is, the device has the functions of both the video encoding end 110 and the video decoding end 120.
  • FIG. 1 shows the video encoding end 110 and the video decoding end 120 as different devices only as an example, which does not constitute a limitation on the technical solution of the present application.
  • The video encoding end 110 and the video decoding end 120 are distinguished only for the convenience of explaining the process of video encoding and video decoding, which does not constitute a limitation on the technical solution of the application.
  • The video encoding end 110 can encode and send video content, and can also receive and decode video content; the video decoding end 120 can receive and decode video content, and can also encode and send video content.
  • all parties involved in instant messaging can perform steps such as generating and encoding video content, decoding and playing video content generated by other parties, and so on.
  • FIG. 2 shows a flowchart of a video encoding method provided by an embodiment of the present application. This method can be applied to the video encoding end 110 in the instant messaging scenario shown in FIG. 1. As shown in FIG. 2, the method includes the following steps (210-240).
  • Step 210: Acquire code rate change information corresponding to the current network condition, where the code rate change information is used to indicate the fluctuation of the current network condition relative to the historical network condition.
  • Network quality has an important influence on parameters used in the video encoding process, such as the bit rate and the resolution.
  • From the above content, it can be seen that when the network quality fluctuates greatly and the range of bit rate control is exceeded, the frame rate and/or the resolution need to be regulated.
  • Because frame rate control is too aggressive and is applicable only to extreme cases, the more moderate resolution control method is adopted when the network quality fluctuates greatly.
  • the code rate change information is used to indicate the fluctuation of the current network condition relative to the historical network condition.
  • the historical network condition may be the network quality from a certain time before the current time to the current time, or may be the network quality from the start time of encoding the video content to the current time.
  • Step 220: Acquire parameter change information corresponding to the current video content, where the parameter change information is used to indicate the content complexity of the current video content.
  • The video coding method provided by the embodiments of the present application also considers the sensitivity of human eyes to video content of different complexity, so as to avoid drastic changes in video picture quality.
  • In the embodiments of the present application, the parameter change information of the quantization parameter is used to indicate the content complexity of the current video content.
  • Step 230: Determine the target resolution based on the code rate change information and the parameter change information.
  • After the code rate change information and the parameter change information are obtained, the target resolution used when encoding the current video frame is determined based on these two aspects of information.
  • In one case, the video encoding end determines from the two aspects of information that resolution adjustment is not necessary, and directly uses the original resolution of the current video frame as the target resolution.
  • In another case, the video encoding end determines from the two aspects of information that resolution adjustment is required, and then increases or decreases the original resolution of the current video frame to obtain the target resolution. For further description of how the target resolution is determined, refer to the following embodiments; details are not repeated here.
  • Step 240: Encode the current video frame with the target resolution.
  • After the video encoding end determines the target resolution, it can use the target resolution to encode the current video frame, so that the video bit rate matches the network bandwidth and problems such as playback freezes or playback failures are avoided.
  • the encoding of the current video frame by the video encoding end using the target resolution includes: performing scaling processing on the current video frame by the video encoding end based on the target resolution.
  • In one case, encoding the current video frame with the target resolution includes: performing upsampling processing on the current video frame; and, when the resolution of the upsampled current video frame matches the target resolution, using the target resolution to encode the upsampled current video frame.
  • In another case, encoding the current video frame with the target resolution includes: performing downsampling processing on the current video frame; and, when the resolution of the downsampled current video frame matches the target resolution, using the target resolution to encode the downsampled current video frame.
  • The resolution of the upsampled (or downsampled) current video frame matching the target resolution includes: the resolution of the upsampled (or downsampled) current video frame is equal to the target resolution, or the difference between that resolution and the target resolution is less than a set threshold.
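  • The scaling direction and the matching test can be sketched as follows. How the "difference" between two resolutions is measured is not stated in this passage; the per-dimension pixel difference used here, and the names `scaling_mode` and `resolution_matches`, are assumptions for illustration only.

```python
def scaling_mode(frame_size, target_size):
    """Decide whether a (width, height) frame must be up- or downsampled."""
    frame_pixels = frame_size[0] * frame_size[1]
    target_pixels = target_size[0] * target_size[1]
    if target_pixels > frame_pixels:
        return "upsample"
    if target_pixels < frame_pixels:
        return "downsample"
    return "none"


def resolution_matches(scaled_size, target_size, threshold=0):
    """True if the scaled frame equals the target or is within the threshold.

    Assumption: the difference is measured per dimension, in pixels.
    """
    dw = abs(scaled_size[0] - target_size[0])
    dh = abs(scaled_size[1] - target_size[1])
    if dw == 0 and dh == 0:
        return True
    return dw < threshold and dh < threshold


print(scaling_mode((320, 180), (480, 270)))
print(resolution_matches((478, 270), (480, 270), threshold=4))
```

  • A non-zero threshold tolerates scalers that round output dimensions, for example to even values.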
  • To sum up, in the technical solution provided by the embodiments of the present application, the resolution used for encoding the current video frame is determined by combining the fluctuation of the network quality with the complexity of the video content, so that the resolution is adjusted more accurately. Because human eyes are more sensitive to changes in picture quality in simple scenes, this application takes the complexity of the video content into account when adjusting the resolution and considers the sensitivity of human eyes to video content of different complexity, which makes resolution control more accurate and comprehensive. Moreover, because both the fluctuation of network quality and the complexity of the video content are considered, frequent resolution switching is effectively avoided when the network quality fluctuates too frequently. This helps to reduce the breathing effect caused by frequent fluctuations in network quality and improves the user's video viewing and calling experience.
  • step 210 includes the following steps (212-216).
  • Step 212 determine the real-time target code rate, where the real-time target code rate is used to indicate the current network condition.
  • the video encoding end uses a bandwidth estimation method to determine the real-time target code rate, and the bandwidth estimation method includes but not limited to: a link delay method, a packet detection bandwidth estimation method, and the like.
  • Step 214 obtaining the historical average code rate, which is used to indicate the historical network conditions.
  • In order to obtain the fluctuation of the current network condition relative to the historical network condition, the video encoding end also needs to determine the historical average bit rate used to indicate the historical network condition.
  • the historical average bit rate is the average bit rate from the start moment of video encoding to the current moment.
  • the historical average bit rate may be the average code rate between the time when the video call is started and the current time; or, the historical average code rate is the average code rate in a period of time before the current time, for example, the average code rate within the 5 seconds before the current time.
  • the video encoding end needs to maintain a code rate buffer, and the average value in the code rate buffer is the historical average code rate, which is used to indicate historical network conditions.
  • Step 216 Determine code rate change information based on the real-time target code rate and the historical average code rate.
  • After the video encoding end determines the real-time target bit rate and the historical average bit rate, it can compare the real-time target bit rate with the historical average bit rate to determine the bit rate change information.
  • the above step 216 includes: determining a code rate difference between the real-time target code rate and the historical average code rate; and determining code rate change information based on the code rate difference.
  • determining the code rate change information based on the code rate difference includes: when the absolute value of the code rate difference is greater than the first threshold and the code rate difference is positive, determining that the code rate change information includes that the code rate changes drastically and that the code rate increases suddenly; when the absolute value of the code rate difference is greater than the first threshold and the code rate difference is negative, determining that the code rate change information includes that the code rate changes drastically and that the code rate drops suddenly; when the absolute value of the code rate difference is smaller than the first threshold, determining that the code rate change information includes that the code rate does not change drastically.
  • the first threshold is a positive number.
  • the real-time target code rate is $R$, and the historical average code rate is $R_{av}$, where $R_{av} = \mathrm{avg}(\mathrm{buffer\_R})$, buffer_R is the code rate cache maintained by the video encoding end, and $\mathrm{avg}(\cdot)$ represents the averaging function. With the first threshold denoted Th1, the code rate change information has the following situations:
  • when $|R - R_{av}| < \mathrm{Th1}$, the code rate change information includes that the code rate has not changed drastically;
  • when $R - R_{av} > \mathrm{Th1}$, the code rate change information includes a drastic change in the code rate, and a sudden increase in the code rate;
  • when $R - R_{av} < -\mathrm{Th1}$, the code rate change information includes that the code rate changes drastically, and the code rate drops sharply.
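  • The comparison in step 216 can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the class name `RateChangeDetector`, the label strings, a window counted in samples instead of seconds, and treating the unspecified equality case as "no drastic change" are all assumptions.

```python
from collections import deque


class RateChangeDetector:
    """Classify the real-time target code rate against a rolling history."""

    def __init__(self, th1, window=150):
        self.th1 = th1                        # first threshold (positive)
        self.buffer_r = deque(maxlen=window)  # code rate cache, oldest dropped first

    def classify(self, r):
        """Return 'stable', 'sudden_increase' or 'sudden_drop' for rate r."""
        if not self.buffer_r:
            result = "stable"                 # no history yet (assumed behaviour)
        else:
            r_av = sum(self.buffer_r) / len(self.buffer_r)  # historical average
            diff = r - r_av                   # code rate difference
            if abs(diff) > self.th1:
                result = "sudden_increase" if diff > 0 else "sudden_drop"
            else:
                result = "stable"             # equality is treated as stable (assumption)
        self.buffer_r.append(r)               # update the cache after classifying
        return result


detector = RateChangeDetector(th1=200, window=5)
for rate in (1000, 1050, 1500, 400):
    print(rate, detector.classify(rate))
```

  • A bounded `deque` is used here so that old samples fall out of the cache automatically; a time-stamped buffer would be closer to the 5-second window mentioned in the text.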
  • the above step 220 includes the following steps (222-226).
  • Step 222 acquiring a first quantization parameter value, where the first quantization parameter value refers to the quantization parameter value of the previous video frame of the current video frame.
  • the content complexity of the previous video frame can be used to represent the content complexity of the current video frame. Therefore, in the embodiment of the present application, the video encoding end needs to obtain the quantization parameter value of the previous video frame of the current video frame, that is, the first quantization parameter value.
  • Step 224 acquiring a second quantization parameter value, where the second quantization parameter value refers to an average value of quantization parameter values of at least one historical video frame.
  • the video encoding end also needs to determine the mean value of the quantization parameter values of at least one historical video frame, that is, the second quantization parameter value.
  • at least one historical video frame is all video frames from the start moment of video encoding to the current moment; for example, when the technical solution of the present application is applied to a video call scenario, the at least one historical video frame may be all video frames from the moment the video call is started to the current moment; or, the at least one historical video frame is all video frames in a period of time before the current moment; or, the at least one historical video frame is the N video frames before the current video frame, where N is a positive integer; optionally, N is 5, 10, 15, etc.
  • the video encoder needs to maintain a quantization parameter value cache, and the average value in the quantization parameter value cache is the second quantization parameter value, which is used to indicate the content complexity of recent video content.
  • the video encoding end maintains a quantization parameter value cache with a length of N, where N is a positive integer, for example, N is 10.
  • Step 226 Determine parameter change information based on the first quantization parameter value and the second quantization parameter value.
  • the video encoding end compares the first quantization parameter value and the second quantization parameter value to determine parameter change information.
  • the above step 226 includes: determining a parameter difference between the first quantization parameter value and the second quantization parameter value; and determining parameter change information based on the parameter difference.
  • determining the parameter change information based on the parameter difference includes: when the absolute value of the parameter difference is greater than a second threshold, determining that the parameter change information indicates that the video content is complex; when the absolute value of the parameter difference is smaller than the second threshold, determining that the parameter change information indicates that the video content is simple.
  • the second threshold is a positive number.
  • the first quantization parameter value is $QP$, and the second quantization parameter value is $QP_{av}$, where $QP_{av} = \mathrm{avg}(\mathrm{buffer\_QP})$, buffer_QP is the quantization parameter value cache maintained by the video encoder, and $\mathrm{avg}(\cdot)$ represents the averaging function; the parameter difference between the first quantization parameter value and the second quantization parameter value is $QP - QP_{av}$.
  • with the second threshold denoted Th2, the parameter change information has the following situations:
  • when $|QP - QP_{av}| < \mathrm{Th2}$, the parameter change information indicates that the video content is simple;
  • when $|QP - QP_{av}| > \mathrm{Th2}$, the parameter change information indicates that the video content is complex.
  • when $|QP - QP_{av}| = \mathrm{Th2}$, the parameter change information may be set to indicate that the video content is simple, or may be set to indicate that the video content is complex. It should be understood that all these should fall within the protection scope of the present application.
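  • The comparison in step 226 can be sketched in the same spirit. The function name `classify_content` and the label strings are hypothetical, and the equality case is mapped to "simple" here, which is one of the two choices the text allows.

```python
from collections import deque


def classify_content(qp, buffer_qp, th2):
    """Compare the previous frame's QP with the mean of the QP cache.

    `buffer_qp` must be non-empty. Returns 'complex' when the absolute
    parameter difference exceeds th2, otherwise 'simple'.
    """
    qp_av = sum(buffer_qp) / len(buffer_qp)   # second quantization parameter value
    return "complex" if abs(qp - qp_av) > th2 else "simple"


# A cache of length N = 10, as in the example given in the text.
buffer_qp = deque([30, 31, 29, 30, 30], maxlen=10)
print(classify_content(36, buffer_qp, th2=3))
```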
  • the above step 230 includes any one of the following steps (232-238).
  • Step 232, when the code rate change information includes a sudden increase in the code rate and the parameter change information indicates that the video content is simple, gradually increase the real-time consumption code rate within the target D frame interval; if, after the target D frame interval has elapsed, the real-time consumption code rate is greater than the real-time target code rate, determine the original resolution of the current video frame as the target resolution; if, after the target D frame interval has elapsed, the real-time consumption code rate is less than the real-time target code rate, increase the original resolution of the current video frame to obtain the target resolution.
  • Because the human eye is more sensitive to changes in picture quality in simple scenes, when the parameter change information indicates that the video content is simple and the bit rate change information includes a sudden increase in the bit rate, overly aggressive resolution control, which would cause the picture quality of the video to rise sharply, should be avoided. Therefore, in the embodiment of the present application, for the scene where the video content is simple and the code rate increases suddenly, the code rate control is first adopted, and then whether to adopt the resolution control is determined based on the result of the code rate control.
  • the target D frame is the next D frame, and the value of D may be 15. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, and then continue to use resolution control, that is, increase the original resolution of the current video frame to obtain the target resolution.
  • the method of increasing the bit rate in the D frame interval improves the picture quality of the video, and can realize a relatively smooth transition.
  • when the code rate change information includes a sudden increase in the code rate, the real-time target code rate is at a high level while the real-time consumed code rate is at a low level, so a waste of code rate may occur. In this case, the video encoding end fills in redundant error correction packets, for example, by resending some data packets of key frames, so as to enhance the error correction capability of key frames.
  • Step 234, when the code rate change information includes a sudden drop in the code rate and the parameter change information indicates that the video content is simple, gradually reduce the real-time consumption code rate within the target D frame interval; if, after the target D frame interval has elapsed, the real-time consumption code rate is smaller than the real-time target code rate, determine the original resolution of the current video frame as the target resolution; if, after the target D frame interval has elapsed, the real-time consumption code rate is greater than the real-time target code rate, reduce the original resolution of the current video frame to obtain the target resolution.
  • Because the human eye is more sensitive to changes in picture quality in simple scenes, when the parameter change information indicates that the video content is simple and the bit rate change information includes a sudden drop in the bit rate, overly aggressive resolution control, which would cause the picture quality of the video to drop sharply, should be avoided. Therefore, in the embodiment of the present application, for the scene where the video content is simple and the code rate drops suddenly, the code rate control is first adopted, and then whether to adopt the resolution control is determined based on the result of the code rate control.
  • the target D frame is the next D frame, and the value of D may be 15. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, then continue to use resolution control, that is, reduce the original resolution of the current video frame to obtain the target resolution.
  • the method of reducing the bit rate in the D frame interval reduces the picture quality of the video, and can achieve a relatively smooth transition.
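  • The "rate control first, resolution control second" behaviour of steps 232 and 234 can be sketched as follows. The text does not say how the consumption code rate is changed "gradually"; the linear ramp below is an assumption, as are the function names `ramp_setpoints` and `resolution_action`, the label strings, and treating equality as "keep".

```python
def ramp_setpoints(current_rate, target_rate, d=15):
    """Per-frame code rate set-points moving linearly to the target over d frames."""
    step = (target_rate - current_rate) / d
    return [current_rate + step * (k + 1) for k in range(d)]


def resolution_action(direction, consumed_rate, target_rate):
    """Decide what to do once the D-frame interval has elapsed.

    direction: 'increase' (step 232) or 'drop' (step 234).
    Returns 'keep' (original resolution), 'raise' or 'lower'.
    """
    if direction == "increase":
        # Consumption still below the target: rate control alone was not enough.
        return "raise" if consumed_rate < target_rate else "keep"
    if direction == "drop":
        # Consumption still above the target: rate control alone was not enough.
        return "lower" if consumed_rate > target_rate else "keep"
    raise ValueError("direction must be 'increase' or 'drop'")


setpoints = ramp_setpoints(1000, 400, d=15)
print(setpoints[0], setpoints[-1])
print(resolution_action("drop", consumed_rate=520, target_rate=400))
```

  • In practice `consumed_rate` would be measured from the encoder output after the interval, not taken from the set-points, since the encoder may be unable to follow them.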
  • Step 236, when the code rate change information includes a sudden increase in the code rate and the parameter change information indicates that the video content is complex, gradually increase the real-time consumption code rate within the target D frame interval; if, after the target D frame interval has elapsed, the real-time consumption code rate is greater than the real-time target code rate, determine the original resolution of the current video frame as the target resolution; if, after the target D frame interval has elapsed, the real-time consumption code rate is less than the real-time target code rate, increase the original resolution of the current video frame to obtain the target resolution.
  • the code rate control is firstly used to improve the picture quality of the video frame by frame, and then it is determined whether to adopt the resolution control based on the result of the code rate control.
  • the real-time consumption bit rate is gradually increased within the target D frame interval.
  • the target D frame is the next D frame, and the value of D may be 15. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, and then continue to use resolution control, that is, increase the original resolution of the current video frame to obtain the target resolution.
  • the method of increasing the bit rate in the D frame interval improves the picture quality of the video, and can realize a relatively smooth transition.
  • Step 238 if the code rate change information includes a sudden drop in the code rate, and the parameter change information indicates that the video content is complex, reduce the original resolution of the current video frame to obtain the target resolution.
  • because the parameter change information indicates that the video content is complex, the actual quantization parameter after encoding the historical video frames is relatively large; and because the bit rate change information includes a sudden drop in the bit rate, the real-time target bit rate is low and the quantization parameter value is large. Therefore, for scenarios where the video content is complex and the bit rate drops suddenly, it is difficult to adapt to the current network conditions by adjusting the bit rate.
  • the embodiment of the present application directly adjusts the resolution, that is, reduces the original resolution of the current video frame to obtain the target resolution to ensure the followability of the code rate.
  • the internal code rate control scheme of the encoder can be directly adopted without resolution control, so that the video encoding end directly determines the original resolution of the current video frame as the target resolution.
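  • Putting steps 232 to 238 together, the choice of control scheme can be sketched as a small lookup. The function name `choose_strategy` and all label strings are hypothetical; mapping the "no drastic change" case to keeping the original resolution follows the description of the encoder's internal code rate control.

```python
def choose_strategy(rate_change, content):
    """Map (code rate change, content complexity) to a control scheme.

    rate_change: 'stable', 'sudden_increase' or 'sudden_drop'
    content:     'simple' or 'complex'
    """
    if rate_change == "stable":
        # No drastic change: rely on the encoder's internal rate control.
        return "keep_original_resolution"
    if rate_change == "sudden_drop" and content == "complex":
        # Step 238: rate control cannot follow, so lower the resolution at once.
        return "lower_resolution_directly"
    # Steps 232, 234 and 236: ramp the code rate over D frames first,
    # then fall back to resolution control only if that is not enough.
    return "rate_control_then_resolution"


for rc in ("sudden_increase", "sudden_drop"):
    for c in ("simple", "complex"):
        print(rc, c, "->", choose_strategy(rc, c))
```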
  • the technical solutions provided by the embodiments of the present application classify resolution adjustment into multiple types by combining fluctuations in network quality and complexity of video content.
  • when the video content is simple, or the video content is complex and the bit rate rises sharply, the bit rate control method is first used to smooth the picture quality changes of the video.
  • the resolution control method is adopted only when bit rate control cannot adapt to the current network conditions, which effectively avoids sharp changes in the video picture quality.
  • when the video content is complex and the bit rate drops sharply, the resolution is adjusted directly to ensure the followability of the bit rate.
  • the embodiment of the present application improves the video playback effect or the video call effect through the above refined resolution control scheme, and brings better video playback or video call experience to the user.
  • the quantization parameter value of the previous video frame of the current video frame is used as the measure of the content complexity of the current video frame, which is suitable for the case where the difference between consecutive video frames is small. Based on this, it is also necessary to determine whether the scene changes before adopting the resolution control solution introduced in the above embodiment.
  • the above method further includes the following steps.
  • Step 250 detecting the scene change of the current video content relative to the historical video content.
  • the scene change of the current video content relative to the historical video content can be detected first, so as to avoid errors and adverse effects caused by the scene change on the resolution adjustment.
  • the embodiment of the present application does not limit the scene change detection method.
  • the scene change detection method includes: a key block matching algorithm, a full frame matching algorithm, a downsampling frame matching algorithm, and the like.
  • keyblock matching algorithms have low computational complexity. In the following, the process of applying the key block matching algorithm to detect scene changes will be described.
  • the above step 250 includes the following steps (252-256).
  • Step 252 determine at least one key block from the current video frame.
  • the video encoding end determines a key blocks from the current video frame, where a is a positive integer, such as 4, 5, 6, and so on.
  • the sizes of the at least one key block are the same, for example, all 16×16 pixels, or 8×8 pixels, or 32×32 pixels, or 4×4 pixels, or 64×64 pixels; or, the sizes of the at least one key block are not exactly the same, for example, the size of one key block is 16×16 pixels, and the size of another key block is 32×32 pixels, etc.
  • at least one key block is evenly distributed in the current video frame, or at least one key block is concentrated in a certain area in the current video frame, for example, at least one key block is concentrated in the middle area of the current video frame.
  • the video encoding end determines 5 key blocks from the current video frame, the 5 key blocks are evenly distributed in the current video frame, and the sizes of the 5 key blocks are all 16×16 pixels.
  • a total of 5 squares with a size of 16×16 pixels are selected as key blocks at the center of the current video frame and its surroundings.
  • the key block at the center of the current video frame is 310, and the key blocks around the center are 320.
  • the horizontal and vertical offsets of each key block 320 from the key block 310 are both 64 pixels, so that the five key blocks are evenly distributed in the current video frame as shown in FIG. 3 .
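  • One possible reading of this five-block layout is sketched below. FIG. 3 is not available here, so placing the four surrounding blocks on the diagonals, measuring the 64-pixel offsets between top-left corners, and the function name `key_block_origins` are all assumptions.

```python
def key_block_origins(width, height, block=16, offset=64):
    """Top-left corners of one centre key block and four surrounding ones."""
    cx = (width - block) // 2    # top-left x of the centred block
    cy = (height - block) // 2   # top-left y of the centred block
    origins = [(cx, cy)]
    for dx in (-offset, offset):
        for dy in (-offset, offset):
            # Each surrounding block is shifted by `offset` pixels both
            # horizontally and vertically from the centre block.
            origins.append((cx + dx, cy + dy))
    return origins


print(key_block_origins(640, 360))
```

  • For frames smaller than roughly 2 × offset + block in either dimension, some of these positions would fall outside the frame, so a real implementation would need to clamp or shrink the offsets.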
  • Step 254 For the first key block in the at least one key block, search for a matching block with the target radius as the search radius at the position corresponding to the first key block in the previous video frame of the current video frame.
  • the key blocks are matched.
  • key block matching is performed between the current video frame and the previous video frame of the current video frame, so that, for each key block in at least one key block, it is necessary to select from the previous video frame of the current video frame Do a matching block search.
  • For the first key block in the at least one key block, it is first necessary to determine the corresponding position of the first key block in the previous video frame of the current video frame, and then to search for a matching block around the corresponding position, using the target radius r as the search radius.
  • the embodiment of the present application does not limit the size of the target radius.
  • the target radius is 1 pixel, or 2 pixels, or 4 pixels, or 8 pixels.
  • Step 256 if there is a block whose mean square error with the first key block is smaller than the target threshold among the searched blocks, determine that the first key block has a matching block in the previous video frame of the current video frame.
  • the block is determined as a matching block of the first key block, that is, a matching block of the first key block exists in the previous video frame of the current video frame.
  • the video encoding end can also use MAD (Mean Absolute Deviation), SAD (Sum of Absolute Differences), etc. to perform the matching block search; the embodiment of the present application does not limit the matching criteria used in the matching block search.
  • the size of the matching block and the key block must be the same. For example, assuming that the size of the first key block is 16×16 pixels, the size of the matching block of the first key block is also 16×16 pixels.
  • the mean square error MSE between the searched blocks and key blocks within the search range is:
  • N is the size of the key block, such as N is 16;
  • $C_{ij}$ is the pixel value corresponding to the i-th row and j-th column in the key block of the current video frame;
  • a target threshold Th0 is set, and when the MSE is smaller than Th0, the searched block is considered to be a matching block of the key block.
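  • Assuming the standard definition of mean square error over an N×N block, the MSE between a key block and a searched block can be written as follows. The symbol $P_{ij}$, denoting the pixel value in the i-th row and j-th column of the searched block in the previous video frame, is introduced here for illustration only.

$$
\mathrm{MSE} = \frac{1}{N^{2}} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( C_{ij} - P_{ij} \right)^{2}
$$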
  • when every key block in the at least one key block has a matching block in the previous video frame of the current video frame, the current video frame is considered similar to the previous video frame, that is, it is determined that the current video frame has no scene change relative to the historical video frame; and if some key block in the at least one key block has no matching block in the previous video frame of the current video frame, it is determined that the current video frame has a scene change relative to the historical video frame.
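  • Steps 252 to 256 and the decision rule above can be sketched in plain Python as follows. Frames are assumed to be 2-D lists of luma values indexed as `frame[row][column]`; the function names, the square search window of ±radius pixels, and the default values of `radius` and `th0` are illustrative assumptions.

```python
def block_mse(cur, prev, x, y, px, py, n):
    """MSE between the n-by-n block of `cur` at (x, y) and of `prev` at (px, py)."""
    total = 0
    for i in range(n):
        for j in range(n):
            d = cur[y + i][x + j] - prev[py + i][px + j]
            total += d * d
    return total / (n * n)


def has_match(cur, prev, x, y, n, radius, th0):
    """Search `prev` around (x, y) for a block whose MSE is below th0."""
    h, w = len(prev), len(prev[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = x + dx, y + dy
            # Skip candidate blocks that would extend outside the frame.
            if 0 <= px <= w - n and 0 <= py <= h - n:
                if block_mse(cur, prev, x, y, px, py, n) < th0:
                    return True
    return False


def scene_changed(cur, prev, origins, n=16, radius=2, th0=100.0):
    """A scene change is reported if any key block has no matching block."""
    return any(not has_match(cur, prev, x, y, n, radius, th0) for x, y in origins)


# A 32x32 test pattern, and a copy shifted right by one pixel.
prev = [[(x * 7 + y * 13) % 256 for x in range(32)] for y in range(32)]
cur = [[prev[y][max(x - 1, 0)] for x in range(32)] for y in range(32)]
print(scene_changed(cur, prev, [(8, 8)], n=8, radius=1))
```

  • The early return in `has_match` stops the search at the first candidate under the threshold, which matches the rule that a single qualifying block is enough.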
  • the above step 230 is implemented as: determining the target resolution based on bit rate change information and parameter change information when the current video content has no scene change relative to the historical video content. That is, when it is detected that the scene does not change, the target resolution is determined by using the resolution control solution provided by the above embodiment, taking into account fluctuations in network quality and complexity of video content.
  • after the above step 256, the method further includes: when the current video content has a scene change relative to the historical video content, determining the resolution corresponding to the real-time target bit rate in the target correspondence as the target resolution; wherein the target correspondence includes at least one set of correspondences between target bit rates and resolutions. That is, when a scene change is detected, the code table scheme is used to determine the target resolution: the video encoding end determines the resolution corresponding to the real-time target bit rate from the target correspondence and takes this resolution as the target resolution.
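  • A code table lookup of this kind can be sketched as follows. The table entries are invented for illustration and are not taken from the application; the name `lookup_resolution` and the choice of "largest entry whose minimum bit rate does not exceed the target" are likewise assumptions.

```python
import bisect

# Hypothetical code table: (minimum target bit rate in kbps, (width, height)).
CODE_TABLE = [
    (0, (320, 180)),
    (300, (640, 360)),
    (800, (960, 540)),
    (1500, (1280, 720)),
]


def lookup_resolution(target_kbps, table=CODE_TABLE):
    """Return the resolution whose bit rate band contains target_kbps."""
    thresholds = [t for t, _ in table]
    # Index of the last entry whose threshold is <= target_kbps.
    idx = bisect.bisect_right(thresholds, target_kbps) - 1
    return table[max(idx, 0)][1]


print(lookup_resolution(900))
```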
  • the technical solution provided by the embodiment of the present application detects the scene change of the current video content relative to the historical video content, and determines the target resolution by combining the fluctuation of the network quality and the complexity of the video content only when no scene change is detected, which effectively avoids the impact on resolution adjustment of bit rate fluctuations caused by scene changes and improves the accuracy of resolution adjustment.
  • FIG. 4 shows a flowchart of a video encoding method provided by an embodiment of the present application. This method can be applied to the video encoding end 110 in the instant messaging scene shown in FIG. 1 . As shown in Fig. 4, the method includes the following steps (401 to 412).
  • Step 401 acquire video frame data.
  • the video capture module at the video encoding end can call the camera module to capture video frames at the target frame rate, and transmit the captured video frame data to the audio and video processing module at the video encoding end for processing.
  • Step 402 acquiring network quality feedback.
  • the audio and video processing module at the video encoding end can use the bandwidth estimation method to determine the real-time target code rate R, which is used to indicate the current network conditions.
  • this step maintains a code rate buffer buffer_R, which may hold code rates over a window on the order of seconds, for example 5 seconds.
  • the average value $R_{av}$ of the code rate buffer is the historical average code rate, which is used to indicate the historical network conditions.
  • Step 403 acquire video content feedback.
  • the quantization parameter value is used to indicate the content complexity of the video content.
  • the video encoding end obtains the quantization parameter value of the previous video frame of the current video frame, that is, the first quantization parameter value $QP$.
  • the video encoding end establishes and maintains a quantization parameter value buffer buffer_QP with a length of N.
  • the value of N is 10.
  • the average value of the quantization parameter value buffer buffer_QP is the second quantization parameter value $QP_{av}$, which is used to indicate the content complexity of recent video content.
  • Step 404 detecting whether the scene changes.
  • the content complexity of the video content may change greatly, which may cause large bit rate fluctuations. Therefore, the scene change of the current video content relative to the historical video content is detected first, so as to avoid errors and adverse effects caused by the scene change on the resolution adjustment.
  • the video encoding end establishes a frame-level buffer buffer_pre, which is used to store the data of the previous video frame of the current video frame, so that each time the video encoding end receives the real-time current video frame, it performs key block matching between the current video frame and the previous video frame to detect whether the current video content has a scene change relative to the historical video content. If there is no scene change, execute the following step 405; if there is a scene change, execute the following step 411.
  • Step 405 detecting whether the network quality fluctuates.
  • the video encoding end compares the real-time target bit rate $R$ obtained in the above step 402 with the historical average bit rate $R_{av}$ to determine whether the network quality fluctuates greatly. If the absolute value of the code rate difference between the real-time target code rate $R$ and the historical average code rate $R_{av}$ is greater than the first threshold Th1, it is determined that the network quality fluctuates greatly and the code rate changes sharply, and the following step 406 is executed.
  • If the absolute value of the code rate difference between the real-time target code rate $R$ and the historical average code rate $R_{av}$ is less than or equal to the first threshold Th1, it is determined that the network quality fluctuation is small, and the original resolution of the current video frame is maintained, that is, the original resolution of the current video frame is used as the target resolution, and the following step 412 is performed.
  • Step 406 determine the complexity of the content.
  • the video encoder compares the first quantization parameter value $QP$ and the second quantization parameter value $QP_{av}$ obtained in step 403 above to determine the content complexity of the video content. If the absolute value of the parameter difference between the first quantization parameter value $QP$ and the second quantization parameter value $QP_{av}$ is greater than the second threshold Th2, it indicates that the video content is complex; if the absolute value of the parameter difference between the first quantization parameter value $QP$ and the second quantization parameter value $QP_{av}$ is smaller than the second threshold Th2, it indicates that the video content is simple.
  • Step 407 execute the control scheme in the scenario where the video content is simple and the code rate increases suddenly. At this point, gradually increase the real-time consumption bit rate within the target D frame interval. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, and then continue to use resolution control, that is, increase the original resolution of the current video frame to obtain the target resolution.
  • Step 408 execute the control scheme in the scenario where the video content is simple and the code rate drops suddenly. At this time, gradually reduce the real-time consumption bit rate within the target D frame interval. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, then continue to use resolution control, that is, reduce the original resolution of the current video frame to obtain the target resolution.
  • Step 409 execute the control scheme in the scene where the video content is complex and the code rate increases suddenly. At this point, gradually increase the real-time consumption bit rate within the target D frame interval. If the real-time consumption bit rate is greater than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate regulation can adapt to the current network conditions. At this time, the original resolution of the current video frame is determined as the target resolution. If the real-time consumption bit rate is lower than the real-time target bit rate after exceeding the target D frame interval, it means that the bit rate control still cannot adapt to the current network conditions, and then continue to use resolution control, that is, increase the original resolution of the current video frame to obtain the target resolution.
  • Step 410 execute the control scheme in the scenario where the video content is complex and the code rate drops suddenly.
  • the resolution is directly regulated, that is, the original resolution of the current video frame is reduced to obtain the target resolution.
  • Step 411: regulate the resolution using the code table scheme.
  • The control schemes in steps 405 to 410 above are no longer applicable.
  • The code table scheme is adopted to select the resolution corresponding to the real-time target code rate as the target resolution.
  • This step also resets to zero the code rate buffer buffer_R maintained in step 402 above and the quantization parameter value buffer buffer_QP maintained in step 403 above.
  • Step 412: encode the current video frame at the target resolution.
  • The video encoding end scales the current video frame based on the target resolution; once the resolution of the current video frame matches the target resolution, the frame is input to the encoder for video encoding.
  • The video encoding end updates the frame-level buffer buffer_pre maintained in step 404 above.
  • FIG. 5 shows a block diagram of a video coding apparatus provided by an embodiment of the present application.
  • The apparatus has the function of implementing the above video encoding method examples; the function may be implemented by hardware, or by hardware executing corresponding software.
  • The apparatus may be the above-mentioned video encoding end, or may be provided in the video encoding end.
  • The apparatus 500 may include: a code rate information acquisition module 510, a parameter information acquisition module 520, a resolution determination module 530, and a video encoding module 540.
  • The code rate information acquisition module 510 is configured to acquire code rate change information corresponding to the current network condition, where the code rate change information indicates the fluctuation of the current network condition relative to the historical network condition.
  • The parameter information acquisition module 520 is configured to acquire parameter change information corresponding to the current video content, where the parameter change information indicates the content complexity of the current video content.
  • The resolution determination module 530 is configured to determine a target resolution based on the code rate change information and the parameter change information.
  • The video encoding module 540 is configured to encode the current video frame at the target resolution.
  • The code rate information acquisition module 510 is configured to: determine a real-time target code rate, which indicates the current network condition; acquire a historical average code rate, which indicates the historical network condition; and determine the code rate change information based on the real-time target code rate and the historical average code rate.
  • Determining the code rate change information based on the real-time target code rate and the historical average code rate includes: determining the code rate difference between the real-time target code rate and the historical average code rate; when the absolute value of the code rate difference is greater than a first threshold and the code rate difference is positive, determining that the code rate change information includes a drastic change in the code rate in the form of a sudden increase; when the absolute value of the code rate difference is greater than the first threshold and the code rate difference is negative, determining that the code rate change information includes a drastic change in the code rate in the form of a sudden drop; and when the absolute value of the code rate difference is smaller than the first threshold, determining that the code rate change information includes that the code rate has not changed drastically.
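  • The threshold test above can be sketched as follows. This is an illustrative sketch only: the names `RateChange` and `classify_rate_change` are hypothetical, the difference is taken as target minus historical average (an assumption consistent with a positive difference meaning a sudden increase), the historical average is computed as a plain mean of buffered values, and a difference exactly equal to the threshold is treated as "no drastic change" because the text does not cover that case.

```python
from enum import Enum


class RateChange(Enum):
    STABLE = "no drastic change"
    SUDDEN_INCREASE = "sudden increase"
    SUDDEN_DROP = "sudden drop"


def classify_rate_change(target_rate, history_rates, first_threshold):
    """Compare the real-time target code rate with the historical average."""
    historical_average = sum(history_rates) / len(history_rates)
    # Sign convention (assumption): positive means the target went up.
    difference = target_rate - historical_average
    if abs(difference) > first_threshold:
        if difference > 0:
            return RateChange.SUDDEN_INCREASE
        return RateChange.SUDDEN_DROP
    return RateChange.STABLE
```

  • With a history of 800, 1000 and 1200 (average 1000) and a first threshold of 300, a target of 1500 is classified as a sudden increase, a target of 500 as a sudden drop, and a target of 1100 as stable.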
  • The parameter information acquisition module 520 is configured to: acquire a first quantization parameter value, which is the quantization parameter value of the video frame preceding the current video frame; acquire a second quantization parameter value, which is the mean of the quantization parameter values of at least one historical video frame; and determine the parameter change information based on the first quantization parameter value and the second quantization parameter value.
  • Determining the parameter change information based on the first quantization parameter value and the second quantization parameter value includes: determining the parameter difference between the first quantization parameter value and the second quantization parameter value; when the absolute value of the parameter difference is greater than a second threshold, determining that the parameter change information indicates that the video content is complex; and when the absolute value of the parameter difference is smaller than the second threshold, determining that the parameter change information indicates that the video content is simple.
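  • The quantization parameter comparison can be sketched in the same way. The function name `classify_content` and the string labels are hypothetical, and a difference exactly equal to the second threshold is treated as "simple" because the text leaves that case open.

```python
def classify_content(previous_frame_qp, history_qps, second_threshold):
    """Classify content complexity from quantization parameter (QP) values.

    previous_frame_qp: QP of the frame preceding the current frame.
    history_qps: QPs of at least one historical frame.
    """
    mean_qp = sum(history_qps) / len(history_qps)
    difference = previous_frame_qp - mean_qp
    # Only the magnitude of the difference matters for this decision.
    return "complex" if abs(difference) > second_threshold else "simple"
```

  • With historical QP values of 28, 30 and 32 (mean 30) and a second threshold of 5, a previous-frame QP of 40 is classified as complex and a previous-frame QP of 31 as simple.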
  • The resolution determination module 530 is configured to: when the code rate change information includes a sudden increase in the code rate and the parameter change information indicates that the video content is simple, gradually increase the real-time consumption code rate within the target D-frame interval; if the real-time consumption code rate is greater than the real-time target code rate after the target D-frame interval has elapsed, determine the original resolution of the current video frame as the target resolution; if the real-time consumption code rate is less than the real-time target code rate after the target D-frame interval has elapsed, increase the original resolution of the current video frame to obtain the target resolution; when the code rate change information includes a sudden drop in the code rate and the parameter change information indicates that the video content is simple, gradually reduce the real-time consumption code rate within the target D-frame interval; if the real-time consumption code rate is less than the real-time target code rate after the target D-frame interval has elapsed, determine the original resolution of the current video frame as the target resolution; if the real-time consumption code rate is greater than the real-time target code rate after exceeding the target D frame
  • The resolution determination module 530 is further configured to: determine the original resolution of the current video frame as the target resolution when the code rate change information includes that the code rate has not changed drastically.
  • The video encoding module 540 is configured to: when the target resolution is greater than the original resolution of the current video frame, upsample the current video frame and, once the resolution of the upsampled frame matches the target resolution, encode the upsampled frame at the target resolution; and when the target resolution is smaller than the original resolution of the current video frame, downsample the current video frame and, once the resolution of the downsampled frame matches the target resolution, encode the downsampled frame at the target resolution.
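  • The scaling step before encoding can be illustrated with a toy example. This is a sketch under assumptions: the frame is a plain 2D list of samples, nearest-neighbour resampling stands in for whatever scaling filter a real encoder pipeline uses, resolutions are compared by pixel count, and the names `resize_nearest` and `prepare_frame` are hypothetical.

```python
def resize_nearest(frame, target_w, target_h):
    """Nearest-neighbour resize of a 2D list of samples (assumed filter)."""
    src_h, src_w = len(frame), len(frame[0])
    return [
        [frame[y * src_h // target_h][x * src_w // target_w]
         for x in range(target_w)]
        for y in range(target_h)
    ]


def prepare_frame(frame, target_w, target_h):
    """Scale the frame to the target resolution before encoding.

    Returns (operation, frame), where operation is 'upsample',
    'downsample' or 'none'.
    """
    src_h, src_w = len(frame), len(frame[0])
    if (src_w, src_h) == (target_w, target_h):
        return "none", frame
    # Comparing by pixel count is an assumption of this sketch.
    if target_w * target_h > src_w * src_h:
        operation = "upsample"
    else:
        operation = "downsample"
    scaled = resize_nearest(frame, target_w, target_h)
    # The frame goes to the encoder only once its size matches the target.
    assert (len(scaled[0]), len(scaled)) == (target_w, target_h)
    return operation, scaled
```

  • In a real pipeline the returned frame would then be passed to the encoder; the encoder call is omitted here because the text does not name a specific encoder API.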
  • The apparatus 500 further includes a scene change detection module 550, configured to detect a scene change of the current video content relative to historical video content; the resolution determination module 530 is configured to determine the target resolution based on the code rate change information and the parameter change information when the current video content has no scene change relative to the historical video content.
  • The scene change detection module 550 is configured to: determine at least one key block from the current video frame; for a first key block of the at least one key block, search for a matching block around the position corresponding to the first key block in the video frame preceding the current video frame, using the target radius as the search radius; if a searched block whose mean square error relative to the first key block is less than the target threshold exists, determine that the first key block has a matching block in the preceding video frame; and, when none of the at least one key block has a matching block in the preceding video frame, determine that a scene change occurs in the current video content relative to the historical video content.
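  • The per-block search described above can be sketched as an exhaustive block-matching loop. This is an illustrative sketch only: frames are 2D lists of luma samples, the key block is square, the search window is a square of side 2 × radius + 1 around the co-located position, and the names `mse`, `crop` and `has_matching_block` are hypothetical. How key blocks are selected is not covered here.

```python
def mse(block_a, block_b):
    """Mean square error between two equally sized 2D blocks."""
    count = len(block_a) * len(block_a[0])
    total = sum(
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )
    return total / count


def crop(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def has_matching_block(current, previous, top, left, size, radius, threshold):
    """Search the previous frame around (top, left) for a matching block."""
    key_block = crop(current, top, left, size)
    height, width = len(previous), len(previous[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the previous frame.
            if 0 <= y <= height - size and 0 <= x <= width - size:
                if mse(key_block, crop(previous, y, x, size)) < threshold:
                    return True
    return False
```

  • Production encoders typically use faster search patterns than this exhaustive scan; the brute-force loop is used here only because it is the simplest correct instance of a radius-limited search.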
  • The resolution determination module 530 is further configured to: when a scene change occurs in the current video content relative to the historical video content, determine the resolution corresponding to the real-time target code rate in a target correspondence as the target resolution, where the target correspondence includes at least one set of correspondences between target code rates and resolutions.
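  • A target correspondence of this kind can be sketched as a small lookup table. The table values, the kbps unit, the "minimum code rate per resolution" interpretation, and the names `CODE_TABLE` and `lookup_resolution` are all hypothetical; the text only states that the correspondence maps target code rates to resolutions.

```python
# (minimum real-time target code rate in kbps, (width, height)).
# Hypothetical values, sorted by ascending code rate.
CODE_TABLE = [
    (0, (320, 180)),
    (400, (640, 360)),
    (1000, (1280, 720)),
]


def lookup_resolution(target_rate_kbps, table=CODE_TABLE):
    """Pick the highest resolution whose minimum code rate is satisfied."""
    chosen = table[0][1]
    for minimum_rate, resolution in table:
        if target_rate_kbps >= minimum_rate:
            chosen = resolution
    return chosen
```

  • For example, with this table `lookup_resolution(500)` selects 640×360 and `lookup_resolution(1500)` selects 1280×720.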
  • the resolution adopted for encoding the current video frame is determined, so that the resolution is adjusted more accurately. Because human eyes are more sensitive to changes in picture quality in simple scenes, this application takes the complexity of the video content into account when adjusting the resolution and considers the sensitivity of human eyes to video content of different complexity, which makes resolution regulation more accurate and comprehensive. Moreover, this application considers both the fluctuation of network quality and the complexity of the video content when adjusting the resolution. When the network quality fluctuates too frequently, frequent switching of the resolution is effectively avoided, which helps to reduce the breathing effect caused by frequent fluctuations in network quality and improves the user's video viewing and calling experience.
  • When the apparatus provided by the above embodiments implements its functions, the division into the above functional modules is used only as an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • The apparatus embodiments and the method embodiments provided above belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
  • A computer device includes a processor and a memory; a computer program is stored in the memory and is loaded and executed by the processor to implement the above video encoding method.
  • A computer-readable storage medium stores a computer program which, when executed by a processor, implements the above video encoding method.
  • A computer program product is also provided which, when run on a computer device, causes the computer device to execute the above video encoding method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a video encoding method and apparatus, a device, and a storage medium, in the technical field of computers. The method comprises: acquiring code rate change information corresponding to a current network condition, the code rate change information being used to indicate a fluctuation of the current network condition relative to a historical network condition (210); acquiring parameter change information corresponding to the current video content, the parameter change information being used to indicate the content complexity of the current video content (220); determining a target resolution based on the code rate change information and the parameter change information (230); and encoding a current video frame using the target resolution (240). The fluctuation of network quality and the complexity of the video content are taken into account when regulating the resolution, so that when the network quality fluctuates too frequently, frequent switching of the resolution is effectively avoided, which helps to reduce the breathing effect caused by frequent fluctuation of the network quality and improves users' video viewing and calling experience.
PCT/CN2022/100805 2021-07-09 2022-06-23 Video encoding method and apparatus, device, and storage medium WO2023279978A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110779065.0 2021-07-09
CN202110779065.0A CN113573101B (zh) 2021-07-09 2021-07-09 Video encoding method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023279978A1 true WO2023279978A1 (fr) 2023-01-12

Family

ID=78164256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100805 WO2023279978A1 (fr) 2021-07-09 2022-06-23 Video encoding method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN113573101B (fr)
WO (1) WO2023279978A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
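CN113573101B (zh) * 2021-07-09 2023-11-28 百果园技术(新加坡)有限公司 Video encoding method, apparatus, device, and storage medium
CN115225928B (zh) * 2022-05-11 2023-07-25 北京广播电视台 Multi-type audio and video mixed broadcasting system and method
CN115396729B (zh) * 2022-08-26 2023-12-08 百果园技术(新加坡)有限公司 Video target frame determination method, apparatus, device, and storage medium
CN117014608A (zh) * 2022-09-07 2023-11-07 腾讯科技(深圳)有限公司 Video stream bit rate adjustment method, apparatus, computer device, and storage medium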

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
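CN1285691A (zh) * 1999-08-20 2001-02-28 三星电子株式会社 Apparatus for adaptively controlling data transmission rate according to network bandwidth
CN101345867A (zh) * 2008-08-22 2009-01-14 四川长虹电器股份有限公司 Bit rate control method based on frame complexity
CN103974060A (zh) * 2013-01-31 2014-08-06 华为技术有限公司 Video quality adjustment method and apparatus
CN105959700A (zh) * 2016-05-31 2016-09-21 腾讯科技(深圳)有限公司 Method and apparatus for video image encoding
CN109561310A (zh) * 2017-09-26 2019-04-02 腾讯科技(深圳)有限公司 Video encoding processing method, apparatus, device, and storage medium
CN110650370A (zh) * 2019-10-18 2020-01-03 北京达佳互联信息技术有限公司 Video encoding parameter determination method, apparatus, electronic device, and storage medium
US20210127180A1 (en) * 2017-05-25 2021-04-29 Samsung Electronics Co., Ltd. Methods and systems for saving data while streaming video
CN113573101A (zh) * 2021-07-09 2021-10-29 百果园技术(新加坡)有限公司 Video encoding method, apparatus, device, and storage medium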

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105635734B (zh) * 2014-11-03 2019-04-12 掌赢信息科技(上海)有限公司 Adaptive video encoding method and apparatus based on video call scenarios
US20180063549A1 (en) * 2016-08-24 2018-03-01 Ati Technologies Ulc System and method for dynamically changing resolution based on content
CN107659827A (zh) * 2017-09-25 2018-02-02 北京小鱼易连科技有限公司 Desktop video encoding control system based on content analysis

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
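CN116886880A (zh) * 2023-09-08 2023-10-13 中移(杭州)信息技术有限公司 Surveillance video adjustment method, apparatus, device, and computer program product
CN116886880B (zh) * 2023-09-08 2023-12-26 中移(杭州)信息技术有限公司 Surveillance video adjustment method, apparatus, device, and computer program product
CN117440209A (zh) * 2023-12-15 2024-01-23 牡丹江师范学院 Implementation method and system based on a singing scene
CN117440209B (zh) * 2023-12-15 2024-03-01 牡丹江师范学院 Implementation method and system based on a singing scene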

Also Published As

Publication number Publication date
CN113573101A (zh) 2021-10-29
CN113573101B (zh) 2023-11-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22836732

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE