WO2023142716A1 - Encoding method, real-time communication method, apparatus, device and storage medium - Google Patents

Encoding method, real-time communication method, apparatus, device and storage medium

Info

Publication number
WO2023142716A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
video frame
current video
encoded
encoding
Prior art date
Application number
PCT/CN2022/137893
Other languages
English (en)
French (fr)
Inventor
张佳
杨小祥
陈思佳
曹健
黄永铖
曹洪彬
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Publication of WO2023142716A1 publication Critical patent/WO2023142716A1/zh
Priority to US18/514,741 priority Critical patent/US20240098310A1/en

Classifications

    • H04N19/87: pre-processing or post-processing for video compression involving scene cut or scene change detection
    • H04N19/114: adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
    • H04L65/60: network streaming of media packets
    • H04L65/80: responding to QoS
    • H04N19/107: selection of coding mode or prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/124: quantisation
    • H04N19/132: sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/142: detection of scene cut or scene change
    • H04N19/157: assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/172: the coding unit being an image region, the region being a picture, frame or field
    • H04N19/463: embedding additional information in the video signal during compression by compressing encoding parameters before transmission
    • H04N19/70: syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The embodiments of the present application relate to the technical field of the Internet, and in particular to an encoding method, a real-time communication method, an apparatus, a device, and a storage medium.
  • A video is a continuous image sequence consisting of consecutive frames, each frame being an image. Because of the human eye's persistence of vision, a frame sequence played back at a sufficient rate is perceived as continuous motion. Since consecutive frames are highly similar, the original video must be encoded and compressed to remove spatial and temporal redundancy so that it can be stored and transmitted efficiently.
  • Current video coding standards generally include I frames and P frames.
  • An I frame is a key frame, obtained by fully encoding the current video frame using intra-frame prediction only; a decoder can decode the content of an I frame independently, without information from other frames. An I frame generally serves as a reference frame for subsequent frames and as an entry point for bitstream switching, and a coded video image sequence generally starts with an I frame.
  • A P frame is a forward predictive coded frame, obtained by encoding the difference data between the current video frame and the previous video frame.
  • The encoding efficiency of an I frame is lower than that of a P frame.
  • Conventionally, the first video frame is encoded as an I frame, subsequent I frames are encoded at a fixed period, and the video frames after each frame encoded as an I frame are encoded as P frames; that is, there are multiple P frames between two I frames.
  • The present application provides an encoding method, a real-time communication method, an apparatus, a device, and a storage medium.
  • An encoding method, executed by a server, including:
  • pre-encoding the current video frame in a video stream according to a pre-encoding scheme to obtain a pre-encoded frame of the current video frame, the pre-encoding scheme being used to pre-encode the first video frame of a video stream as an I frame and to pre-encode the video frames after the first video frame as P frames;
  • when there is no I frame among the M encoded frames before the current video frame, determining, according to the pre-encoded frame and a target pre-encoded frame, whether the current video frame is a scene-switch frame, the scene-switch frame being a video frame whose scene has switched relative to the previous video frame, the target pre-encoded frame being obtained by pre-encoding the previous video frame of the current video frame, the encoded frames being obtained by encoding the video frames before the current video frame, and M being a preset positive integer;
  • if the current video frame is a scene-switch frame, encoding the current video frame as an I frame;
  • if the current video frame is not a scene-switch frame, encoding the current video frame as a P frame.
  • A real-time communication method, executed by a server, including:
  • capturing video images from video generated in real time to obtain a video stream;
  • pre-encoding the current video frame in the video stream according to a pre-encoding scheme to obtain a pre-encoded frame of the current video frame, the pre-encoding scheme being used to pre-encode the first video frame of a video stream as an I frame and to pre-encode the video frames after the first video frame as P frames;
  • when there is no I frame among the M encoded frames before the current video frame, determining, according to the pre-encoded frame and a target pre-encoded frame, whether the current video frame is a scene-switch frame, the scene-switch frame being a video frame whose scene has switched relative to the previous video frame, the target pre-encoded frame being obtained by pre-encoding the previous video frame of the current video frame, and the encoded frames being obtained by encoding the video frames before the current video frame;
  • if the current video frame is a scene-switch frame, encoding the current video frame as an I frame, and if it is not, encoding the current video frame as a P frame, to obtain the encoded frame of the current video frame;
  • An encoding apparatus, including:
  • a first encoding module, configured to pre-encode the current video frame in a video stream according to a pre-encoding scheme to obtain a pre-encoded frame of the current video frame, the pre-encoding scheme being used to pre-encode the first video frame of a video stream as an I frame and to pre-encode the video frames after the first video frame as P frames;
  • a determining module, configured to determine, when there is no I frame among the M encoded frames before the current video frame, whether the current video frame is a scene-switch frame according to the pre-encoded frame and a target pre-encoded frame, the scene-switch frame being a video frame whose scene has switched relative to the previous video frame, the target pre-encoded frame being obtained by pre-encoding the previous video frame of the current video frame, the encoded frames being obtained by encoding the video frames before the current video frame, and M being a preset positive integer;
  • a second encoding module, configured to encode the current video frame as an I frame when the current video frame is a scene-switch frame, and to encode the current video frame as a P frame when it is not.
  • A real-time communication apparatus, including:
  • a collection module, used to capture video images from video generated in real time to obtain a video stream;
  • a first encoding module, configured to pre-encode the current video frame in the video stream according to a pre-encoding scheme to obtain a pre-encoded frame of the current video frame, the pre-encoding scheme being used to pre-encode the first video frame of a video stream as an I frame and to pre-encode the video frames after the first video frame as P frames;
  • a determining module, configured to determine, if there is no I frame among the M encoded frames before the current video frame, whether the current video frame is a scene-switch frame according to the pre-encoded frame and a target pre-encoded frame, the scene-switch frame being a video frame whose scene has switched relative to the previous video frame, the target pre-encoded frame being obtained by pre-encoding the previous video frame of the current video frame, the encoded frames being obtained by encoding the video frames before the current video frame, and M being a preset positive integer;
  • a second encoding module, used to encode the current video frame as an I frame if the current video frame is a scene-switch frame, and to encode it as a P frame if it is not, to obtain the encoded frame of the current video frame;
  • a sending module, configured to send the bitstream to the client, so that the client displays video images according to the bitstream.
  • An electronic device, including one or more processors and a memory, where the memory stores computer-readable instructions and the one or more processors call and run the computer-readable instructions stored in the memory to execute the method of the first aspect, the second aspect, or any of their implementations.
  • A computer-readable storage medium for storing computer-readable instructions that cause a computer to execute the method of the first aspect, the second aspect, or any of their implementations.
  • A computer program product including computer-readable instructions that cause a computer to execute the method of the first aspect, the second aspect, or any of their implementations.
  • FIG. 1 is a schematic diagram of an application scenario of an encoding method provided in an embodiment of the present application.
  • FIG. 2 is a flowchart of an encoding method provided in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an encoding method provided in an embodiment of the present application.
  • FIG. 4 is a flowchart of an encoding method provided in an embodiment of the present application.
  • FIG. 5 is a flowchart of a real-time communication method provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a method for obtaining an optimal decoding configuration provided in an embodiment of the present application.
  • FIG. 7 is a flowchart of a method for obtaining an optimal decoding configuration provided in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an encoding device provided in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • Cloud technology refers to a hosting technology that unifies a series of resources, such as hardware, software, and network, in a wide-area or local-area network to realize data computation, storage, processing, and sharing.
  • Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology, and the like, based on the cloud computing business model. These resources can form a pool that is used on demand, flexibly and conveniently; cloud computing technology will become an important supporting technology.
  • The background services of technical network systems, such as video websites, picture websites, and other portal websites, require large amounts of computing and storage resources. With the rapid development and application of the Internet industry, every item may carry its own identification mark in the future, which must be transmitted to a background system for logical processing; data of different levels will be processed separately, and the powerful back-end support that all kinds of industry data require can only be realized through cloud computing.
  • Cloud gaming, also known as gaming on demand, is an online gaming technology based on cloud computing. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games.
  • the game is not run on the player's game terminal, but in the cloud server, and the cloud server renders the game scene into a video and audio stream, which is transmitted to the player's game terminal through the network.
  • the player's game terminal does not need to have powerful graphics computing and data processing capabilities, but only needs to have basic streaming media playback capabilities and the ability to obtain player input instructions and send them to the cloud server.
  • An I frame (intra picture), also known as a key frame or intra-coded frame, is obtained by fully encoding the current video frame using intra-frame prediction only. A decoder can decode the content of an I frame independently, without information from other frames, and an I frame generally serves as a reference frame for subsequent frames and as an entry point for bitstream switching.
  • A P frame (predictive frame) is a forward predictive coded frame, obtained by encoding the difference data between the current video frame and the previous video frame.
  • With such fixed-period schemes, encoding efficiency is low when a video-frame scene switch occurs.
  • In this application, when encoding the current video frame in a video stream, one pre-encoding pass and one secondary encoding pass are performed. The current video frame is first pre-encoded according to the pre-encoding scheme to obtain its pre-encoded frame. If there is no I frame among the first M encoded frames before the current video frame, and the pre-encoded frame and the target pre-encoded frame indicate that the current video frame is a scene-switch frame (that is, a scene switch has occurred), the current video frame is encoded as an I frame; otherwise it is encoded as a P frame. The target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame. An I frame is thus inserted when a scene switch occurs, and the frames following the inserted I frame are encoded as P frames that occupy fewer bytes, which improves coding efficiency.
  • FIG. 1 is a schematic diagram of an application scenario of an encoding method provided in an embodiment of the present application.
  • The server 120 has graphics processing functions, such as image segmentation and image fusion, and also has video and audio stream transmission functions, such as video encoding.
  • the application scenario shown in FIG. 1 may also include: a base station, core network side equipment, etc.
  • FIG. 1 exemplarily shows one terminal device and one server; in practice other terminal devices and servers may be included, and their number is not limited in this application.
  • the server 120 in FIG. 1 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing cloud computing services. This application does not limit this.
  • a cloud server refers to a server that runs games in the cloud, and has functions such as video enhancement (pre-encoding processing), video encoding, etc., but is not limited thereto.
  • Terminal equipment refers to a type of equipment that has rich human-computer interaction methods, has the ability to access the Internet, is usually equipped with various operating systems, and has strong processing capabilities.
  • the terminal device may be a smart phone, a living room TV, a tablet computer, a vehicle terminal, a player game terminal, such as a handheld game console, but not limited thereto.
  • cloud game manufacturers mainly rely on the hardware encoders of hardware manufacturers to encode and transmit game content.
  • the characteristic of the hardware encoder is that the encoding speed is extremely fast, but the intermediate encoded data is difficult to extract and utilize.
  • It is also difficult for cloud game manufacturers, as users, to modify the hardware encoding process; most of the time they can only use the capabilities provided by hardware manufacturers.
  • the encoding method provided in this application is an adaptive I-frame insertion method, which can realize the insertion of I-frames when video frame scene switching occurs, for example, inserting I-frames at the moment of game scene switching, which can improve the coding efficiency of the encoder.
  • the encoding method provided in this application is applicable to cloud game scenarios and is compatible with various hardware encoders.
  • FIG. 2 is a flow chart of an encoding method provided by an embodiment of the present application.
  • the method may be executed by the server 120 as shown in Fig. 1, but is not limited thereto.
  • the method includes the following steps:
  • the precoding mode is used to precode the first video frame in a video stream as an I frame, and precode the video frames after the first video frame as a P frame.
  • Pre-encoding refers to an additional encoding pass before the encoding processing; the encoding performed after pre-encoding is the secondary encoding. By pre-encoding the current video frame, the data size of the pre-encoded frame can be obtained.
  • Both the pre-encoding pass and the secondary encoding pass (the encoding performed in S103) use an encoding mode in which the size of the encoded frame reflects how well inter-frame prediction matches: when inter-frame correlation is strong, the amount of frame data after encoding the current video frame is relatively small; when it is weak, the amount of frame data is relatively large.
  • Both passes may use the constant quantization parameter (CQP) mode, which has this property.
  • The preset encoding mode in the embodiments of the present application may also be another encoding mode, which is not limited in the present application.
  • Pre-encoding the current video frame according to the pre-encoding scheme to obtain its pre-encoded frame may be implemented as follows:
  • the current video frame is down-sampled according to a preset sampling size to obtain a down-sampled video frame, and the down-sampled video frame is pre-encoded according to the pre-encoding scheme to obtain the pre-encoded frame of the current video frame.
  • The sampling size may be a sampling window of a preset size, a preset sampling area, a preset sampling width, a preset sampling length, and the like. For example, the sampling width may be 1/2 of the width of the original video frame and the sampling length 1/2 of its length, so that the down-sampled video frame has half the width and half the length of the current video frame. The preset sampling width and length may also take other values, which are not limited in this embodiment of the present application.
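  • As a toy illustration of halving both dimensions before pre-encoding, a 2-D sample array can be decimated by keeping every second row and column. Nearest-neighbour decimation is an assumption for illustration only; the patent does not specify the down-sampling filter.

```python
def downsample_half(frame):
    """Down-sample a 2-D array of samples to half width and half length
    by keeping every second row and every second column
    (nearest-neighbour decimation; an illustrative choice)."""
    return [row[::2] for row in frame[::2]]
```

For a 4x4 input this yields a 2x2 output, matching the "half the width, half the length" example above.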
  • A scene-switch frame is a video frame whose scene has switched relative to the previous video frame; the target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame; the encoded frames are obtained by encoding the video frames before the current video frame; and M is a preset positive integer.
  • Specifically, an encoded frame may be obtained by performing the secondary encoding, after pre-encoding, on a video frame before the current video frame.
  • The preset positive integer M is used to avoid inserting too many I frames. The value of M is, for example, 128; in other embodiments M may be another positive integer, such as 64 or 256.
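  • The look-back window reduces to a membership test over the types of recently encoded frames. This is a hypothetical helper; `recent_types` is assumed to be a chronological list of 'I'/'P' labels for already-encoded frames.

```python
def needs_scene_check(recent_types, M=128):
    """Return True when none of the last M encoded frames is an I frame,
    i.e. when the scene-switch test (and possible I-frame insertion) applies."""
    return "I" not in recent_types[-M:]
```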
  • Because the pre-encoding scheme pre-encodes only the first video frame as an I frame and every subsequent video frame as a P frame, the pre-encoded frame of the current video frame is a P frame, and the target pre-encoded frame (the pre-encoded frame of the previous video frame) is also a P frame. Whether the current video frame is a scene-switch frame can therefore be determined from two consecutive P frames.
  • A scene-switch frame is a video frame in which a scene switch has occurred relative to the previous video frame; for two consecutive video frames, a scene switch may include a change in the scene, the image background, or other information of the video frame.
  • The amount of data in a pre-encoded frame may be expressed as a number of bytes or by another content quantification parameter.
  • Determining whether the current video frame is a scene-switch frame may specifically compare the ratio of the two byte counts with a preset threshold, compare their difference with a set difference, or compare the byte count with a baseline byte count, and so on. The preset threshold is, for example, 2, or another value, which is not limited in this embodiment of the present application.
  • In general, the inter-frame correlation of video texture is clearly stronger than the intra-frame correlation, so the data size of an I frame is significantly larger than that of a P frame, and this regularity is very stable. When a scene switch occurs, the inter-frame correlation suddenly weakens and the data volume of the P frame increases: the encoder partially abandons inter-frame prediction and encodes the current frame using intra-frame correlation, producing a large number of intra-coded blocks.
  • With a preset threshold of 2, for example, if the byte count of the second P frame is greater than or equal to twice the byte count of the first P frame, the current video frame corresponding to the second P frame is determined to be a scene-switch frame.
  • Determining scene switches by comparing the byte-count ratio with a preset threshold effectively avoids the influence of intra-coded blocks on the judgment result and improves its accuracy.
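  • The ratio test itself comes down to one comparison. `is_scene_switch` is a hypothetical name for illustration, and the default threshold of 2 is the example value given above.

```python
def is_scene_switch(pre_bytes, prev_pre_bytes, threshold=2.0):
    """Ratio test: flag a scene switch when the current pre-encoded P frame
    is at least `threshold` times larger than the previous one."""
    if prev_pre_bytes <= 0:
        return False  # no valid previous pre-encoded frame to compare with
    return pre_bytes / prev_pre_bytes >= threshold
```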
  • Otherwise, the current video frame is encoded as a P frame, or the current video frame is encoded according to the secondary encoding scheme and the preset encoding mode.
  • The secondary encoding scheme may be: starting from the first video frame, encode I frames at a preset period, and encode the video frames after each I frame as P frames.
  • For example, with a preset period of 200 frames, the 1st video frame is encoded as an I frame, the 201st video frame is encoded as an I frame, the 401st video frame is encoded as an I frame, and so on; the video frames after each frame encoded as an I frame are encoded as P frames, and the encoded frames between two I frames are P frames.
  • The above secondary encoding scheme is one implementable manner; it may also be another implementation, which is not limited in this embodiment of the present application.
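  • The fixed-period example maps directly to an index test. This is a hypothetical helper: with a period of 200, the 1st, 201st, 401st, ... frames, i.e. 0-based indices 0, 200, 400, are I frames.

```python
def periodic_frame_type(index, period=200):
    """Fixed-period secondary encoding: every `period`-th frame (0-based
    indices 0, period, 2*period, ...) is an I frame; all others are P frames."""
    return "I" if index % period == 0 else "P"
```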
  • The preset encoding mode may be the CQP mode. Using CQP for both pre-encoding and secondary encoding makes the amount of frame data after encoding the current video frame relatively small when inter-frame correlation is strong, and relatively large when it is weak.
  • The current video frame is pre-encoded to obtain its pre-encoded frame, and it is then determined whether there is an I frame among the M encoded frames before the current video frame; if there is no I frame, the I-frame insertion step (that is, the scene-switch determination and encoding) is performed.
  • otherwise, the current video frame is encoded according to the secondary encoding method and the preset encoding mode. For example, if the secondary encoding method is the above-mentioned periodic encoding mode (that is, encoding I frames and P frames according to a fixed cycle), the current video frame is encoded as a P frame.
  • by determining whether there is an I frame in the M coded frames before the current video frame, it is determined whether the current video frame should be encoded as an I frame. This ensures that an I frame can be encoded when the scene is switched, so that no I frames are lost during video playback and the video quality is improved, while at the same time avoiding inserting too many I frames during encoding, which would reduce the encoding efficiency.
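The "is there an I frame among the last M encoded frames?" check can be kept cheap with a bounded window. The class below is an illustrative sketch, not the patent's implementation; M = 128 mirrors the example value in the text:

```python
from collections import deque

class RecentFrameTypes:
    """Track the types of the most recent M encoded frames so the encoder
    can ask whether an I frame exists among the last M encoded frames."""

    def __init__(self, m: int = 128):
        # deque with maxlen automatically discards frames older than M
        self.window = deque(maxlen=m)

    def record(self, frame_type: str) -> None:
        self.window.append(frame_type)

    def has_recent_i_frame(self) -> bool:
        return "I" in self.window
```

Once an I frame falls out of the M-frame window, the encoder is again allowed to insert one at the next scene switch.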
  • the method of this embodiment may also include:
  • if the current video frame is a scene switching frame, encode the current video frame as an I frame; if the current video frame is not a scene switching frame, encode the current video frame as a P frame.
  • an I frame is inserted at the scene switch, which can improve the encoding efficiency of the encoder while ensuring the smoothness of the video picture during playback. For example, in cloud gaming, this can save bandwidth for cloud game services at the same image quality, avoid freezing of the game screen, and improve the fluency of the game picture.
  • in this application, the pre-encoded frame of the current video frame is obtained by first pre-encoding the current video frame in the video stream according to the pre-encoding method. If there is no I frame in the M encoded frames before the current video frame, and it is determined according to the pre-encoded frame and the target pre-encoded frame that the current video frame is a scene switching frame (that is, a video frame scene switch has occurred), the current video frame is encoded as an I frame; otherwise the current video frame is encoded as a P frame. The target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame.
  • in this way, an I frame is inserted when a video frame scene switch occurs; after the I frame is inserted, the subsequent frames encoded as P frames occupy fewer bytes, which improves the coding efficiency. Since the coding efficiency of I frames is lower than that of P frames, in this application an I frame is inserted only when it is determined that there is no I frame in the M encoded frames before the current video frame and a video frame scene switch occurs, which avoids inserting too many I frames and improves the coding efficiency.
  • Fig. 3 is a schematic flowchart of an encoding method provided by the embodiment of the present application. As shown in Fig. 3, the method of this embodiment may include:
  • S201: precode the current video frame in the video stream according to a precoding manner to obtain a precoded frame of the current video frame.
  • the precoding mode is used to precode the first video frame in a video stream as an I frame, and precode the video frames after the first video frame as P frames.
  • Fig. 4 is a flow chart of an encoding method provided by the embodiment of the present application.
  • the current video frame is any video frame in the sequence of video frames.
  • S202 may include:
  • the sampling size may be a sampling window of a preset size, a preset sampling area, a preset sampling width, a preset sampling length, and the like.
  • the preset sampling width can be, for example, 1/2 of the width of the original video frame, and the preset sampling length can be, for example, 1/2 of the length of the original video frame; that is, the width of the downsampled video frame is half of the width of the current video frame, and the length of the downsampled video frame is half of the length of the current video frame.
  • the preset sampling width and the preset sampling length may also be other values, which are not limited in this embodiment of the present application.
  • by precoding the downsampled video frame instead of the full-resolution frame, the pre-coding time can be saved and the overall coding efficiency can be improved.
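A naive sketch of the half-width, half-length downsampling described above. A real encoder would use a proper scaler (for example, an averaging or polyphase filter); this sketch only illustrates the resulting dimensions, and the nested-list frame representation is an assumption:

```python
def downsample_half(frame):
    """2x downsample of a frame given as a list of rows of pixel values:
    keep every second row and every second column, so the result has
    half the original length (row count) and half the original width."""
    return [row[::2] for row in frame[::2]]
```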
  • the number of coded frames before the current video frame is counted, and when the number of coded frames before the current video frame reaches M, it is judged whether there is an I frame in the M coded frames before the current video frame.
  • the value of M is equal to 128. In other embodiments, the value of M may also be other positive integers, such as 64, 256 and so on.
  • the encoding mode of the pre-encoding and the encoding mode of the secondary encoding can be, for example, the CQP mode.
  • using the CQP mode for encoding makes the amount of frame data after encoding the current video frame relatively small when the inter-frame correlation is large; conversely, when the inter-frame correlation is small, the amount of frame data after encoding the current video frame is relatively large.
  • the target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame. If the ratio condition is satisfied, it is determined that the current video frame is a scene switching frame, and S204 is performed; if not, it is determined that the current video frame is not a scene switching frame, and S205 is performed.
  • the preset threshold is 2, for example.
  • the pre-encoded frame is used to determine whether a scene switch occurs. If there is no I frame in the M encoded frames before the current video frame and a scene switch occurs, the current video frame is encoded as an I frame; otherwise the current video frame is encoded as a P frame. In this way, an I frame is inserted when a video frame scene switch occurs; after the I frame is inserted, the subsequent frames encoded as P frames occupy fewer bytes, thereby improving the coding efficiency.
  • Fig. 5 is a flow chart of a real-time communication method provided by the embodiment of the present application. As shown in Fig. 5, the method of this embodiment may include:
  • the server collects video images from videos generated in real time to obtain video streams.
  • each video frame includes an image composed of virtual game screens.
  • the server precodes the current video frame in the video stream according to the precoding method to obtain the precoded frame of the current video frame.
  • the precoding method is used to precode the first video frame in a video stream into an I frame, and precode the video frames after the first video frame as P frames.
  • the server determines whether the current video frame is a scene switching frame according to the precoded frame and the target precoded frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame.
  • the target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame
  • one encoded frame is obtained by encoding a video frame before the current video frame
  • M is a preset positive integer.
  • if the current video frame is a scene switching frame, the server encodes the current video frame into an I frame; if the current video frame is not a scene switching frame, the server encodes the current video frame into a P frame to obtain an encoded frame of the current video frame.
  • the server obtains a code stream according to the coded frame of the current video frame and multiple coded frames before the current video frame.
  • the server sends the code stream to the client.
  • the client terminal displays a video image according to the code stream.
  • the server precodes the current video frame in the video stream to obtain the precoded frame of the current video frame. If there is no I frame in the M encoded frames before the current video frame, and it is determined according to the precoded frame and the target precoded frame that the current video frame is a scene switching frame (that is, a video frame scene switch has occurred), the current video frame is encoded as an I frame; otherwise the current video frame is encoded as a P frame. The target precoded frame is obtained by precoding the previous video frame of the current video frame.
  • in this way, an I frame is inserted when a video frame scene switch occurs; after the I frame is inserted, the subsequent frames encoded as P frames occupy fewer bytes, which improves the coding efficiency. Since the coding efficiency of I frames is lower than that of P frames, in this application an I frame is inserted only when it is determined that there is no I frame in the M encoded frames before the current video frame and a video frame scene switch occurs, which avoids inserting too many I frames and improves the coding efficiency.
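Putting the per-frame decision (S302–S304) together, the server-side loop might look like the following sketch. The simplifications here are assumptions for illustration: precoded sizes are given as an in-memory list, and a frame that does not trigger I-frame insertion is simply encoded as a P frame.

```python
def choose_frame_types(precoded_sizes, m=128, threshold=2.0):
    """Decide I/P for each frame of a stream, given the byte count of
    each frame's precoded frame.  The first frame is always an I frame.
    A later frame becomes an I frame only if (a) no I frame exists among
    the previous m encoded frames AND (b) its precoded size is at least
    `threshold` times the previous frame's precoded size."""
    types = []
    for i, size in enumerate(precoded_sizes):
        if i == 0:
            types.append("I")
            continue
        recent = types[-m:]  # types of the last m encoded frames
        prev = precoded_sizes[i - 1]
        scene_switch = prev > 0 and size / prev >= threshold
        types.append("I" if ("I" not in recent and scene_switch) else "P")
    return types
```

With a small window (m = 2) and sizes `[5000, 1000, 1000, 2500, 1000]`, the 2.5x jump at frame 3 triggers an I frame because the window no longer contains one.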
  • the above-mentioned image coding method has practical significance only when the decoding end, that is, the above-mentioned terminal device, has the ability to decode the above-mentioned coded stream.
  • the following provides a method for obtaining an optimal decoding configuration.
  • FIG. 6 is a flow chart of a method for obtaining an optimal decoding configuration provided in an embodiment of the present application. As shown in FIG. 6, the method includes:
  • the cloud server sends a decoding capability inquiry request to the terminal device.
  • the cloud server receives decoding capability response data of the terminal device, where the decoding capability response data includes: the decoding capability of the terminal device.
  • the cloud server determines the optimal decoding configuration according to the decoding capability of the terminal device, the cloud game type, and the current network status.
  • the cloud server sends the optimal decoding configuration to the terminal device.
  • the terminal device decodes the code stream of the video stream by using the optimal decoding configuration.
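One possible shape of the selection step (S403), under the explicit assumption that the patent leaves the selection rule open: preferring the highest reported capability when the network is good, and stepping down one level otherwise, is purely an illustrative rule, and the function and parameter names are invented for the sketch.

```python
def choose_decoding_config(capabilities, game_type, network_good):
    """Pick a decoding configuration from the terminal's reported
    capabilities (a list of (profile, level) pairs ordered from lowest
    to highest).  `game_type` is accepted but unused here; a fuller
    implementation would index a mapping table by game type as well."""
    if not capabilities:
        return None
    index = len(capabilities) - 1 if network_good else max(len(capabilities) - 2, 0)
    return capabilities[index]
```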
  • FIG. 7 is a flowchart of a method for obtaining an optimal decoding configuration provided in an embodiment of the present application.
  • the cloud server may send a decoding capability request to the terminal device through a client installed on the terminal device, and the terminal device can return a decoding capability response to the cloud server through the client.
  • the client may be a cloud game client.
  • the decoding capability query request is used to request to obtain feedback data representing the decoding capability of the terminal device.
  • the decoding capability request includes at least one of the following, but not limited thereto: a protocol version number, and a specific decoding protocol query.
  • the protocol version number refers to the minimum protocol version supported by the cloud server, and the protocol may be a decoding protocol.
  • the specific decoding protocol query refers to the decoding protocol to be queried by the cloud server, for example, the video decoding protocol H264 or H265.
  • code implementation of the decoding capability request can be as follows:
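The request payload itself is not reproduced in this extract. A hypothetical shape consistent with the fields listed above (the minimum protocol version supported by the cloud server, plus the decoding protocols to query) might be the following; all field names are illustrative, not from the patent:

```python
import json

# Hypothetical decoding capability request body.
decoding_capability_request = {
    "min_protocol_version": "1.0",       # minimum protocol version the cloud server supports
    "query_protocols": ["H264", "H265"]  # decoding protocols the server wants to query
}

payload = json.dumps(decoding_capability_request)
```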
  • the Profile and Level supported by the terminal device are listed as two-tuples; for example, device A supports the H264 capabilities (Baseline, Level51), (Main, Level51), (High, Level51).
  • the decoding capability response may include, in addition to the decoding capability of the terminal device, an indication of whether the query of the decoding protocol to be queried by the cloud server is successful, and a protocol version number supported by the terminal device.
  • if the query of the decoding protocol to be queried by the cloud server succeeds, this can be represented by 0; if the query fails, it can be indicated by an error code, such as 001.
  • the protocol version number refers to the minimum protocol version supported by the terminal device, and the protocol may be a decoding protocol.
  • the decoding capability of the terminal device includes at least one of the following, but is not limited thereto: the type of decoding protocol supported by the terminal device, the Profile, Level, and performance supported by the decoding protocol, and the like.
  • Example 1: the code implementation of the decoding capability response can be as follows:
  • Example 2: if the terminal device supports only some of the decoding protocols, it returns the supported decoding protocol information.
  • the code implementation of the decoding capability response can be as follows:
  • Example 3: the codecs field is 0, that is, the terminal device supports 0 hardware codecs.
  • Example 4: if the request for the decoding capability of the terminal device fails, a specific error code is returned.
  • the code implementation of the decoding capability response can be as follows:
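The response payloads are likewise not reproduced in this extract. Hypothetical shapes consistent with Examples 1–4 above (success with Profile/Level tuples, and failure with an error code and zero codecs) might be the following; all field names and the error code value are illustrative assumptions:

```python
import json

# Hypothetical successful decoding capability response (cf. Example 1).
response_success = {
    "status": 0,  # 0 = query succeeded
    "min_protocol_version": "1.0",
    "codecs": [
        {
            "protocol": "H264",
            "profiles": [["Baseline", "Level51"], ["Main", "Level51"], ["High", "Level51"]],
        }
    ],
}

# Hypothetical failure response (cf. Examples 3 and 4).
response_failure = {
    "status": "001",  # error code: capability query failed
    "codecs": [],     # 0 hardware codecs reported
}

wire_success = json.dumps(response_success)
```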
  • the cloud server selects a higher capability within the decoding capability range of the terminal device; for example, in the above Example 1, profile3 and performances3 are selected. The cloud server can select the optimal decoding configuration according to the mapping relationship between the cloud game type and the decoding capability of the terminal device, or according to other selection rules.
  • alternatively, the cloud server can select a higher capability within the decoding capability range of the terminal device (for example, profile3 and performances3 in the above Example 1), and can select the optimal decoding configuration according to the network status between the cloud server and the terminal device and the decoding capability of the terminal device, or according to other selection rules.
  • the cloud server can select the optimal decoding configuration according to the mapping relationship between the cloud game type, the network status and the decoding capability of the terminal device, or select the optimal decoding configuration according to other selection rules.
  • the present application does not limit how to determine the optimal decoding configuration.
  • the terminal device can decode the code stream of the video stream through the optimal decoding configuration, so that the decoding effect can be improved.
  • Fig. 8 is a schematic structural diagram of an encoding device provided by an embodiment of the present application. As shown in Fig. 8, the encoding device may include: a first encoding module 11, a determination module 12 and a second encoding module 13,
  • the first encoding module 11 is configured to pre-encode the current video frame in the video stream according to the pre-encoding mode to obtain the pre-encoded frame of the current video frame, where the pre-encoding mode is used to pre-encode the first video frame in a video stream as an I frame and pre-encode the video frames after the first video frame as P frames;
  • the determination module 12 is configured to determine, when there is no I frame in the M coded frames before the current video frame, whether the current video frame is a scene switching frame according to the precoded frame and the target precoded frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame;
  • the target pre-encoded frame is obtained by pre-encoding the previous video frame of the current video frame
  • an encoded frame is obtained by encoding a video frame before the current video frame
  • M is a preset positive integer
  • the second encoding module 13 is used for encoding the current video frame as an I frame when the current video frame is a scene switching frame, and encoding the current video frame as a P frame when the current video frame is not a scene switching frame.
  • the first coding module 11 is also configured to: if there is an I frame in the M coded frames before the current video frame, encode the current video frame as a P frame, or encode the current video frame according to the secondary coding method and the preset coding mode.
  • the first encoding module 11 is configured to: down-sample the current video frame according to a preset sampling width and a preset sampling length to obtain a down-sampled video frame; and pre-encode the down-sampled video frame to obtain the pre-encoded frame of the current video frame.
  • the determination module 12 is also configured to: count the number of coded frames before the current video frame; and when the number of coded frames before the current video frame reaches M, determine whether there is an I frame in the M coded frames before the current video frame.
  • the value of M is, for example, 128.
  • the determining module 12 is configured to: determine whether the current video frame is a scene switching frame according to the size of the pre-encoded frame and the size of the target pre-encoded frame.
  • the determining module 12 is configured to: determine whether the current video frame is a scene switching frame according to the byte count of the pre-encoded frame and the byte count of the target pre-encoded frame.
  • the determination module 12 is specifically configured to: if the ratio of the number of bytes of the pre-encoded frame to the number of bytes of the target pre-encoded frame is greater than or equal to a preset threshold, determine that the current video frame is a scene switching frame; if the ratio of the number of bytes of the pre-encoded frame to the number of bytes of the target pre-encoded frame is less than the preset threshold, determine that the current video frame is not a scene switching frame.
  • the preset coding mode is a fixed quantization parameter CQP mode
  • the precoding coding mode is a CQP mode
  • the secondary encoding method includes: starting from the first video frame, encoding I frames according to a preset period, and encoding video frames subsequent to the video frames encoded as I frames into P frames.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • the encoding device shown in FIG. 8 can execute the method embodiment corresponding to FIG. 2 , and the aforementioned and other operations and/or functions of each module in the encoding device are to realize the corresponding process in the method embodiment corresponding to FIG. 2 , for the sake of brevity, it is not repeated here.
  • FIG. 9 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application.
  • the real-time communication device may include: an acquisition module 21, a first encoding module 22, a determination module 23, and a second encoding module 24 and sending module 25.
  • the acquisition module 21 is configured to perform video image acquisition on the video generated in real time to obtain the video stream;
  • the first encoding module 22 is configured to pre-encode the current video frame in the video stream according to the pre-encoding method to obtain the pre-encoded frame of the current video frame, where the pre-encoding method is used to pre-encode the first video frame in a video stream as an I frame and pre-encode the video frames after the first video frame as P frames;
  • the determination module 23 is configured to: if there is no I frame in the M coded frames before the current video frame, determine whether the current video frame is a scene switching frame according to the precoded frame and the target precoded frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer;
  • the second encoding module 24 is configured to encode the current video frame as an I frame if the current video frame is a scene switching frame, and encode the current video frame as a P frame if the current video frame is not a scene switching frame, to obtain an encoded frame of the current video frame;
  • the sending module 25 is used to send the code stream to the client, so that the client can display video images according to the code stream.
  • each of the plurality of video frames includes an image composed of virtual game screens.
  • the second coding module 24 is also used for: if there is an I frame in the M coded frames before the current video frame, then the current video frame is coded into a P frame, or, according to the secondary coding method and the preset coding mode Encodes the current video frame.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • the encoding device shown in FIG. 9 can execute the method embodiment corresponding to FIG. 5 , and the aforementioned and other operations and/or functions of each module in the encoding device are to realize the corresponding process in the method embodiment corresponding to FIG. 5 , for the sake of brevity, it is not repeated here.
  • the encoding device has been described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software modules.
  • each step of the method embodiments in the embodiments of the present application can be completed by an integrated logic circuit of the hardware in the processor and/or instructions in the form of software, and the steps of the methods disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware encoding processor, or executed by a combination of hardware and software modules in the encoding processor.
  • the software module may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Fig. 10 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may be the server in the foregoing method embodiments.
  • the electronic equipment may include:
  • a memory 210 and a processor 220, where the memory 210 is used for storing computer-readable instructions and for transmitting the computer-readable instruction codes to the processor 220.
  • the processor 220 can invoke and execute computer-readable instructions from the memory 210, so as to implement the method in the embodiment of the present application.
  • the processor 220 may be configured to execute the above-mentioned method embodiments according to instructions in the computer-readable instructions.
  • the processor 220 may include but not limited to:
  • Digital Signal Processor (DSP)
  • Application Specific Integrated Circuit (ASIC)
  • Field Programmable Gate Array (FPGA)
  • the memory 210 includes but is not limited to:
  • non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • Static Random Access Memory (Static RAM, SRAM)
  • Dynamic Random Access Memory (Dynamic RAM, DRAM)
  • Synchronous Dynamic Random Access Memory (Synchronous DRAM, SDRAM)
  • Double Data Rate Synchronous Dynamic Random Access Memory (Double Data Rate SDRAM, DDR SDRAM)
  • Enhanced Synchronous Dynamic Random Access Memory (Enhanced SDRAM, ESDRAM)
  • Synchlink Dynamic Random Access Memory (Synchlink DRAM, SLDRAM)
  • Direct Rambus Random Access Memory (Direct Rambus RAM, DR RAM)
  • the computer-readable instructions can be divided into one or more modules, and the one or more modules are stored in the memory 210 and executed by the processor 220 to complete the present application provided method.
  • the one or more modules may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions in the electronic device.
  • the electronic device may further include: a transceiver 230 that may be connected to the processor 220 or the memory 210 .
  • the processor 220 can control the transceiver 230 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 230 may include a transmitter and a receiver.
  • the transceiver 230 may further include antennas, and the number of antennas may be one or more.
  • bus system includes not only a data bus, but also a power bus, a control bus and a status signal bus.
  • the present application also provides a computer storage medium, on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a computer, the computer can execute the methods of the above-mentioned method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the computer program product includes one or more computer readable instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division; in actual implementation, there may be other division methods. For example, multiple modules or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.


Abstract

This application provides an encoding method, a real-time communication method, an apparatus, a device, and a storage medium, including: precoding a current video frame according to a precoding manner to obtain a precoded frame of the current video frame (S101), where the precoding manner is used to precode the first video frame as an I frame and precode the video frames after the first video frame as P frames; if there is no I frame in M encoded frames before the current video frame, determining, according to the precoded frame and a target precoded frame, whether the current video frame is a scene switching frame (S102), where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer; and if the current video frame is a scene switching frame, encoding the current video frame as an I frame, and if the current video frame is not a scene switching frame, encoding the current video frame as a P frame (S103).

Description

Encoding method, real-time communication method, apparatus, device, and storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on January 27, 2022, with application number 2022101016976 and entitled "Encoding method, real-time communication method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this application relate to the field of Internet technologies, and in particular to an encoding method, a real-time communication method, an apparatus, a device, and a storage medium.
Background
A video is a continuous sequence of images composed of consecutive frames, where one frame is one image. Due to the persistence-of-vision effect of the human eye, when a sequence of frames is played at a certain rate, we see video with continuous motion. Because consecutive frames are highly similar, the original video needs to be encoded and compressed for storage and transmission, so as to remove redundancy in the spatial and temporal dimensions. Current video coding standards generally include I frames and P frames. An I frame is a key frame obtained by completely encoding the current video frame; it is encoded using only intra-frame prediction, so a decoder can independently decode the content of an I frame without information from other frames. An I frame generally serves as a reference frame for subsequent frames and as an entry point for bitstream switching. A group of encoded video pictures generally starts with an I frame. A P frame is a forward-predicted encoded frame, obtained by encoding the difference data between the current video frame and the previous video frame.
Since an I frame completely encodes a video frame, the coding efficiency of an I frame is lower than that of a P frame. In existing video encoding methods, the first video frame is encoded as an I frame, subsequent I frames are encoded at a fixed period, and the video frames after each frame encoded as an I frame are encoded as P frames; that is, there are multiple P frames between two I frames.
However, when a video frame scene switch occurs, if video encoding is still performed according to the above method, the coding efficiency is low.
Summary
This application provides an encoding method, a real-time communication method, an apparatus, a device, and a storage medium.
In a first aspect, an encoding method is provided, performed by a server, including:
precoding a current video frame in a video stream according to a precoding manner to obtain a precoded frame of the current video frame, where the precoding manner is used to precode the first video frame in a video stream as an I frame and precode the video frames after the first video frame as P frames;
if there is no I frame in M encoded frames before the current video frame, determining, according to the precoded frame and a target precoded frame, whether the current video frame is a scene switching frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer; and
if the current video frame is a scene switching frame, encoding the current video frame as an I frame; if the current video frame is not the scene switching frame, encoding the current video frame as a P frame.
In a second aspect, a real-time communication method is provided, performed by a server, including:
performing video image acquisition on a video generated in real time to obtain a video stream;
precoding a current video frame in the video stream according to a precoding manner to obtain a precoded frame of the current video frame, where the precoding manner is used to precode the first video frame in the video stream as an I frame and precode the video frames after the first video frame as P frames;
if there is no I frame in M encoded frames before the current video frame, determining, according to the precoded frame and a target precoded frame, whether the current video frame is a scene switching frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer;
if the current video frame is a scene switching frame, encoding the current video frame as an I frame; if the current video frame is not the scene switching frame, encoding the current video frame as a P frame, to obtain an encoded frame of the current video frame;
obtaining a code stream according to the encoded frame of the current video frame and multiple encoded frames before the current video frame; and
sending the code stream to a client, so that the client displays a video picture according to the code stream.
In a third aspect, an encoding apparatus is provided, including:
a first encoding module, configured to precode a current video frame in a video stream according to a precoding manner to obtain a precoded frame of the current video frame, where the precoding manner is used to precode the first video frame in a video stream as an I frame and precode the video frames after the first video frame as P frames;
a determination module, configured to determine, when there is no I frame in M encoded frames before the current video frame, whether the current video frame is a scene switching frame according to the precoded frame and a target precoded frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer; and
a second encoding module, configured to encode the current video frame as an I frame when the current video frame is a scene switching frame, and encode the current video frame as a P frame when the current video frame is not the scene switching frame.
In a fourth aspect, a real-time communication apparatus is provided, including:
an acquisition module, configured to perform video image acquisition on a video generated in real time to obtain a video stream;
a first encoding module, configured to precode a current video frame in the video stream according to a precoding manner to obtain a precoded frame of the current video frame, where the precoding manner is used to precode the first video frame in the video stream as an I frame and precode the video frames after the first video frame as P frames;
a determination module, configured to determine, if there is no I frame in M encoded frames before the current video frame, whether the current video frame is a scene switching frame according to the precoded frame and a target precoded frame, where the scene switching frame is a video frame in which a scene switch has occurred relative to the previous video frame, the target precoded frame is obtained by precoding the previous video frame of the current video frame, an encoded frame is obtained by encoding a video frame before the current video frame, and M is a preset positive integer;
a second encoding module, configured to encode the current video frame as an I frame if the current video frame is a scene switching frame, and encode the current video frame as a P frame if the current video frame is not the scene switching frame, to obtain an encoded frame of the current video frame;
where a code stream is obtained according to the encoded frame of the current video frame and multiple encoded frames before the current video frame; and
a sending module, configured to send the code stream to a client, so that the client displays a video picture according to the code stream.
In a fifth aspect, an electronic device is provided, including one or more processors and a memory, where the memory is configured to store computer-readable instructions, and the one or more processors are configured to invoke and run the computer-readable instructions stored in the memory to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
In a sixth aspect, a computer-readable storage medium is provided for storing computer-readable instructions, where the computer-readable instructions cause a computer to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
In a seventh aspect, a computer program product is provided, including computer-readable instructions, where the computer-readable instructions cause a computer to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本申请实施例提供的一种编码方法的应用场景示意图;
图2为本申请实施例提供的一种编码方法的流程图;
图3为本申请实施例提供的一种编码方法的流程示意图;
图4为本申请实施例提供的一种编码方法的流程框图;
图5为本申请实施例提供的一种实时通信方法的流程图;
图6为本申请实施例提供的一种最优解码配置的获取方法的流程图;
图7为本申请实施例提供的一种最优解码配置的获取方法的流程图;
图8为本申请实施例提供的一种编码装置的结构示意图;
图9为本申请实施例提供的一种实时通信装置的结构示意图;
图10是本申请实施例提供的电子设备的示意性框图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本发明保护的范围。
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或服务器不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
在介绍本申请技术方案之前,下面先对本申请相关知识进行介绍:
1、云技术(Cloud technology),是指在广域网或局域网内将硬件、软件、网络等系列资源统一起来,实现数据的计算、储存、处理和共享的一种托管技术。云技术是基于云计算商业模式应用的网络技术、信息技术、整合技术、管理平台技术、应用技术等的总称,可以组成资源池,按需所用,灵活便利。云计算技术将变成重要支撑。技术网络系统的后台服务需要大量的计算、存储资源,如视频网站、图片类网站和更多的门户网站。伴随着互联网行业的高度发展和应用,将来每个物品都有可能存在自己的识别标志,都需要传输到后台系统进行逻辑处理,不同程度级别的数据将会分开处理,各类行业数据皆需要强大的系统后盾支撑,只能通过云计算来实现。
2、云游戏(Cloud gaming),又可称为游戏点播(gaming on demand),是一种以云计算技术为基础的在线游戏技术。云游戏技术使图形处理与数据运算能力相对有限的轻端设备(thin client)能运行高品质游戏。在云游戏场景下,游戏并不在玩家游戏终端,而是在云服务器中运行,并由云服务器将游戏场景渲染为视频音频流,通过网络传输给玩家游戏终端。玩家游戏终端无需拥有强大的图形运算与数据处理能力,仅需拥有基本的流媒体播放能力与获取玩家输入指令并发送给云服务器的能力即可。
3、I帧(Intra picture),也称为关键帧或帧内编码帧,I帧为将当前的视频帧进行完整编码后得到,I帧仅使用帧内预测进行编码,解码器不需要其他帧的信息即可独立解码出I帧的内容,I帧一般作为后续帧的参考帧,也作为码流切换的切入点。
4、P帧(predictive-frame),为前向预测编码帧,P帧为根据当前视频帧与前一视频帧之间的差异数据进行编码后得到。
如上,现有的视频编码方法中,当发生视频帧场景切换时,编码效率较低。为了解决这一技术问题,本申请中,在对视频流中的当前视频帧进行编码时,进行一次预编码和一次二次编码,先按照预编码方式对当前视频帧进行预编码,得到当前视频帧的预编码帧,若当前视频帧的前M个编码帧中没有I帧,且根据预编码帧和目标预编码帧确定当前视频帧是场景切换帧(即发生了视频帧场景切换),则将当前视频帧编码为I帧,否则将当前视频帧编码为P帧,其中的目标预编码帧为对当前视频帧的前一个视频帧预编码得到。从而,实现了在发生视频帧场景切换时插入I帧,插入I帧后再编码后续的P帧时,P帧占用的字节数减少,从而提高了编码效率。由于I帧的编码效率低于P帧的编码效率,本申请中是在确定当前视频帧的前M个编码帧中没有I帧并且发生视频帧场景切换时才插入I帧,可避免I帧插入过多,提高编码效率。
应理解的是,本申请技术方案可以应用于如下场景,但不限于:
示例性的,图1为本申请实施例提供的一种编码方法的应用场景示意图,如图1所示,终端设备110可以与服务器120进行通信,其中,终端设备110具有流媒体播放功能,服务器120具有图形处理功能,例如:图像分割、图像融合功能,服务器120还具有视频音频流的数据传输功能,例如:视频编码功能。
在一些可实现方式中,图1所示的应用场景中还可以包括:基站、核心网侧设备等,此外,图1示例性地示出了一个终端设备、一台服务器,实际上可以包括其他数量的终端设备和服务器,本申请对此不做限制。
在一些可实现方式中,图1中的服务器120可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云计算服务的云服务器。本申请对此不做限制。
在云游戏场景中,云服务器是指在云端运行游戏的服务器,并具备视频增强(编码前处理)、视频编码等功能,但不限于此。
终端设备是指一类具备丰富人机交互方式、拥有接入互联网能力、通常搭载各种操作系统、具有较强处理能力的设备。终端设备可以是智能手机、客厅电视、平板电脑、车载终端、玩家游戏终端,如掌上游戏主机等,但不限于此。
以云游戏场景为例,云游戏厂商主要依赖硬件厂商的硬件编码器对游戏内容进行编码传输。硬件编码器的特点是编码速度极快,但是中间编码数据难以提取和利用。同时,作为用户的云游戏厂商也难以对硬件编码过程进行修改,大部分时间只能使用硬件厂商提供的能力。本申请提供的编码方法,为自适应I帧插入方法,可实现在发生视频帧场景切换时插入I帧,例如在游戏场景切换的时刻插入I帧,可提高编码器的编码效率。本申请提供的编码方法,可适用于云游戏场景且兼容各式硬件编码器。
下面将对本申请技术方案进行详细阐述:
图2为本申请实施例提供的一种编码方法的流程图,该方法例如可以由如图1所示的服务器120执行,但不限于此,如图2所示,该方法包括如下步骤:
S101、按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧。
其中,预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,第一个视频帧之后的视频帧预编码为P帧。预编码是指在编码处理之前额外增加的一次编码处理,在预编码之后进行的编码即为二次编码。其中,对当前视频帧进行预编码,可以得到预编码帧的数据量大小。
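下面用一段示意性的Python代码说明预编码方式下的帧类型分配(仅为便于理解的草图,函数名为本文说明而假设,并非本申请实施方式或任何编码器的实际接口):

```python
def precode_frame_type(index: int) -> str:
    """预编码方式的示意:视频流的第一个视频帧(index 为 0)预编码为I帧,
    其后的视频帧均预编码为P帧。"""
    return "I" if index == 0 else "P"
```

例如,对一个视频流的前若干帧依次调用该函数,得到的类型序列为 I、P、P、P、…。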
其中,预编码的编码模式以及二次编码(是指S103执行的编码)的编码模式,采用可以使得编码后的编码帧数据大小能够较好地反映出帧间预测的匹配度的编码模式。若帧间关联度大,则当前视频帧编码后的帧数据量比较小,反之,当前视频帧编码后帧数据量比较大。
可选的,预编码的编码模式以及二次编码(是指S103执行的编码)的编码模式可以是固定量化参数(CQP)模式,使用CQP模式进行预编码和二次编码,可以使得在帧间关联度大的情况下,当前视频帧编码后的帧数据量比较小,反之,在帧间关联度小的情况下,当前视频帧编码后帧数据量比较大。需要说明的是,本申请实施例中预设编码模式还可以使用其他编码模式,本申请对此不做限制。
由于预编码是为了得到预编码帧的数据量大小,为了节省预编码的时间,可选的,S101中按照预编码方式对当前视频帧进行预编码,得到当前视频帧的预编码帧,可以为:
根据预设的采样尺寸对当前视频帧进行下采样,得到下采样的视频帧,按照预编码方式对下采样的视频帧进行预编码,得到当前视频帧的预编码帧。
其中,采样尺寸可以是预设尺寸的采样窗口、预设的采样面积、预设的采样宽度和预设的采样长度等。采样宽度例如可以为原始视频帧的宽度的二分之一,采样长度例如可以为原始视频帧的长度的二分之一,即下采样的视频帧的宽度为当前视频帧的宽度的二分之一,长度为当前视频帧的长度的二分之一。预设的采样宽度和预设的采样长度还可以是其他数值,本申请实施例对此不做限制。通过对当前视频帧进行下采样,得到下采样的视频帧,再对下采样的视频帧进行预编码,可以节省预编码的时间,进而提高总体的编码效率。
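上述“宽高各取二分之一”的下采样,可以用如下Python草图示意(假设帧以二维列表表示,采用隔行、隔列取样;实际实现通常由编码器或图像库完成,此处仅为说明):

```python
def downsample_half(frame):
    """按预设采样尺寸对帧做下采样的示意实现:
    宽度和长度各取原帧的二分之一,即隔行、隔列取样。"""
    return [row[::2] for row in frame[::2]]
```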
S102、若当前视频帧之前的M个编码帧中没有I帧,根据预编码帧和目标预编码帧,确定当前视频帧是否是场景切换帧。
其中,场景切换帧为相对前一个视频帧发生了场景切换的视频帧,目标预编码帧为对当前视频帧的前一个视频帧预编码得到,编码帧为对当前视频帧之前的视频帧编码得到,M为预设正整数。
其中,编码帧为对当前视频帧之前的视频帧编码得到,具体可以为对当前视频帧之前的视频帧在预编码之后进行二次编码得到。
其中,M为预设正整数,为了避免I帧插入过多,本申请实施例中通过预设M的方式来避免I帧插入过多,M的取值例如为128,在其他实施例中,M的取值也可以为其他正整数,例如64、256等。
本申请实施例中,由于预编码方式用于将一个视频流中的第一个视频帧预编码为I帧,将第一个视频帧之后的视频帧预编码为P帧,因此当前视频帧的预编码帧为P帧,目标预编码帧为当前视频帧的前一个视频帧的预编码帧,目标预编码帧也为P帧,可以根据连续的两个P帧确定当前视频帧是否是场景切换帧。场景切换帧为相对前一个视频帧发生了场景切换的视频帧,其中,场景切换可以包括:对于连续的两个视频帧而言,视频帧的场景、图像背景或其它信息等发生了变化。
在一种可实施的方式中,可以根据预编码帧所包含的数据量大小和目标预编码帧所包含的数据量大小,确定当前视频帧是否是场景切换帧。
可选的,预编码帧所包含的数据量大小可以用字节数或其他内容量化参数来表征。示例性的,可以根据预编码帧的字节数和目标预编码帧的字节数,确定当前视频帧是否是场景切换帧。
可选的,基于字节数来确定当前视频帧是否是场景切换帧,具体可以是将字节数的比值与预设阈值进行比较,也可以是将字节数的差值与设定差值进行比较,再或者将字节数分别与一个基准字节数进行比较等。
示例性的,若预编码帧的字节数和目标预编码帧的字节数的比值大于或等于预设阈值,则确定当前视频帧是场景切换帧;若预编码帧的字节数和目标预编码帧的字节数的比值小于预设阈值,则确定当前视频帧不是场景切换帧。可选的,预设阈值例如等于2,还可以是其他数值,本申请实施例对此不做限制。
一般情况下,由于视频纹理的帧间关联性明显强于帧内关联性,I帧的数据量大小会明显大于P帧的数据量大小。在CQP的编码模式下,这个规律十分稳定。在当前视频帧发生场景切换的情况下,帧间关联突然减弱,导致P帧数据量增大。场景差距足够大的情况下编码器会部分弃用帧间关联性而改用帧内关联性对当前帧进行编码,导致大量帧内编码块的产生。对于连续的两个P帧,若第2个P帧的字节数与第1个P帧的字节数的比值大于或等于预设阈值,预设阈值例如为2,即第2个P帧的字节数大于或等于第1个P帧的字节数的2倍,则确定第2个P帧对应的当前视频帧为场景切换帧。
在本实施例中,通过字节数的比值与预设阈值之间的比较来确定当前视频帧是否是场景切换帧,能够有效避免产生帧内编码块的情况对判断结果的影响,提高判断结果的准确性。
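基于字节数比值判断场景切换帧的逻辑可用如下Python草图示意(阈值2仅沿用正文举例,函数与参数名为说明而假设):

```python
def is_scene_switch(cur_bytes: int, prev_bytes: int, threshold: float = 2.0) -> bool:
    """若当前预编码帧字节数与前一预编码帧(目标预编码帧)字节数之比
    大于或等于预设阈值,则判定当前视频帧为场景切换帧。"""
    if prev_bytes <= 0:
        return False  # 无有效参考帧数据量时,保守地判定为非场景切换
    return cur_bytes / prev_bytes >= threshold
```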
进一步地,若当前视频帧之前的M个编码帧中有I帧,则将当前视频帧编码为P帧,或者,按照二次编码方式和预设编码模式对当前视频帧进行编码。
本申请实施例中,在一种可实施的方式中,二次编码方式可以是:从第一个视频帧开始,根据预设周期进行I帧的编码,将编码为I帧的视频帧之后的视频帧编码为P帧。例如预设周期是200帧,即,第一个视频帧编码为I帧,第201个视频帧编码为I帧,第401个视频帧编码为I帧,…,每个编码为I帧的视频帧之后的视频帧编码为P帧,两个I帧之间的编码帧为P帧。
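这种按固定周期插入I帧的二次编码方式可用如下Python草图示意(以0为第一个视频帧的序号,周期200沿用正文举例;仅为说明而假设):

```python
def periodic_frame_type(index: int, period: int = 200) -> str:
    """周期性二次编码方式的示意:从第一个视频帧(index=0)起,
    每隔 period 帧编码一个I帧,其余帧编码为P帧。"""
    return "I" if index % period == 0 else "P"
```

按此规则,第1帧(index=0)、第201帧(index=200)、第401帧(index=400)等被编码为I帧,两个I帧之间均为P帧。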
本申请实施例中,上述二次编码方式的实施方式为一种可实施的方式,可选的,二次编码方式还可以是其它实施方式,本申请实施例对此不做限制。
其中,预设编码模式可以为CQP模式。使用CQP模式进行预编码和二次编码,可以使得在帧间关联度大的情况下,当前视频帧编码后的帧数据量比较小,在帧间关联度小的情况下,当前视频帧编码后帧数据量比较大。
具体来说,本申请实施例中,若当前视频帧之前的M个编码帧中有I帧,则按照二次编码方式和预设编码模式对当前视频帧进行二次编码的这种方式中,具体可以是在按照二次编码方式和预设编码模式对当前视频帧进行二次编码之前,先对当前视频帧进行预编码,得到当前视频帧的预编码帧,然后确定当前视频帧之前的M个编码帧中是否有I帧,若没有I帧,才进行I帧的插入(即编码)。若确定当前视频帧之前的M个编码帧中有I帧,则按照二次编码方式和预设编码模式对当前视频帧进行二次编码,例如,若二次编码方式为上述可实施的周期性编码方式(即按照固定周期进行I帧和P帧的编码),则将当前视频帧编码为P帧。
在本实施例中,通过确定当前视频帧之前的M个编码帧中是否有I帧,来确定是否将当前视频帧编码为I帧,能够确保场景切换时编码得到I帧,确保视频在播放过程中不会出现I帧的丢失,提高视频质量。同时,避免在编码时插入过多的I帧,对编码效率造成影响。
可选的,在确定当前视频帧之前的M个编码帧中是否有I帧之前,本实施例的方法还可以包括:
统计当前视频帧之前的编码帧的数量,当当前视频帧之前的编码帧的数量到达M时,确认当前视频帧之前的M个编码帧中是否有I帧。
S103、若当前视频帧是场景切换帧,将当前视频帧编码为I帧,若当前视频帧不是场景切换帧,将当前视频帧编码为P帧。
本实施例中,实现了在场景切换处插入I帧,可以提高编码器的编码效率,同时能够确保视频画面在播放过程中的流畅性。例如在云游戏中,可使得云游戏业务在相同画质的情况下更省带宽并且能够避免游戏画面出现卡顿,提高游戏画面的流畅性。
本申请提供的编码方法,通过先按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧,若当前视频帧的前M个编码帧中没有I帧,且根据预编码帧和目标预编码帧确定当前视频帧是场景切换帧(即发生了视频帧场景切换),则将当前视频帧编码为I帧,否则将当前视频帧编码为P帧,其中的目标预编码帧为对当前视频帧的前一个视频帧预编码得到。从而,实现了在发生视频帧场景切换时插入I帧,插入I帧后再编码后续的P帧时,P帧占用的字节数减少,从而提高了编码效率。由于I帧的编码效率低于P帧的编码效率,本申请中是在确定当前视频帧的前M个编码帧中没有I帧并且发生视频帧场景切换时才插入I帧,可避免I帧插入过多,提高编码效率。
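将“前M个编码帧中是否有I帧”与“字节数比值是否达到阈值”两个条件合起来,单帧的编码类型判定可用如下Python草图示意(M=128、阈值2沿用正文举例,函数与参数名为说明而假设):

```python
def decide_frame_type(recent_types, cur_bytes, prev_bytes, m=128, threshold=2.0):
    """单帧编码类型判定的示意:
    若前 m 个编码帧中已有I帧,直接编码为P帧;
    否则根据预编码帧与目标预编码帧的字节数之比判断是否为场景切换帧,
    是则编码为I帧,否则编码为P帧。
    recent_types 为按时间顺序排列的已编码帧类型列表("I"/"P")。"""
    if "I" in recent_types[-m:]:
        return "P"
    if prev_bytes > 0 and cur_bytes / prev_bytes >= threshold:
        return "I"
    return "P"
```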
下面结合一个具体的实施例,对本申请提供的编码方法的技术方案进行详细说明。
图3为本申请实施例提供的一种编码方法的流程示意图,如图3所示,本实施例的方法可以包括:
S201、按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧。
其中,预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,第一个视频帧之后的视频帧预编码为P帧。
图4为本申请实施例提供的一种编码方法的流程框图,结合图4,当前视频帧为视频帧序列中的任一视频帧,具体地,S201可以包括:
S2011、根据预设的采样尺寸对当前视频帧进行下采样,得到下采样的视频帧。
其中,采样尺寸可以是预设尺寸的采样窗口、预设的采样面积、预设的采样宽度和预设的采样长度等。采样宽度例如可以为原始视频帧的宽度的二分之一,采样长度例如可以为原始视频帧的长度的二分之一,即下采样的视频帧的宽度为当前视频帧的宽度的二分之一,长度为当前视频帧的长度的二分之一。预设的采样宽度和预设的采样长度还可以是其他数值,本申请实施例对此不做限制。
S2012、按照预编码方式对下采样的视频帧进行预编码,得到当前视频帧的预编码帧。
通过对当前视频帧进行下采样,得到下采样的视频帧,再对下采样的视频帧进行预编码,可以节省预编码的时间,进而提高总体的编码效率。
S202、判断当前视频帧之前的M个编码帧中是否有I帧,M为预设正整数。
具体地,统计当前视频帧之前的编码帧的数量,当当前视频帧之前的编码帧的数量到达M时,判断当前视频帧之前的M个编码帧中是否有I帧。
若是,则执行S205,将当前视频帧编码为P帧。若否,则执行S203。本实施例中,例如M的取值等于128,在其他实施例中,M的取值也可以为其他正整数,例如64、256等。
本实施例中,预编码的编码模式以及二次编码(指的是S204或S205执行的编码)的编码模式例如可以是CQP模式,使用CQP模式编码,可以使得若帧间关联度大,则当前视频帧编码后的帧数据量比较小,反之,当前视频帧编码后帧数据量比较大。
S203、进行场景切换帧的判断,确定预编码帧的字节数和目标预编码帧的字节数的比值是否大于或等于预设阈值。
其中,目标预编码帧为对当前视频帧的前一个视频帧预编码得到,若是,则确定当前视频帧是场景切换帧,则执行S204,若否,则确定当前视频帧不是场景切换帧,则执行S205。其中,预设阈值例如为2。
S204、将当前视频帧编码为I帧。
S205、将当前视频帧编码为P帧。
通过S204或S205对当前视频帧编码后,得到编码后的码流。
本实施例提供的编码方法,通过进行预编码和二次编码,预编码的编码帧用来判断是否发生场景切换,若在确定当前视频帧的前M个编码帧中没有I帧且发生场景切换则将当前视频帧编码为I帧,否则将当前视频帧编码为P帧。从而,实现了在发生视频帧场景切换时插入I帧,插入I帧后再编码后续的P帧时,P帧占用的字节数减少,从而提高了编码效率。由于I帧的编码效率低于P帧的编码效率,本申请中是在确定当前视频帧的前M个编码帧中没有I帧并且发生视频帧场景切换时才插入I帧,可避免I帧插入过多,提高编码效率。
图5为本申请实施例提供的一种实时通信方法的流程图,如图5所示,本实施例的方法可以包括:
S301、服务器对实时生成的视频进行视频图像采集,得到视频流。
其中,视频流包括的多个视频帧中,每一个视频帧包括虚拟游戏画面组成的图像。
S302、服务器按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧,预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,第一个视频帧之后的视频帧预编码为P帧。
S303、若当前视频帧之前的M个编码帧中没有I帧,服务器根据预编码帧和目标预编码帧,确定当前视频帧是否是场景切换帧,场景切换帧为相对前一个视频帧发生了场景切换的视频帧,目标预编码帧为对当前视频帧的前一个视频帧预编码得到,一个编码帧为对当前视频帧之前的一个视频帧编码得到,M为预设正整数。
S304、若当前视频帧是场景切换帧,服务器将当前视频帧编码为I帧,若当前视频帧不是场景切换帧,服务器将当前视频帧编码为P帧,得到当前视频帧的编码帧。
本实施例中,服务器对当前视频帧编码的具体实施方式可参见图2所示实施例中的描述,此处不再赘述。
S305、服务器根据当前视频帧的编码帧和当前视频帧之前的多个编码帧,得到码流。
S306、服务器向客户端发送码流。
S307、客户端根据码流显示视频画面。
本实施例提供的实时通信方法,通过服务器对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧,若当前视频帧的前M个编码帧中没有I帧,且根据预编码帧和目标预编码帧确定当前视频帧是场景切换帧(即发生了视频帧场景切换),则将当前视频帧编码为I帧,否则将当前视频帧编码为P帧,其中的目标预编码帧为对当前视频帧的前一个视频帧预编码得到。从而,实现了在发生视频帧场景切换时插入I帧,插入I帧后再编码后续的P帧时,P帧占用的字节数减少,从而提高了编码效率。由于I帧的编码效率低于P帧的编码效率,本申请中是在确定当前视频帧的前M个编码帧中没有I帧并且发生视频帧场景切换时才插入I帧,可避免I帧插入过多,提高编码效率。
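将上述逐帧判定串成实时通信中的整条流水线,可得到如下Python草图(frame_sizes 为各帧预编码后的字节数序列;M与阈值为可调参数,此处仅为说明而假设):

```python
def encode_stream(frame_sizes, m=128, threshold=2.0):
    """逐帧决定I/P类型的整体示意:第一个视频帧编码为I帧,
    其后各帧在“前 m 个编码帧中无I帧且字节数比值达到阈值”时插入I帧,
    否则编码为P帧;返回各帧的编码类型序列。"""
    types = []
    for i, size in enumerate(frame_sizes):
        if i == 0:
            types.append("I")
        elif "I" not in types[-m:] and frame_sizes[i - 1] > 0 \
                and size / frame_sizes[i - 1] >= threshold:
            types.append("I")
        else:
            types.append("P")
    return types
```

例如取较小的 m=2 便于观察:encode_stream([100, 50, 50, 200], m=2) 返回 ["I", "P", "P", "I"],即在字节数突增的第4帧处插入了I帧。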
应理解的是,在云游戏场景中,只有解码端即上述终端设备对上述编码后的码流具有解码能力时,上述的图像编码方法才有实际意义,下面将提供一种最优解码配置的获取方法。
图6为本申请实施例提供的一种最优解码配置的获取方法的流程图,如图6所示,该方法包括:
S401、云服务器向终端设备发送解码能力问询请求。
S402、云服务器接收终端设备的解码能力响应数据,解码能力响应数据包括:终端设备的解码能力。
S403、云服务器根据终端设备的解码能力、云游戏类型和当前网络状态确定最优解码配置。
S404、云服务器向终端设备发送最优解码配置。
S405、终端设备通过最优解码配置对视频流的码流进行解码。
可选地,图7为本申请实施例提供的一种最优解码配置的获取方法的流程图,如图7所示,云服务器可以通过安装在终端设备的客户端向终端设备发送解码能力请求,终端设备也可以通过该客户端向云服务器返回解码能力响应。其中,在云游戏场景中,该客户端可以是云游戏客户端。
可选地,解码能力问询请求用于请求获取表示终端设备的解码能力的反馈数据。
可选地,解码能力请求包括以下至少一项,但不限于此:协议版本号、具体解码协议查询。
可选地,协议版本号指的是云服务器支持的最低协议版本,该协议可以是解码协议。
可选地,具体解码协议查询指的是云服务器所要查询的解码协议,例如是视频解码协议H264或者H265等。
示例性地,解码能力请求的代码实现可以如下:
[codec_ability];编解码能力
version=1.0;云服务器支持的最低协议版本
type=16,17;查询H264,H265能力
关于该代码中各个数据结构的解释可参考下面的表1,本申请对此不再赘述。
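上述 ini 风格的解码能力请求,可用如下Python草图解析(';' 之后为注释;该解析器仅按正文示例的格式书写,为说明而假设,并非任何现有协议库的接口):

```python
def parse_ability_request(text: str) -> dict:
    """解析解码能力请求文本的示意实现:
    去掉 ';' 之后的注释,跳过节名行 [codec_ability],
    其余按 key=value 解析,type 字段再拆分为整数列表。"""
    fields = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()  # 去掉 ';' 注释
        if not line or line.startswith("["):  # 跳过空行与节名
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    if "type" in fields:
        fields["type"] = [int(t) for t in fields["type"].split(",")]
    return fields
```

例如,对正文中的请求样例解析后可得 version 为 "1.0",type 为 [16, 17],分别对应H264与H265。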
其中,终端设备解码能力的数据结构可以如表1所示:
表1终端设备的解码能力的数据结构
(表1的具体内容在原始公开文本中以图像形式给出,此处未予复现)
其中,在各解码协议定义见表2:
表2解码协议
解码协议 枚举定义
H264 16
H265 17
AV1 48
终端设备在各解码协议支持的Profile定义见表3:
表3解码协议支持的Profile定义
(表3的具体内容在原始公开文本中以图像形式给出,此处未予复现)
终端设备在各解码协议支持的Level定义见表4:
表4解码协议支持的Level定义
(表4的具体内容在原始公开文本中以图像形式给出,此处未予复现)
终端设备所支持的Profile和Level以二元组的方式列出,如设备A支持H264能力:(Baseline,Level51),(Main,Level51),(High,Level51)。
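以 (Profile, Level) 二元组集合表示终端能力时,能力查询可用如下Python草图示意(集合内容取自正文中设备A的例子;函数名为说明而假设):

```python
# 设备A支持的H264能力,以 (Profile, Level) 二元组集合表示
H264_CAPS = {("Baseline", "Level51"), ("Main", "Level51"), ("High", "Level51")}

def supports(profile: str, level: str, caps=H264_CAPS) -> bool:
    """判断终端是否支持给定的 (Profile, Level) 组合。"""
    return (profile, level) in caps
```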
可选地,解码能力响应除包括终端设备的解码能力之外,还可以包括:对于云服务器所要查询的解码协议是否查询成功的标识、终端设备支持的协议版本号。
可选地,若对于云服务器所要查询的解码协议查询成功,则对于云服务器所要查询的解码协议是否查询成功的标识可以用0表示,若对于云服务器所要查询的解码协议查询失败,则对于云服务器所要查询的解码协议是否查询成功的标识可以用错误码表示,如001等。
可选地,协议版本号指的是终端设备支持的最低协议版本,该协议可以是解码协议。
可选地,终端设备的解码能力包括以下至少一项,但不限于此:终端设备支持的解码协议类型、该解码协议支持的Profile、Level以及性能等。
示例1,解码能力响应的代码实现可以如下:
(示例1的代码在原始公开文本中以图像形式给出,此处未予复现)
关于该代码中各个数据结构的解释可参考上面的表1,本申请对此不再赘述。
示例2,若终端设备只支持部分解码协议,则返回支持的解码协议信息,这种情况的解码能力响应的代码实现可以如下:
(示例2的代码在原始公开文本中以图像形式给出,此处未予复现)
关于该代码中各个数据结构的解释可参考上面的表1,本申请对此不再赘述。
示例3,若终端设备不支持解码协议,则返回codecs=0,这种情况的解码能力响应的代码实现可以如下:
[codec_ability];编解码能力
state=0;查询成功返回状态码0
version=1.0;终端设备协议版本
codecs=0;支持0个硬件codec
关于该代码中各个数据结构的解释可参考上面的表1,本申请对此不再赘述。
示例4,若对终端设备的解码能力请求失败,则返回具体的错误码,这种情况的解码能力响应的代码实现可以如下:
[codec_ability];编解码能力
state=-1;查询失败返回状态码-1
version=0.9;终端设备协议版本
关于该代码中各个数据结构的解释可参考上面的表1,本申请对此不再赘述。
可选地,云游戏类型越复杂,云服务器在终端设备的解码能力范围之内选择的能力越高,如在上述示例1中,选择profile3以及performances3,其中,云服务器可以按照云游戏类型与终端设备的解码能力之间的映射关系,选择最优解码配置,也可以按照其他选择规则来选择最优解码配置。
可选地,网络状态越差,云服务器可以在终端设备的解码能力范围之内选择越高的能力,如在上述示例1中,选择profile3以及performances3,其中,云服务器可以按照网络状态与终端设备的解码能力之间的映射关系,选择最优解码配置,也可以按照其他选择规则来选择最优解码配置。
可选地,云服务器可以按照云游戏类型、网络状态与终端设备的解码能力之间的映射关系,选择最优解码配置,也可以按照其他选择规则来选择最优解码配置。
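“游戏越复杂、网络越差,则在终端解码能力范围内选择越高能力”的选择思路,可用如下Python草图示意(profiles、performances 按能力从低到高排列;归一化打分与取最大值的映射方式均为本文假设,实际映射关系由具体业务决定):

```python
def choose_decode_config(profiles, performances, game_complexity, network_badness):
    """最优解码配置选择的示意:
    game_complexity、network_badness 取 0~1,数值越大分别表示
    云游戏越复杂、网络状态越差;任一维度要求高,则整体选择越高的能力档位。"""
    score = max(game_complexity, network_badness)
    idx = min(int(score * len(profiles)), len(profiles) - 1)
    return profiles[idx], performances[idx]
```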
总之,本申请对如何确定最优解码配置不做限制。
综上,通过本实施例提供的技术方案,使得终端设备通过最优解码配置对视频流的码流进行解码,从而可以提高解码效果。
图8为本申请实施例提供的一种编码装置的结构示意图,如图8所示,该编码装置可以包括:第一编码模块11、确定模块12和第二编码模块13,
其中,第一编码模块11用于按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧,预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,第一个视频帧之后的视频帧预编码为P帧;
确定模块12用于在当前视频帧之前的M个编码帧中没有I帧时,根据预编码帧和目标预编码帧,确定当前视频帧是否是场景切换帧,场景切换帧为相对前一个视频帧发生了场景切换的视频帧,目标预编码帧为对当前视频帧的前一个视频帧预编码得到,一个编码帧为对当前视频帧之前的一个视频帧编码得到,M为预设正整数;以及
第二编码模块13用于在当前视频帧是场景切换帧时,将当前视频帧编码为I帧,在 当前视频帧不是场景切换帧时,将当前视频帧编码为P帧。
可选的,第一编码模块11还用于:若当前视频帧之前的M个编码帧中有I帧,则将当前视频帧编码为P帧,或者,按照二次编码方式和预设编码模式对当前视频帧进行编码。
可选的,第一编码模块11用于:根据预设的采样宽度和预设的采样长度对当前视频帧进行下采样,得到下采样的视频帧;按照预编码方式对下采样的视频帧进行预编码,得到当前视频帧的预编码帧。
可选的,确定模块12还用于:统计当前视频帧之前的编码帧的数量;当当前视频帧之前的编码帧的数量到达M时,确认当前视频帧之前的M个编码帧中是否有I帧。
可选的,M的取值包括128。
可选的,确定模块12用于:根据预编码帧的大小和目标预编码帧的大小,确定当前视频帧是否是场景切换帧。
可选的,确定模块12用于:根据预编码帧的字节数和目标预编码帧的字节数,确定当前视频帧是否是场景切换帧。
可选的,确定模块12具体用于:若预编码帧的字节数和目标预编码帧的字节数的比值大于或等于预设阈值,则确定当前视频帧是场景切换帧;若预编码帧的字节数和目标预编码帧的字节数的比值小于预设阈值,则确定当前视频帧不是场景切换帧。
可选的,预设编码模式为固定量化参数CQP模式,预编码的编码模式为CQP模式。
可选的,二次编码方式包括:从第一个视频帧开始,根据预设周期进行I帧的编码,将编码为I帧的视频帧之后的视频帧编码为P帧。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图8所示的编码装置可以执行图2对应的方法实施例,并且编码装置中的各个模块的前述和其它操作和/或功能分别为了实现图2对应的方法实施例中的相应流程,为了简洁,在此不再赘述。
图9为本申请实施例提供的一种实时通信装置的结构示意图,如图9所示,该实时通信装置可以包括:采集模块21、第一编码模块22、确定模块23、第二编码模块24和发送模块25。
其中,采集模块21用于对实时生成的视频进行视频图像采集,得到视频流;
第一编码模块22用于按照预编码方式对视频流中的当前视频帧进行预编码,得到当前视频帧的预编码帧,预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,第一个视频帧之后的视频帧预编码为P帧;
确定模块23用于若当前视频帧之前的M个编码帧中没有I帧,根据预编码帧和目标预编码帧,确定当前视频帧是否是场景切换帧,场景切换帧为相对前一个视频帧发生了场景切换的视频帧,目标预编码帧为对当前视频帧的前一个视频帧预编码得到,一个编码帧为对当前视频帧之前的一个视频帧编码得到,M为预设正整数;
第二编码模块24用于若当前视频帧是场景切换帧,将当前视频帧编码为I帧,若当前视频帧不是场景切换帧,将当前视频帧编码为P帧,得到当前视频帧的编码帧;
第二编码模块24还用于根据当前视频帧的编码帧和当前视频帧之前的多个编码帧,得到码流;
发送模块25用于向客户端发送码流,以使客户端根据码流显示视频画面。
可选的,多个视频帧中的每一个视频帧包括虚拟游戏画面组成的图像。
可选的,第二编码模块24还用于:若当前视频帧之前的M个编码帧中有I帧,则将当前视频帧编码为P帧,或者,按照二次编码方式和预设编码模式对当前视频帧进行编码。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图9所示的编码装置可以执行图5对应的方法实施例,并且编码装置中的各个模块的前述和其它操作和/或功能分别为了实现图5对应的方法实施例中的相应流程,为了简洁,在此不再赘述。
上文中结合附图从功能模块的角度描述了本申请实施例的编码装置。应理解,该功能模块可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件模块组合实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件编码处理器执行完成,或者用编码处理器中的硬件及软件模块组合执行完成。可选地,软件模块可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图10是本申请实施例提供的电子设备的示意性框图。该电子设备可以是上述方法实施例中的服务器。
如图10所示,该电子设备可包括:
存储器210和处理器220,该存储器210用于存储计算机可读指令,并将该可读指令代码传输给该处理器220。换言之,该处理器220可以从存储器210中调用并运行计算机可读指令,以实现本申请实施例中的方法。
例如,该处理器220可用于根据该计算机可读指令中的指令执行上述方法实施例。
在本申请的一些实施例中,该处理器220可以包括但不限于:
通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
在本申请的一些实施例中,该存储器210包括但不限于:
易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory, RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在本申请的一些实施例中,该计算机可读指令可以被分割成一个或多个模块,该一个或者多个模块被存储在该存储器210中,并由该处理器220执行,以完成本申请提供的方法。该一个或多个模块可以是能够完成特定功能的一系列计算机可读指令段,该指令段用于描述该计算机可读指令在该电子设备中的执行过程。
如图10所示,该电子设备还可包括:收发器230,该收发器230可连接至该处理器220或存储器210。
其中,处理器220可以控制该收发器230与其他设备进行通信,具体地,可以向其他设备发送信息或数据,或接收其他设备发送的信息或数据。收发器230可以包括发射机和接收机。收发器230还可以进一步包括天线,天线的数量可以为一个或多个。
应当理解,该电子设备中的各个组件通过总线系统相连,其中,总线系统除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。
本申请还提供了一种计算机存储介质,其上存储有计算机可读指令,该计算机可读指令被计算机执行时使得该计算机能够执行上述方法实施例的方法。或者说,本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得计算机执行上述方法实施例的方法。
当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机可读指令。在计算机上加载和执行该计算机可读指令时,全部或部分地产生按照本申请实施例该的流程或功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。该计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的模块及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个模块或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。例如,在本申请各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (20)

  1. 一种编码方法,其特征在于,由服务器执行,包括:
    按照预编码方式对视频流中的当前视频帧进行预编码,得到所述当前视频帧的预编码帧,所述预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,将所述第一个视频帧之后的视频帧预编码为P帧;
    若所述当前视频帧之前的M个编码帧中没有I帧,根据所述预编码帧和目标预编码帧,确定所述当前视频帧是否是场景切换帧,所述场景切换帧为相对前一个视频帧发生了场景切换的视频帧,所述目标预编码帧为对所述当前视频帧的前一个视频帧预编码得到,所述编码帧为对所述当前视频帧之前的视频帧编码得到,所述M为预设正整数;以及
    若所述当前视频帧是场景切换帧,将所述当前视频帧编码为I帧,若所述当前视频帧不是所述场景切换帧,将所述当前视频帧编码为P帧。
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    若所述当前视频帧之前的M个编码帧中有I帧,则将所述当前视频帧编码为P帧。
  3. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    若所述当前视频帧之前的M个编码帧中有I帧,则按照二次编码方式和预设编码模式对所述当前视频帧进行编码。
  4. 根据权利要求1所述的方法,其特征在于,所述按照预编码方式对所述当前视频帧进行预编码,得到所述当前视频帧的预编码帧,包括:
    根据预设的采样尺寸对所述当前视频帧进行下采样,得到下采样的视频帧;以及
    按照所述预编码方式对所述下采样的视频帧进行预编码,得到所述当前视频帧的预编码帧。
  5. 根据权利要求1所述的方法,其特征在于,在确定所述当前视频帧之前的M个编码帧中是否有I帧之前,所述方法还包括:
    统计所述当前视频帧之前的编码帧的数量;以及
    当所述当前视频帧之前的编码帧的数量到达M时,确认所述当前视频帧之前的M个编码帧中是否有I帧。
  6. 根据权利要求1所述的方法,其特征在于,所述M的取值包括128。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述预编码帧和目标预编码帧,确定所述当前视频帧是否是场景切换帧,包括:
    根据所述预编码帧所包含的数据量大小和所述目标预编码帧所包含的数据量大小,确定所述当前视频帧是否是场景切换帧。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述预编码帧所包含的数据量大小和所述目标预编码帧所包含的数据量大小,确定所述当前视频帧是否是场景切换帧,包括:
    根据所述预编码帧的字节数和所述目标预编码帧的字节数,确定所述当前视频帧是否是场景切换帧。
  9. 根据权利要求8所述的方法,其特征在于,所述根据所述预编码帧的字节数和所述目标预编码帧的字节数,确定所述当前视频帧是否是场景切换帧,包括:
    若所述预编码帧的字节数和所述目标预编码帧的字节数的比值大于或等于预设阈值,则确定所述当前视频帧是场景切换帧;以及
    若所述预编码帧的字节数和所述目标预编码帧的字节数的比值小于所述预设阈值,则确定所述当前视频帧不是场景切换帧。
  10. 根据权利要求3所述的方法,其特征在于,所述预设编码模式为固定量化参数CQP模式,所述预编码的编码模式为所述CQP模式。
  11. 根据权利要求3所述的方法,其特征在于,所述二次编码方式包括:
    从第一个视频帧开始,根据预设周期进行I帧的编码,将编码为I帧的视频帧之后的视频帧编码为P帧。
  12. 根据权利要求1至11中任一项所述的方法,其特征在于,所述方法应用于云游戏场景,所述方法还包括:
    向终端设备发送解码能力问询请求;
    接收所述终端设备的解码能力响应数据,所述解码能力响应数据包括:所述终端设备的解码能力;
    根据所述终端设备的解码能力、云游戏类型和当前网络状态,确定最优解码配置;以及
    向所述终端设备发送所述最优解码配置,以使所述终端设备通过所述最优解码配置对所述视频流的码流进行解码。
  13. 一种实时通信方法,其特征在于,由服务器执行,包括:
    对实时生成的视频进行视频图像采集,得到视频流;
    按照预编码方式对所述视频流中的当前视频帧进行预编码,得到所述当前视频帧的预编码帧,所述预编码方式,用于将一个所述视频流中的第一个视频帧预编码为I帧,所述第一个视频帧之后的视频帧预编码为P帧;
    若所述当前视频帧之前的M个编码帧中没有I帧,根据所述预编码帧和目标预编码帧,确定所述当前视频帧是否是场景切换帧,所述场景切换帧为相对前一个视频帧发生了场景切换的视频帧,所述目标预编码帧为对所述当前视频帧的前一个视频帧预编码得到,一个所述编码帧为对所述当前视频帧之前的一个视频帧编码得到,所述M为预设正整数;
    若所述当前视频帧是场景切换帧,将所述当前视频帧编码为I帧,若所述当前视频帧不是所述场景切换帧,将所述当前视频帧编码为P帧,得到所述当前视频帧的编码帧;
    根据所述当前视频帧的编码帧和所述当前视频帧之前的多个编码帧,得到码流;以及
    向客户端发送所述码流,以使所述客户端根据所述码流显示视频画面。
  14. 根据权利要求13所述的方法,其特征在于,所述多个视频帧中的每一个视频帧包括虚拟游戏画面组成的图像。
  15. 根据权利要求13所述的方法,其特征在于,若所述当前视频帧之前的M个编码帧中有I帧,则将所述当前视频帧编码为P帧,或者,按照二次编码方式和预设编码模式对所述当前视频帧进行编码。
  16. 一种编码装置,其特征在于,包括:
    第一编码模块,用于按照预编码方式对视频流中的当前视频帧进行预编码,得到所述当前视频帧的预编码帧,所述预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,所述第一个视频帧之后的视频帧预编码为P帧;
    确定模块,用于在所述当前视频帧之前的M个编码帧中没有I帧时,根据所述预编码帧和目标预编码帧,确定所述当前视频帧是否是场景切换帧,所述场景切换帧为相对前一个视频帧发生了场景切换的视频帧,所述目标预编码帧为对所述当前视频帧的前一个视频帧预编码得到,一个所述编码帧为对所述当前视频帧之前的一个视频帧编码得到,所述M为预设正整数;以及
    第二编码模块,用于在所述当前视频帧是场景切换帧时,将所述当前视频帧编码为I帧,在所述当前视频帧不是所述场景切换帧时,将所述当前视频帧编码为P帧。
  17. 一种实时通信装置,其特征在于,包括:
    采集模块,用于对实时生成的视频进行视频图像采集,得到视频流;
    第一编码模块,用于按照预编码方式对所述视频流中的当前视频帧进行预编码,得到所述当前视频帧的预编码帧,所述预编码方式,用于将一个视频流中的第一个视频帧预编码为I帧,将所述第一个视频帧之后的视频帧预编码为P帧;
    确定模块,用于若所述当前视频帧之前的M个编码帧中没有I帧,根据所述预编码帧和目标预编码帧,确定所述当前视频帧是否是场景切换帧,所述场景切换帧为相对前一个视频帧发生了场景切换的视频帧,所述目标预编码帧为对所述当前视频帧的前一个视频帧预编码得到,所述编码帧为对所述当前视频帧之前的视频帧编码得到,所述M为预设正整数;
    第二编码模块,用于若所述当前视频帧是场景切换帧,将所述当前视频帧编码为I帧,若所述当前视频帧不是所述场景切换帧,将所述当前视频帧编码为P帧,得到所述当前视频帧的编码帧;
    根据所述当前视频帧的编码帧和所述当前视频帧之前的多个编码帧,得到码流;以及
    发送模块,用于向客户端发送所述码流,以使所述客户端根据所述码流显示视频画面。
  18. 一种电子设备,其特征在于,包括:
    处理器和存储器,所述存储器用于存储计算机可读指令,所述处理器用于调用并运行所述存储器中存储的计算机可读指令,以执行权利要求1至15中任一项所述的方法。
  19. 一种计算机可读存储介质,其特征在于,用于存储计算机可读指令,所述计算机可读指令使得计算机执行如权利要求1至15中任一项所述的方法。
  20. 一种计算机程序产品,包括计算机可读指令,该计算机可读指令使得计算机执行如权利要求1至15中任一项所述的方法。
PCT/CN2022/137893 2022-01-27 2022-12-09 编码方法、实时通信方法、装置、设备及存储介质 WO2023142716A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/514,741 US20240098310A1 (en) 2022-01-27 2023-11-20 Encoding method, real-time communication method, apparatus, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210101697.6 2022-01-27
CN202210101697.6A CN116567228A (zh) 2022-01-27 2022-01-27 编码方法、实时通信方法、装置、设备及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/514,741 Continuation US20240098310A1 (en) 2022-01-27 2023-11-20 Encoding method, real-time communication method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023142716A1 true WO2023142716A1 (zh) 2023-08-03

Family

ID=87470351

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137893 WO2023142716A1 (zh) 2022-01-27 2022-12-09 编码方法、实时通信方法、装置、设备及存储介质

Country Status (3)

Country Link
US (1) US20240098310A1 (zh)
CN (1) CN116567228A (zh)
WO (1) WO2023142716A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116801034A (zh) * 2023-08-25 2023-09-22 海马云(天津)信息技术有限公司 客户端保存音视频数据的方法和装置
CN117354524A (zh) * 2023-12-04 2024-01-05 腾讯科技(深圳)有限公司 编码器编码性能测试方法、装置、设备及计算机介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277519A (zh) * 2017-06-30 2017-10-20 武汉斗鱼网络科技有限公司 一种判断视频帧的帧类型的方法及电子设备
CN109413422A (zh) * 2018-11-05 2019-03-01 建湖云飞数据科技有限公司 结合图像质量和运动幅度的自适应插入i帧方法
CN109862359A (zh) * 2018-12-29 2019-06-07 北京数码视讯软件技术发展有限公司 基于分层b帧的码率控制方法、装置和电子设备
CN111263154A (zh) * 2020-01-22 2020-06-09 腾讯科技(深圳)有限公司 一种视频数据处理方法、装置及存储介质
CN112019850A (zh) * 2020-08-27 2020-12-01 广州市百果园信息技术有限公司 基于场景切换的图像组划分方法、视频编码方法及装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277519A (zh) * 2017-06-30 2017-10-20 武汉斗鱼网络科技有限公司 一种判断视频帧的帧类型的方法及电子设备
CN109413422A (zh) * 2018-11-05 2019-03-01 建湖云飞数据科技有限公司 结合图像质量和运动幅度的自适应插入i帧方法
CN109862359A (zh) * 2018-12-29 2019-06-07 北京数码视讯软件技术发展有限公司 基于分层b帧的码率控制方法、装置和电子设备
CN111263154A (zh) * 2020-01-22 2020-06-09 腾讯科技(深圳)有限公司 一种视频数据处理方法、装置及存储介质
CN112019850A (zh) * 2020-08-27 2020-12-01 广州市百果园信息技术有限公司 基于场景切换的图像组划分方法、视频编码方法及装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116801034A (zh) * 2023-08-25 2023-09-22 海马云(天津)信息技术有限公司 客户端保存音视频数据的方法和装置
CN116801034B (zh) * 2023-08-25 2023-11-03 海马云(天津)信息技术有限公司 客户端保存音视频数据的方法和装置
CN117354524A (zh) * 2023-12-04 2024-01-05 腾讯科技(深圳)有限公司 编码器编码性能测试方法、装置、设备及计算机介质
CN117354524B (zh) * 2023-12-04 2024-04-09 腾讯科技(深圳)有限公司 编码器编码性能测试方法、装置、设备及计算机介质

Also Published As

Publication number Publication date
CN116567228A (zh) 2023-08-08
US20240098310A1 (en) 2024-03-21

Similar Documents

Publication Publication Date Title
WO2023142716A1 (zh) 编码方法、实时通信方法、装置、设备及存储介质
US10321138B2 (en) Adaptive video processing of an interactive environment
CN111277826B (zh) 一种视频数据处理方法、装置及存储介质
WO2021147448A1 (zh) 一种视频数据处理方法、装置及存储介质
CN112333448B (zh) 视频编码、解码方法和装置、电子设备和存储介质
CN112351285B (zh) 视频编码、解码方法和装置、电子设备和存储介质
WO2021057705A1 (zh) 视频编解码方法和相关装置
WO2021238546A1 (zh) 视频编码方法、视频播放方法、相关设备及介质
WO2021052500A1 (zh) 视频图像的传输方法、发送设备、视频通话方法和设备
WO2024169391A1 (zh) 一种视频数据处理方法、装置、计算机设备以及存储介质
CN112866746A (zh) 一种多路串流云游戏控制方法、装置、设备及存储介质
WO2021057686A1 (zh) 视频解码方法和装置、视频编码方法和装置、存储介质及电子装置
US20230388526A1 (en) Image processing method and apparatus, computer device, storage medium and program product
WO2023169424A1 (zh) 编解码方法及电子设备
WO2023071469A1 (zh) 视频处理方法、电子设备及存储介质
US12034944B2 (en) Video encoding method and apparatus, video decoding method and apparatus, electronic device and readable storage medium
WO2011106937A1 (zh) 一种图像编码方法和装置
CN115866297A (zh) 视频处理方法、装置、设备及存储介质
WO2021057478A1 (zh) 视频编解码方法和相关装置
WO2012154157A1 (en) Apparatus and method for dynamically changing encoding scheme based on resource utilization
WO2023142662A1 (zh) 图像编码方法、实时通信方法、设备、存储介质及程序产品
CN113259673B (zh) 伸缩性视频编码方法、装置、设备及存储介质
US12022088B2 (en) Method and apparatus for constructing motion information list in video encoding and decoding and device
WO2023130893A1 (zh) 流媒体传输方法、装置、电子设备及计算机可读存储介质
CN112351284B (zh) 视频编码、解码方法和装置、电子设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923505

Country of ref document: EP

Kind code of ref document: A1

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112024013435

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 2022923505

Country of ref document: EP

Effective date: 20240827