WO2023142662A1 - Image encoding method, real-time communication method, device, storage medium and program product - Google Patents


Info

Publication number
WO2023142662A1
Authority
WO
WIPO (PCT)
Prior art keywords: frame, image frame, IDR, image, encoding
Prior art date
Application number
PCT/CN2022/135614
Other languages
English (en)
French (fr)
Inventor
曹健
曹洪彬
杨小祥
陈思佳
黄永铖
张佳
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to US18/334,441 priority Critical patent/US20230328259A1/en
Publication of WO2023142662A1 publication Critical patent/WO2023142662A1/zh

Classifications

    • All classes fall under H ELECTRICITY → H04 ELECTRIC COMMUNICATION TECHNIQUE → H04N PICTORIAL COMMUNICATION, e.g. TELEVISION → H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals. The leaf classes are:
    • H04N19/142 — Detection of scene cut or scene change
    • H04N19/159 — Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/107 — Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/172 — The coding unit being an image region, the region being a picture, frame or field
    • H04N19/184 — The coding unit being bits, e.g. of the compressed video stream
    • H04N19/179 — The coding unit being a scene or a shot

Definitions

  • The embodiments of the present application relate to the technical field of image processing, and in particular to an image encoding method, a real-time communication method, a device, a storage medium, and a program product.
  • Video and image processing based on cloud scenarios is now common.
  • The process is roughly as follows: the cloud server generates the video, captures video images, and encodes the captured images to obtain a code stream.
  • The code stream is sent to the terminal device, which decodes it and displays the video image according to the decoding result.
  • In this case, the cloud server can encode the image frame as an Instantaneous Decoding Refresh (IDR) frame, because an IDR frame uses only intra-frame prediction.
  • In the related art, the cloud server calculates the similarity between an image frame and its previous frame; if the similarity falls below a preset threshold (i.e., the frames differ substantially), the image frame is encoded as an IDR frame. In other words, every image frame goes through a preprocessing step before encoding.
  • This preprocessing may take far longer than the encoding itself, introducing a large delay into the whole image processing pipeline; this is especially problematic for cloud scenarios with high real-time requirements, where such delay degrades the user experience.
  • Embodiments of the present application provide an image encoding method, a real-time communication method, an electronic device, a computer-readable storage medium, and a computer program product that reduce the delay of the whole image processing pipeline. This matters most for cloud scenarios with high real-time requirements, and the reduced delay improves the user experience.
  • An embodiment of the present application provides an image encoding method, including: acquiring a first image frame in a video stream; encoding the first image frame into a first non-IDR frame; and, if it is determined from the first non-IDR frame that a scene switch has occurred, encoding the next image frame after the first image frame as an IDR frame.
  • An embodiment of the present application provides a real-time communication method, including: acquiring user operation information sent by a terminal device; generating a video stream in real time according to the user operation information; acquiring a first image frame in the video stream; encoding the first image frame into a first non-IDR frame; if it is determined from the first non-IDR frame that a scene switch has occurred, encoding the next image frame after the first image frame as an IDR frame to obtain the corresponding encoded code stream; and sending that code stream to the terminal device.
  • An embodiment of the present application provides an image encoding device, including an acquisition module, an encoding module, and a judgment module: the acquisition module is configured to acquire a first image frame in a video stream; the encoding module is configured to encode the first image frame into a first non-IDR frame, and, if it is determined from the first non-IDR frame that a scene switch has occurred, to encode the next image frame after the first image frame as an IDR frame.
  • An embodiment of the present application provides a real-time communication device, including a processing module and a communication module: the communication module is configured to obtain user operation information sent by a terminal device; the processing module is configured to generate a video stream in real time according to the user operation information, acquire the first image frame in the video stream, and encode it as the first non-IDR frame; if it is determined from the first non-IDR frame that a scene switch has occurred, the next image frame is encoded as an IDR frame to obtain the corresponding code stream; the communication module is further configured to send that code stream to the terminal device.
  • An embodiment of the present application provides an electronic device, including a processor and a memory, where the memory is configured to store a computer program and the processor is configured to invoke and run the computer program stored in the memory to execute the method provided by the embodiments of the present application.
  • An embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method provided in the embodiment of the present application is implemented.
  • An embodiment of the present application provides a computer program product, including computer program instructions; when the instructions are executed by a processor, the method provided in the embodiments of the present application is implemented.
  • An embodiment of the present application provides a computer program; when the program is executed by a processor, the method provided in the embodiments of the present application is implemented.
  • In the embodiments of the present application, inter-frame scene switching is evaluated during encoding to determine whether a scene switch has occurred, and when one is detected, the next image frame is encoded as an IDR frame. Compared with the related art, which runs a preprocessing step before encoding every image frame, the embodiments need no such preprocessing: whether a scene switch occurred is judged from the encoding result itself, and if so, the next image frame is encoded as an IDR frame. This reduces the delay of the whole encoding pipeline and improves encoding efficiency, which matters most for cloud scenarios with high real-time requirements; the reduced image processing delay improves the user experience.
  • FIG. 1 is a schematic diagram of a cloud game scene provided by an embodiment of the present application.
  • FIG. 2 is a flow chart of an image encoding method provided by an embodiment of the present application.
  • FIG. 3 is a flow chart of another image encoding method provided by an embodiment of the present application.
  • FIG. 4 is a flow chart of another image encoding method provided by an embodiment of the present application.
  • FIG. 5 is a flow chart of another image encoding method provided by an embodiment of the present application.
  • FIG. 6 is a flow chart of another image encoding method provided by an embodiment of the present application.
  • FIG. 7 is an interactive flow chart of a real-time communication method provided by an embodiment of the present application.
  • FIG. 8 is a flow chart of a method for acquiring a target decoding configuration provided by an embodiment of the present application.
  • FIG. 9 is a flow chart of an encoding and decoding coordination method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an image encoding device provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a real-time communication device provided by an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • Video encoding: converting a file in an original video format into another video format through compression; the converted data is called a code stream.
  • Video decoding: the reverse process of video encoding.
  • Intra-frame prediction: predicting the current pixel using already-encoded pixels in the same image frame, without referring to other encoded image frames.
  • Inter-frame prediction: predicting the current pixel using pixels in other encoded image frames, i.e., encoded reference frames are required.
  • IDR frame: a coded frame type defined in video coding technology.
  • An IDR frame uses only intra-frame prediction, so the decoder can decode its content independently, without information from other frames.
  • An IDR frame is generally used as a reference frame for subsequent frames and as an entry point for code stream switching.
  • P frame: forward predictive coded frame.
  • A P frame is predicted from the coded frames before it.
  • The encoder exploits the information or data shared between the current P frame and the coded frames before it, i.e., it uses motion characteristics for frame compression.
  • B frame: bidirectional predictive interpolation coded frame. When encoding an image frame as a B frame, the encoder compresses the frame according to the differences between it and the adjacent previous and next frames, i.e., only the difference between the current frame and the frames before and after it is recorded.
  • A P frame may include both intra-frame predicted pixels and inter-frame predicted pixels, and a B frame may likewise include both.
  • Non-IDR frame: any coded frame other than an IDR frame; a non-IDR frame can be a P frame or a B frame, but is not limited thereto.
  • In the scenarios considered in the embodiments of the present application, the coded frames are generally IDR frames or P frames; in this case, a non-IDR frame refers to a P frame.
  • Embodiments of the present application provide an image encoding method, a real-time communication method, an electronic device, a computer-readable storage medium, and a computer program product that require no preprocessing of image frames. Because the encoder is very fast, whether a scene switch has occurred is judged during encoding, and if so, the next image frame is encoded as an IDR frame.
  • Cloud gaming also known as gaming on demand, is an online gaming technology based on cloud computing technology. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games.
  • The game is not run on the player's game terminal but on a cloud server, which renders the game scene into video and audio streams and transmits them to the player's game terminal over the network.
  • The player's game terminal therefore does not need powerful graphics and data processing capabilities; it only needs basic streaming media playback capability and the ability to obtain the player's input instructions and send them to the cloud server.
  • FIG. 1 is a schematic diagram of a cloud game scene provided by the embodiment of the present application.
  • The cloud server 110 and the player's game terminal 120 can communicate. The cloud server 110 runs the game, captures game video images, and encodes them to obtain the code stream of the video images; it sends the code stream to the terminal device, which decodes it and displays the video image according to the decoding result.
  • The connection between the cloud server 110 and the player's game terminal 120 can be implemented through Long Term Evolution (LTE), New Radio (NR), Wireless Fidelity (Wi-Fi), and other technologies, but is not limited thereto.
  • Here, the cloud server refers to a server that runs the game in the cloud and provides functions such as video enhancement (pre-encoding processing) and video encoding, but is not limited thereto.
  • Terminal equipment refers to devices with rich human-computer interaction methods, Internet access, various operating systems, and strong processing capabilities.
  • The terminal device may be a smart phone, a living room TV, a wearable device, a virtual reality (VR) device, a tablet computer, a vehicle terminal, or a player game terminal such as a handheld game console, but is not limited thereto.
  • FIG. 2 is a flow chart of an image encoding method provided by an embodiment of the present application.
  • The method can be executed by a cloud server.
  • The cloud server can be the cloud server 110 in FIG. 1.
  • As shown in FIG. 2, the method includes:
  • S220 Encode the first image frame into a first non-IDR frame
  • Note that, in a video stream, besides the initial image frame being encoded as an IDR frame, further IDR frames are produced whenever a scene switch is detected.
  • The "first image frame" here therefore refers to any image frame in the video stream other than the initial frame and the image frames already determined to be encoded as IDR frames.
  • Here, the video stream is one generated in real time.
  • In a cloud game scene, the user operation information indicates the user's operations in the cloud game, such as operations on a joystick or buttons, and the cloud server can generate the video stream in real time based on this information.
  • The user's operations on the joystick or buttons include up, down, left, and right operations, which control the movement of the virtual object displayed on the terminal or make the virtual object perform corresponding actions, but are not limited thereto.
  • The cloud server can obtain the video data and render it into a video stream in real time.
  • The video stream can be a cloud game video in a cloud game scene, a live video in an interactive live broadcast, or a video in a video conference or video call, which is not limited in this embodiment of the present application.
  • For example, the cloud server encodes the first image frame of a stream as an IDR frame and the second image frame as a non-IDR frame; if, after examining that non-IDR frame, it determines that a scene switch occurred, it can encode the third image frame as an IDR frame.
  • In this example, the above-mentioned "first image frame" corresponds to the second image frame.
  • After the cloud server encodes the first image frame into the first non-IDR frame, it can determine whether a scene switch has occurred in the following ways:
  • First implementation: the cloud server can determine the intra-frame predicted pixel ratio of the first non-IDR frame and judge, from that ratio, whether a scene switch has occurred, for example as follows (but not limited thereto):
  • If the intra-prediction pixel ratio of the first non-IDR frame is greater than a preset threshold, it is determined that a scene switch has occurred; if the ratio is less than or equal to the threshold, it is determined that no scene switch occurred. Alternatively, the comparison may be "greater than or equal to" versus "less than".
  • The intra-frame predicted pixel ratio is the proportion of intra-predicted pixels among all pixels in the first non-IDR frame, where an intra-predicted pixel is one coded using intra-frame prediction.
  • The above-mentioned preset threshold may be negotiated between the cloud server and the terminal device, predefined, specified by the cloud server, or specified by the terminal device, which is not limited in this embodiment of the present application.
  • For example, the preset threshold may be 60%, 70%, 80%, 90%, etc.
  • Second implementation: if the inter-predicted pixel ratio of the first non-IDR frame is smaller than a preset threshold, it is determined that a scene switch has occurred; if the ratio is greater than or equal to the threshold, it is determined that no scene switch occurred. Alternatively, the comparison may be "less than or equal to" versus "greater than".
  • The inter-frame predicted pixel ratio is the proportion of inter-predicted pixels among all pixels in the first non-IDR frame, where an inter-predicted pixel is one coded using inter-frame prediction.
  • This preset threshold, too, may be negotiated between the cloud server and the terminal device, predefined, specified by the cloud server, or specified by the terminal device, which is not limited in this application.
  • For example, the preset threshold may be 10%, 20%, 30%, 40%, etc.
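The two detection rules above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the threshold defaults (0.8 and 0.2) are only examples drawn from the ranges the text mentions.

```python
def scene_switch_by_intra_ratio(intra_pixels, total_pixels, threshold=0.8):
    """First implementation: a scene switch is declared when the share of
    intra-predicted pixels in the non-IDR frame exceeds the threshold."""
    return (intra_pixels / total_pixels) > threshold

def scene_switch_by_inter_ratio(inter_pixels, total_pixels, threshold=0.2):
    """Second implementation: a scene switch is declared when the share of
    inter-predicted pixels falls below the threshold."""
    return (inter_pixels / total_pixels) < threshold
```

Both rules express the same intuition: after a scene change, inter prediction finds few usable references, so the encoder falls back to intra prediction for most pixels.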
  • Otherwise, the next image frame after the first image frame is encoded as a non-IDR frame.
  • In summary, the cloud server acquires the first image frame, encodes it into the first non-IDR frame, judges from that non-IDR frame whether a scene switch has occurred, and, if so, encodes the next image frame as an IDR frame. Because the encoding speed of the cloud server is usually very fast, the embodiment of the present application needs no preprocessing: scene switching is detected during encoding, and when detected, the next image frame is encoded as an IDR frame. This reduces the delay of the whole image processing pipeline, which matters most for cloud scenarios with high real-time requirements, and the reduced delay improves the user experience.
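The encoding loop described above can be sketched as a toy simulation. Here each frame is reduced to its intra-predicted pixel ratio and `detect_scene_switch` stands in for whichever ratio test is used; both names are illustrative, not from the patent.

```python
def encode_stream(frames, detect_scene_switch):
    """Toy encoding loop. The first frame of the stream is an IDR frame;
    each later frame is first encoded as a non-IDR frame, and when the
    encoded result indicates a scene switch, the *next* frame is forced
    to be an IDR frame."""
    frame_types = []
    force_idr = True  # the very first frame of a stream is always an IDR frame
    for frame in frames:
        if force_idr:
            frame_types.append("IDR")
            force_idr = False
        else:
            frame_types.append("non-IDR")
            # inspect the just-encoded non-IDR frame for a scene switch
            force_idr = detect_scene_switch(frame)
    return frame_types
```

Note that the frame in which the scene switch is detected is itself still a non-IDR frame; only its successor becomes an IDR frame, which is exactly why no pre-encoding analysis pass is needed.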
  • FIG. 3 is a flow chart of another image encoding method provided by an embodiment of the present application.
  • This method can be executed by a cloud server.
  • The cloud server can be the cloud server 110 in FIG. 1.
  • The embodiment of the present application does not limit the execution subject of the image encoding method. As shown in FIG. 3, the method includes:
  • S320 Encode the first image frame into a first non-IDR frame
  • S330 Determine whether scene switching occurs according to the first non-IDR frame
  • For intra-predicted pixels the encoder must allocate a larger bit rate, while for inter-predicted pixels it can allocate a smaller one. The first non-IDR frame, whose intra-predicted pixel ratio is high after a scene switch, therefore needs a larger bit rate; but with a fixed total encoding bit rate, the more bits spent on the first non-IDR frame, the fewer remain for subsequent image frames, which reduces the overall quality of the video.
  • To solve this problem, the cloud server can discard the code stream corresponding to the first non-IDR frame and instead encode the previous image frame of the first image frame as a second non-IDR frame, obtaining the code stream corresponding to that second non-IDR frame. Because this repeated frame is identical to its reference, inter-frame encoding makes its bit cost very small; under a fixed encoding bit rate, the encoder can therefore give more bits to subsequent image frames, improving the overall image quality of the video.
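A toy rate model can illustrate why re-encoding the duplicated previous frame is cheap: with inter prediction, a frame identical to its reference costs almost nothing. The cost function below is a deliberate simplification for illustration, not an actual encoder rate model.

```python
def encode_cost(frame, reference):
    """Deliberately simplified rate model: the cost of inter-coding a frame
    is the number of pixels that differ from its reference, so a frame
    identical to its reference is nearly free."""
    return sum(1 for a, b in zip(frame, reference) if a != b)

prev_frame = [0] * 8   # last frame before the scene switch
curr_frame = [1] * 8   # first frame of the new scene: every pixel differs
cost_new_scene = encode_cost(curr_frame, prev_frame)  # intra-heavy, expensive
cost_repeat = encode_cost(prev_frame, prev_frame)     # duplicate frame, ~free
```

Discarding the expensive bitstream of the new-scene frame and sending the nearly free repeated frame frees the saved bits for the IDR frame and the frames that follow it.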
  • FIG. 4 is a flow chart of another image encoding method provided in the embodiment of the present application, and the method includes the following steps:
  • The capacity of the image memory is the size of one frame of image.
  • S403 The image acquisition terminal inputs the latest acquired image to the encoder
  • S550 Acquire a second image frame, where the second image frame is an image frame that is separated from the first image frame by a preset distance in the video stream, and there is no other IDR frame between the second image frame and the first image frame;
  • S560 Encode the second image frame into an IDR frame.
  • Note that this example combines the embodiment corresponding to FIG. 3 with the solution of fixedly inserting IDR frames.
  • The embodiment corresponding to FIG. 2 can also be combined with the solution of fixedly inserting IDR frames; this is not repeated here.
  • Fixedly inserting IDR frames means inserting an IDR frame at fixed image-frame intervals.
  • For example, the encoder can encode the 1st frame as an IDR frame, the 11th frame as an IDR frame, and the 21st frame as an IDR frame, i.e., insert an IDR frame every 10 frames, with all remaining frames being non-IDR frames.
  • The above-mentioned second image frame is an IDR frame inserted according to the fixed-insertion solution, namely the IDR frame following the one corresponding to the first image frame.
  • The above-mentioned preset distance is the fixed interval at which IDR frames are inserted.
  • For example, suppose the target sequence includes 100 image frames and an IDR frame is inserted every 10 image frames.
  • If, after examining the non-IDR frame corresponding to the 80th image frame, it is determined that a scene switch has occurred, the 81st image frame can be encoded as an IDR frame; then, according to the fixed-insertion solution, the 91st image frame is still an IDR frame.
  • That is, the embodiment of the present application is not limited to inserting IDR frames only on scene switches; it can also be combined with fixed insertion, improving the flexibility of IDR frame insertion.
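The combination of fixed-interval insertion and scene-switch-triggered insertion can be sketched as a simple scheduler. Frame indices are 1-based and `switch_frames` holds the indices of frames whose non-IDR encoding revealed a scene switch; the function and its signature are illustrative, not from the patent.

```python
def schedule_idr_frames(num_frames, interval, switch_frames):
    """Return the sorted 1-based indices of IDR frames: the first frame,
    every `interval`-th frame after it (fixed insertion), and the frame
    immediately following each detected scene switch."""
    idr = set()
    for i in range(1, num_frames + 1):
        fixed = (i - 1) % interval == 0          # 1, 1+interval, 1+2*interval, ...
        after_switch = (i - 1) in switch_frames  # frame right after a switch
        if fixed or after_switch:
            idr.add(i)
    return sorted(idr)
```

With the 100-frame example above (`interval=10`, switch detected at frame 80), the schedule contains frames 1, 11, ..., 81, 91, so the scene-triggered IDR and the fixed schedule coexist without disturbing each other.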
  • S620 Encode the first image frame into a first non-IDR frame
  • S630 Determine whether, within the preset image frame sequence to which the first image frame belongs, an image frame has already been encoded as an IDR frame because of a scene switch;
  • Note that this embodiment is an improvement of the embodiment corresponding to FIG. 3.
  • A similar improvement may also be applied to the embodiment corresponding to FIG. 2, which is not repeated in this application.
  • The preset image frame sequence to which the first image frame belongs may be negotiated between the cloud server and the terminal device, predefined, specified by the cloud server, or specified by the terminal device, which is not limited in this embodiment of the present application.
  • For example, suppose the cloud server and the terminal device agree that scene switching is judged at most once per 10 image frames. Then the cloud server judges scene switching once for frames 1–10, once for frames 11–20, once for frames 21–30, and so on. For any frame among frames 1–10, the preset image frame sequence to which it belongs is the sequence formed by frames 1–10.
  • In this way, the insertion frequency of IDR frames can be reduced; since an IDR frame generally consumes a relatively high bit rate, reducing insertions lowers bit-rate consumption and can therefore improve video quality.
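The once-per-window rule can be sketched as follows; treating the preset image frame sequence as fixed windows of `window_size` consecutive frames is an illustrative interpretation, and the function name is invented.

```python
def allow_scene_idr(frame_index, window_size, scene_idr_frames):
    """Allow a scene-switch-triggered IDR only if no such IDR frame has
    already been inserted inside the current window (frames are 1-based;
    windows are 1..window_size, window_size+1..2*window_size, ...)."""
    window_start = (frame_index - 1) // window_size * window_size + 1
    return not any(window_start <= f < window_start + window_size
                   for f in scene_idr_frames)
```

Before running the ratio test on a non-IDR frame, the encoder would first call this check and skip the test entirely when the window has already spent its scene-triggered IDR.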
  • FIG. 7 is an interactive flow chart of a real-time communication method provided by an embodiment of the present application.
  • The method is executed jointly by a cloud server and a terminal device.
  • As shown in FIG. 7, the method includes:
  • S710 The terminal device sends user operation information to the cloud server.
  • S720 The cloud server generates a video stream in real time according to the user operation information.
  • S730 The cloud server acquires the first image frame in the video stream.
  • The cloud server encodes the first image frame into a first non-IDR frame.
  • If it is determined from the first non-IDR frame that a scene switch has occurred, the cloud server encodes the next image frame after the first image frame into an IDR frame and obtains the corresponding encoded code stream.
  • The cloud server can obtain the video data and render it into a video stream in real time.
  • In one implementation, before the cloud server determines from the first non-IDR frame whether a scene switch has occurred, the method further includes: determining whether, within the preset image frame sequence to which the first image frame belongs, an image frame has already been encoded as an IDR frame because of a scene switch. Correspondingly, only if no such IDR frame exists in the preset sequence does the cloud server judge from the first non-IDR frame whether a scene switch has occurred.
  • With this real-time communication method, no preprocessing step is needed; the scene-switch decision is made during the encoding process, and if a scene switch occurs the next image frame is encoded as an IDR frame. This reduces the latency of the whole image processing pipeline and thus the communication latency, improving the user experience.
  • The encoding side uses repeat-frame encoding. Since the repeated frames are identical, inter-frame coding greatly reduces their bitrate, so with a fixed total encoding bitrate the encoder can give more bitrate to subsequent image frames, improving the overall quality of the video.
  • The image encoding method above is practically meaningful only when the decoding side, i.e., the terminal device, is able to decode the bitstream of the video stream. A method for acquiring a target decoding configuration is therefore provided below.
  • FIG. 8 is a flow chart of a method for acquiring a target decoding configuration provided in an embodiment of the present application. As shown in FIG. 8, the method includes:
  • the cloud server receives a decoding capability response of the terminal device, where the decoding capability response includes: the decoding capability of the terminal device;
  • the cloud server determines the target decoding configuration according to the decoding capability of the terminal device, the cloud game type and the current network state;
  • The target decoding configuration may be the optimal decoding configuration.
  • The cloud server can send a decoding capability request to the terminal device through a client installed on the terminal device, and the terminal device can return a decoding capability response through the same client; correspondingly, the cloud server receives the decoding capability response, which includes the terminal device's decoding capability, and then determines the target decoding configuration (i.e., the optimal decoding configuration).
  • the client may be a cloud game client.
  • the decoding capability request includes at least one of the following, but not limited thereto: protocol version number, decoding protocol query.
  • the protocol version number refers to the minimum protocol version supported by the cloud server, and the protocol may be a decoding protocol.
  • the decoding protocol query refers to the decoding protocol to be queried by the cloud server, such as the video decoding protocol H264 or H265.
  • The decoding capability request can be implemented, for example, as:

    [codec_ability]      ; codec capability
    version=1.0          ; minimum protocol version supported by the cloud server
    type=16,17           ; query H264 and H265 capability
  • the decoding capability of the terminal device includes at least one of the following, but is not limited thereto: the type of decoding protocol supported by the terminal device, the Profile, Level, and performance supported by the decoding protocol, and the like.
  • Example 4: if the request for the terminal device's decoding capability fails, an error code is returned.
  • the code implementation of the decoding capability response can be as follows:
  • The cloud server can select a higher capability within the terminal device's decoding capability range; for example, in Example 1 above, profile3 and performances3 are selected. The cloud server may select the target decoding configuration according to a mapping between network states and terminal decoding capabilities, or according to other selection rules.
  • the terminal device can decode the code stream of the video stream through the target decoding configuration, so that the decoding effect can be improved.
  • An encoding module 1020 configured to encode the first image frame into a first non-IDR frame
  • the encoding module 1020 is further configured to encode the next image frame of the first image frame into an IDR frame if it is determined according to the first non-IDR frame that a scene switch has occurred.
  • the judging module 1030 is further configured to: determine the intra-predicted pixel ratio of the first non-IDR frame; and determine whether scene switching occurs according to the intra-predicted pixel ratio of the first non-IDR frame.
  • the judging module 1030 is further configured to: if the intra-frame predicted pixel ratio of the first non-IDR frame is greater than a preset threshold, determine that a scene switch has occurred; if the intra-frame predicted pixel ratio of the first non-IDR frame is less than or equal to the preset threshold, it is determined that scene switching does not occur.
  • The first image frame is any image frame in the video stream other than the initial frame of the stream and any frame already determined to be encoded as an IDR frame.
  • the apparatus further includes: a communication module 1070 and a determination module 1080, wherein the communication module 1070 is configured to send a decoding capability request to the terminal device; receive a decoding capability response of the terminal device, and the decoding capability response includes: the terminal device's Decoding capability; the determination module 1080 is configured to determine the target decoding configuration according to the decoding capability of the terminal device, the cloud game type and the current network state; the communication module 1070 is also configured to send the target decoding configuration to the terminal device, so that the terminal device passes the target decoding Configure to decode the bitstream of the video stream.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The device shown in FIG. 10 can perform the image encoding method embodiments above; the foregoing and other operations and/or functions of the modules in the device implement the corresponding processes of those methods and, for brevity, are not repeated here.
  • the device in the embodiment of the present application is described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software modules.
  • Each step of the method embodiments of this application can be completed by an integrated logic circuit of hardware in the processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of this application can be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor.
  • Fig. 11 is a schematic diagram of a real-time communication device provided by an embodiment of the present application.
  • The device includes a processing module 1110 and a communication module 1120. The communication module 1120 is configured to acquire user operation information sent by a terminal device. The processing module 1110 is configured to: generate a video stream in real time according to the user operation information, acquire a first image frame in the video stream, encode the first image frame into a first non-IDR frame, and, if it is determined from the first non-IDR frame that a scene switch occurred, encode the image frame following the first image frame as an IDR frame to obtain the corresponding encoded bitstream. The communication module 1120 is further configured to send that encoded bitstream to the terminal device.
  • The processing module 1110 is further configured to: if the intra-predicted pixel ratio of the first non-IDR frame is greater than a preset threshold, determine that a scene switch occurred; if it is less than or equal to the preset threshold, determine that no scene switch occurred.
  • the processing module 1110 is further configured to add the first image frame into the reference frame list; if it is determined that a scene switch occurs, delete the first image frame from the reference frame list.
  • The processing module 1110 is further configured to acquire a second image frame and encode it as an IDR frame, where the second image frame is separated from the first image frame by a preset distance in the video stream, and there is no other IDR frame between the second image frame and the first image frame.
  • the processing module 1110 is further configured to determine whether the image frame is encoded into an IDR based on scene switching in the preset image frame sequence to which the first image frame belongs; correspondingly, the processing module 1110 also The configuration is as follows: if there is no case of encoding the image frame into IDR based on scene switching in the preset image frame sequence, it is judged according to the first non-IDR frame whether scene switching occurs.
  • The first image frame is any image frame in the video stream other than the initial frame of the stream and any frame already determined to be encoded as an IDR frame.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The device shown in FIG. 11 can perform the real-time communication method embodiments above; the foregoing and other operations and/or functions of the modules in the device implement the corresponding processes of those methods and, for brevity, are not repeated here.
  • the device in the embodiment of the present application is described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional module may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software modules.
  • Each step of the method embodiments of this application can be completed by an integrated logic circuit of hardware in the processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of this application can be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in mature storage media in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Fig. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may include: a memory 1210 and a processor 1220 , the memory 1210 is configured to store a computer program, and transmit the program code to the processor 1220 .
  • the processor 1220 can call and run a computer program from the memory 1210, so as to implement the method in the embodiment of the present application.
  • the processor 1220 may be configured to execute the above-mentioned method embodiments according to instructions in the computer program.
  • the processor 1220 may include but not limited to:
  • DSP Digital Signal Processor
  • ASIC Application Specific Integrated Circuit
  • FPGA Field Programmable Gate Array
  • the memory 1210 includes but is not limited to:
  • The non-volatile memory can be Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM) or Flash.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • By way of example and not limitation, many forms of RAM are available, such as:
  • Static Random Access Memory (SRAM)
  • Dynamic Random Access Memory (DRAM)
  • Synchronous Dynamic Random Access Memory (SDRAM)
  • Double Data Rate SDRAM (DDR SDRAM)
  • Enhanced SDRAM (ESDRAM)
  • Synchlink DRAM (SLDRAM)
  • Direct Rambus RAM (DR RAM)
  • the computer program can be divided into one or more modules, and the one or more modules are stored in the memory 1210 and executed by the processor 1220 to complete the method.
  • the one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the electronic device.
  • The electronic device may also include a transceiver 1230, which can be connected to the processor 1220 or the memory 1210.
  • the processor 1220 can control the transceiver 1230 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 1230 may include a transmitter and a receiver.
  • the transceiver 1230 may also include an antenna, and the number of antennas may be one or more.
  • bus system includes not only a data bus, but also a power bus, a control bus and a status signal bus.
  • the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another by wire (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wirelessly (such as infrared, radio or microwave).
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the modules is only a logical functional division; in actual implementation there may be other division methods. Multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.


Abstract

This application provides an image encoding method, a real-time communication method, a device, a storage medium and a program product. The method includes: acquiring a first image frame in a video stream; encoding the first image frame into a first non-IDR frame; and, if it is determined from the first non-IDR frame that a scene switch has occurred, encoding the image frame following the first image frame as an IDR frame.

Description

Image encoding method, real-time communication method, device, storage medium and program product

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on, and claims priority to, Chinese patent application No. 202210103019.3 filed on January 27, 2022, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD

The embodiments of this application relate to the field of image processing technologies, and in particular to an image encoding method, a real-time communication method, a device, a storage medium and a program product.

BACKGROUND

Processing video or images in cloud-based scenarios is common. The process is roughly as follows: a cloud server generates a video, captures video images, and encodes the captured images to obtain a bitstream; the cloud server sends the bitstream to a terminal device, the terminal device decodes it, and the video images are displayed according to the decoding result.

A video usually contains scene switches. Image frames at a scene switch depend mainly on intra-frame pixels, so they are encoded with intra prediction. In other words, at a scene switch the cloud server may encode the corresponding image frame as an Instantaneous Decoding Refresh (IDR) frame, because an IDR frame uses intra prediction only.

In the related art, before encoding each image frame, the cloud server computes the similarity between that frame and its preceding frame; if the similarity is greater than a preset threshold, the frame is encoded as an IDR frame. That is, a preprocessing step precedes the encoding of every frame. However, because the cloud server's encoder is usually very fast, this preprocessing may take far longer than the encoding itself, adding considerable latency to the whole image processing pipeline. Especially in cloud scenarios with strict real-time requirements, such latency degrades the user experience.
SUMMARY

The embodiments of this application provide an image encoding method, a real-time communication method, an electronic device, a computer-readable storage medium and a computer program product, which can reduce the latency of the whole image processing pipeline; especially in cloud scenarios with strict real-time requirements, the reduced latency improves the user experience.

An embodiment of this application provides an image encoding method, including: acquiring a first image frame in a video stream; encoding the first image frame into a first non-IDR frame; and, if it is determined from the first non-IDR frame that a scene switch has occurred, encoding the image frame following the first image frame as an IDR frame.

An embodiment of this application provides a real-time communication method, including: acquiring user operation information sent by a terminal device; generating a video stream in real time according to the user operation information; acquiring a first image frame in the video stream; encoding the first image frame into a first non-IDR frame; if it is determined from the first non-IDR frame that a scene switch has occurred, encoding the image frame following the first image frame as an IDR frame to obtain the encoded bitstream corresponding to that following frame; and sending that encoded bitstream to the terminal device.

An embodiment of this application provides an image encoding apparatus, including an acquisition module, an encoding module and a judging module. The acquisition module is configured to acquire a first image frame in a video stream; the encoding module is configured to encode the first image frame into a first non-IDR frame, and is further configured to encode the image frame following the first image frame as an IDR frame if it is determined from the first non-IDR frame that a scene switch has occurred.

An embodiment of this application provides a real-time communication apparatus, including a processing module and a communication module. The communication module is configured to acquire user operation information sent by a terminal device. The processing module is configured to generate a video stream in real time according to the user operation information, acquire a first image frame in the video stream, and encode the first image frame into a first non-IDR frame; if it is determined from the first non-IDR frame that a scene switch has occurred, the processing module encodes the image frame following the first image frame as an IDR frame to obtain the corresponding encoded bitstream. The communication module is further configured to send that encoded bitstream to the terminal device.

An embodiment of this application provides an electronic device, including a processor and a memory; the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory to perform the method provided by the embodiments of this application.

An embodiment of this application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method provided by the embodiments of this application.

An embodiment of this application provides a computer program product including computer program instructions which, when executed by a processor, implement the method provided by the embodiments of this application.

An embodiment of this application provides a computer program which, when executed by a processor, implements the method provided by the embodiments of this application.

The technical solutions provided by the embodiments of this application can bring the following beneficial effects:

After the first image frame is encoded into a first non-IDR frame, an inter-frame scene-switch decision is made; if a scene switch is determined to have occurred, the next image frame is encoded as an IDR frame. Compared with the related-art scheme that runs a preprocessing step before encoding every frame, the embodiments of this application need no preprocessing: the scene-switch decision is made during encoding, and when a switch occurs the next frame is encoded as an IDR frame. This reduces the latency of the whole encoding pipeline and improves encoding efficiency; especially in cloud scenarios with strict real-time requirements, the reduced latency improves the user experience.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a cloud gaming scenario provided by an embodiment of this application;

FIG. 2 is a flowchart of an image encoding method provided by an embodiment of this application;

FIG. 3 is a flowchart of another image encoding method provided by an embodiment of this application;

FIG. 4 is a flowchart of still another image encoding method provided by an embodiment of this application;

FIG. 5 is a flowchart of yet another image encoding method provided by an embodiment of this application;

FIG. 6 is a flowchart of another image encoding method provided by an embodiment of this application;

FIG. 7 is an interaction flowchart of a real-time communication method provided by an embodiment of this application;

FIG. 8 is a flowchart of a method for acquiring a target decoding configuration provided by an embodiment of this application;

FIG. 9 is a flowchart of an encoding-decoding cooperation method provided by an embodiment of this application;

FIG. 10 is a schematic diagram of an image encoding apparatus provided by an embodiment of this application;

FIG. 11 is a schematic diagram of a real-time communication apparatus provided by an embodiment of this application;

FIG. 12 is a schematic block diagram of an electronic device provided by an embodiment of this application.
DETAILED DESCRIPTION

The technical solutions in the embodiments of this application will be described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of this application.

In the following description, "some embodiments" describes a subset of all possible embodiments; "some embodiments" may be the same subset or different subsets of all possible embodiments, and the subsets may be combined with each other where no conflict arises.

It should be noted that the terms "first", "second", etc. in the embodiments of this application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used may be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. Moreover, the terms "include" and "have" and their variants are intended to cover non-exclusive inclusion: a process, method, system, product or server that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units not expressly listed or inherent to the process, method, product or device.

Before further describing the embodiments of this application in detail, the terms used in the embodiments are explained as follows.

Video encoding: converting a file in an original video format into a file in another video format by compression; the converted data may be called a bitstream.

Video decoding: the inverse process of video encoding.

Intra prediction: predicting the current pixels using already-encoded pixels within the same image frame, without referring to other encoded frames.

Inter prediction: predicting the current pixels using pixels of other encoded image frames; that is, reference to encoded frames is required.

IDR frame: a type of encoded frame defined in video coding technology. An IDR frame uses intra-prediction coding only; a decoder can decode the content of an IDR frame independently, without information from other frames. An IDR frame generally serves as a reference frame for subsequent frames and as an entry point for bitstream switching.

P frame: a forward-predicted frame. A P frame is predicted from the encoded frames before it; the encoder performs inter-frame compression by comparing the information shared between the current P frame and those earlier encoded frames, i.e., by exploiting motion.

B frame: a bidirectionally predicted, interpolated frame. When encoding a frame as a B frame, the encoder compresses it according to the differences between the previous frame, the current frame and the following frame; that is, only the differences between the current frame and its neighbors are recorded.

It should be noted that in some cases a P frame may contain both intra-predicted and inter-predicted pixels, and the same is true of B frames.

Non-IDR frame: any encoded frame other than an IDR frame. A non-IDR frame may be, but is not limited to, a P frame or a B frame.

It should be noted that in cloud scenarios with strict real-time requirements, such as cloud gaming, encoded frames are generally IDR frames or P frames; in that case a non-IDR frame means a P frame.
The embodiments of this application provide an image encoding method, a real-time communication method, an electronic device, a computer-readable storage medium and a computer program product. No preprocessing of image frames is needed; instead, taking advantage of the encoder's high speed, the scene-switch decision is made during encoding, and if a scene switch occurs the next image frame is encoded as an IDR frame.

It should be understood that the embodiments of this application can be applied to, but are not limited to, cloud gaming scenarios:

Cloud gaming, also called gaming on demand, is an online gaming technology based on cloud computing. It allows thin clients with relatively limited graphics and computing capability to run high-quality games. In a cloud gaming scenario, the game runs on a cloud server rather than on the player's gaming terminal; the cloud server renders the game scene into video and audio streams and transmits them to the player's gaming terminal over the network. The terminal does not need powerful graphics and data processing capability; it only needs basic streaming-media playback capability and the ability to capture player input and send it to the cloud server.

For example, FIG. 1 is a schematic diagram of a cloud gaming scenario provided by an embodiment of this application. As shown in FIG. 1, a cloud server 110 and a player gaming terminal 120 can communicate. The cloud server 110 can run the game, capture game video images, and encode the captured images to obtain a bitstream; the cloud server sends the bitstream to the terminal device, which decodes it and displays the video images according to the decoding result.

In some embodiments, the cloud server 110 and the player gaming terminal 120 may communicate through, but are not limited to, Long Term Evolution (LTE), New Radio (NR), or Wireless Fidelity (Wi-Fi).

In a cloud gaming scenario, the cloud server is the server that runs the game in the cloud, and has functions such as video enhancement (pre-encoding processing) and video encoding, but is not limited thereto.

A terminal device is a class of device with rich human-computer interaction, Internet access capability, an operating system, and strong processing capability. A terminal device may be a smartphone, a living-room TV, a wearable device, a virtual reality (VR) device, a tablet, an in-vehicle terminal, or a player gaming terminal such as a handheld game console, but is not limited thereto.

The embodiments of this application are described in detail below:
FIG. 2 is a flowchart of an image encoding method provided by an embodiment of this application. The method may be executed by a cloud server; for example, in a cloud gaming scenario the cloud server may be the cloud server 110 in FIG. 1. The embodiments of this application do not limit the executing entity of the method. As shown in FIG. 2, the method includes:

S210: acquire a first image frame in a video stream;

S220: encode the first image frame into a first non-IDR frame;

S230: if it is determined from the first non-IDR frame that a scene switch has occurred, encode the image frame following the first image frame as an IDR frame.

It should be understood that, without considering scene switches, the first frame of a video is usually encoded as an IDR frame and subsequent frames as non-IDR frames; that is, a video is encoded as IDR frame, non-IDR frame, non-IDR frame, and so on.

When scene switches are considered, step S230 shows that the following image frame is encoded as an IDR frame; that is, besides the first frame of the video, IDR frames are also produced due to scene switches. The "first image frame" here therefore refers to any frame in the video stream other than the very first frame and the already-encoded frames, where an already-encoded frame is a frame already determined to be encoded as an IDR frame.

In some embodiments, in a real-time communication scenario the video stream is generated in real time. For example, in a cloud gaming scenario, once the cloud server obtains user operation information, which in practice may indicate the user's operations on the cloud game, such as joystick or button operations, the cloud server can generate the video stream in real time according to that information.

In some embodiments, operations on a joystick or button include up/down/left/right operations, which are used to control the movement of the virtual object displayed on the terminal or to make the virtual object perform corresponding actions, but are not limited thereto.

In some embodiments, user operation information corresponds to video data; that is, the user's control operations correspond to game pictures. For example, if pressing a button means the controlled game character picks up a virtual gun, then the character picking up the virtual gun corresponds to certain video data. Based on this correspondence, the cloud server can obtain the video data and render it into a video stream in real time.

It should be understood that the video stream may be cloud-game video in a cloud gaming scenario, live video in interactive streaming, or video in a video conference or video call; the embodiments of this application do not limit this.

For example, for a certain video stream, suppose the cloud server encodes the first frame as an IDR frame and the second frame as a non-IDR frame, but the analysis of that non-IDR frame determines that a scene switch occurred; the third frame can then be encoded as an IDR frame. The "first image frame" above may be the second frame in this example.

It should be noted that the embodiments of this application do not limit the encoding scheme used by the cloud server.
After encoding the first image frame into the first non-IDR frame, the cloud server can decide whether a scene switch occurred in the following ways:

In some embodiments, the cloud server may determine the intra-predicted pixel ratio of the first non-IDR frame and, according to that ratio, decide whether a scene switch occurred, for example as follows.

Implementation 1: if the intra-predicted pixel ratio of the first non-IDR frame is greater than a preset threshold, a scene switch is determined to have occurred; if the ratio is less than or equal to the preset threshold, no scene switch occurred. Alternatively, "greater than or equal to" and "less than" may be used instead.

It should be understood that at a scene switch the pixels of the image frame differ greatly from those of the preceding frames. Although the cloud server encodes such a frame as a non-IDR frame, most of its pixels use intra prediction. Hence, if the intra-predicted pixel ratio of a non-IDR frame is greater than (or at least equal to) the preset threshold, a scene switch is determined to have occurred.

The intra-predicted pixel ratio is the proportion of intra-predicted pixels among all pixels of the first non-IDR frame, where intra-predicted pixels are pixels coded with intra prediction. For example, if the first non-IDR frame has 100 pixels of which 80 are intra predicted, its intra-predicted pixel ratio is 80/100 = 80%.

In some embodiments, the preset threshold may be negotiated between the cloud server and the terminal device, predefined, specified by the cloud server, or specified by the terminal device; the embodiments of this application do not limit this. Its value may be, e.g., 60%, 70%, 80% or 90%.

Implementation 2: if the inter-predicted pixel ratio of the first non-IDR frame is less than a preset threshold, a scene switch is determined to have occurred; if the ratio is greater than or equal to the preset threshold, no scene switch occurred. Alternatively, "less than or equal to" and "greater than" may be used instead.

Similarly, at a scene switch most pixels of the non-IDR frame use intra prediction and only a few use inter prediction; so if the inter-predicted pixel ratio of a non-IDR frame is less than (or at most equal to) the preset threshold, a scene switch is determined to have occurred.

The inter-predicted pixel ratio is the proportion of inter-predicted pixels among all pixels of the first non-IDR frame, where inter-predicted pixels are pixels coded with inter prediction. For example, if the first non-IDR frame has 100 pixels of which 20 are inter predicted, its inter-predicted pixel ratio is 20/100 = 20%. This threshold may likewise be negotiated, predefined, or specified by either side, and its value may be, e.g., 10%, 20%, 30% or 40%.

In some embodiments, if it is determined that no scene switch occurred, the image frame following the first image frame is encoded as a non-IDR frame.

In summary, in the embodiments of this application the cloud server acquires the first image frame, encodes it into a first non-IDR frame, decides from that frame whether a scene switch occurred, and if so encodes the following image frame as an IDR frame. Because the cloud server's encoding is usually very fast, no preprocessing is needed: the decision is made during encoding. This reduces the latency of the whole image processing pipeline, and especially in cloud scenarios with strict real-time requirements this improves the user experience.
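The intra-predicted pixel-ratio decision described above can be sketched as follows. This is a minimal, hypothetical illustration: the pixel counts are assumed to come from the encoder's statistics, and the 80% threshold is just one of the example values mentioned in the text.

```python
def intra_pixel_ratio(intra_pixels: int, total_pixels: int) -> float:
    """Proportion of intra-predicted pixels among all pixels of a non-IDR frame."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    return intra_pixels / total_pixels


def scene_switch_occurred(intra_pixels: int, total_pixels: int,
                          threshold: float = 0.8) -> bool:
    """Implementation 1 above: a scene switch is declared when the
    intra-predicted pixel ratio is strictly greater than the threshold."""
    return intra_pixel_ratio(intra_pixels, total_pixels) > threshold
```

With the example from the text (80 intra-predicted pixels out of 100), the ratio is 80%; at a threshold of exactly 80% and a strict "greater than" comparison, no scene switch is declared.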
FIG. 3 is a flowchart of another image encoding method provided by an embodiment of this application. The method may be executed by a cloud server; for example, in a cloud gaming scenario the cloud server may be the cloud server 110 in FIG. 1. The embodiments of this application do not limit the executing entity of the method. As shown in FIG. 3, the method includes:

S310: acquire a first image frame in a video stream;

S320: encode the first image frame into a first non-IDR frame;

S330: decide from the first non-IDR frame whether a scene switch occurred;

S340: if a scene switch is determined to have occurred, encode the image frame following the first image frame as an IDR frame, discard the bitstream corresponding to the first non-IDR frame, and encode the image frame preceding the first image frame as a second non-IDR frame.

It should be understood from the scene-switch decision method above that if a switch occurred, most pixels of the first non-IDR frame use intra prediction and only a few use inter prediction. Intra-predicted pixels require the encoder to allocate a larger bitrate, inter-predicted pixels a smaller one; the first non-IDR frame therefore needs a large bitrate. With a fixed total encoding bitrate, allocating a large bitrate to the first non-IDR frame reduces the bitrate the encoder can give to subsequent frames and degrades the overall video quality. To address this, the cloud server can discard the bitstream of the first non-IDR frame and encode the preceding image frame again as a second non-IDR frame, obtaining the bitstream of that second non-IDR frame. Because the repeated frame, i.e., the preceding frame, is encoded again and the two frames are identical, inter coding makes its bitrate very small; with a fixed total encoding bitrate, the encoder can therefore give more bitrate to subsequent frames and improve the overall video quality.
For example, FIG. 4 is a flowchart of still another image encoding method provided by an embodiment of this application, including the following steps:

S401: set i = 0, where i is a counter initialized during encoder initialization;

S402: initialize an image store whose capacity is the size of one image frame; after encoder initialization is complete, go to S403;

S403: the image capture side feeds the most recently captured image to the encoder;

S404: check i; if i is 0, go to S405; otherwise go to S409;

S405: encode the newly captured image as an IDR frame, obtaining a bitstream;

S406: update the image store to hold the newly captured image;

S407: increment the counter i by 1;

S408: the encoder transmits the bitstream above to the terminal device; go to S403;

S409: encode the newly captured image as a non-IDR frame, obtaining a bitstream and the intra-predicted pixel ratio, and add the newly captured image to the encoder's reference frame list;

S410: check whether the intra-predicted pixel ratio is greater than the preset threshold; if so, a scene switch occurred, go to S411; otherwise go to S406;

S411: discard the bitstream corresponding to the newly captured image;

S412: remove the newly captured image from the reference frame list, so that subsequent frames no longer refer to it for predictive coding;

S413: encode the image held in the image store as a non-IDR frame, obtaining a bitstream;

S414: reset the counter i to 0 and go to S408.

For example, suppose a video stream contains 5 image frames. Following the flow of FIG. 4, the cloud server first sets i = 0 and initializes the image store. When the first frame is captured, the encoder encodes it as an IDR frame, sends the bitstream to the terminal device, stores the frame in the image store, and sets i = i + 1, i.e., i = 1. When the second frame is captured, since i = 1 the encoder encodes it as a non-IDR frame, obtaining a bitstream and the intra-predicted pixel ratio of that non-IDR frame, and adds the frame to the reference frame list. If the ratio is greater than the preset threshold, a scene switch occurred: the bitstream corresponding to the second frame is discarded, the second frame is removed from the reference frame list, the first frame held in the image store is encoded as a non-IDR frame, the resulting bitstream is sent to the terminal device, and i is reset to 0. When the third frame is captured, since i = 0 the encoder encodes it as an IDR frame, sends the bitstream to the terminal device, stores the frame in the image store, and sets i = 1. When the fourth frame is captured, since i = 1 the encoder encodes it as a non-IDR frame, obtaining a bitstream and its intra-predicted pixel ratio, and adds the frame to the reference frame list; if the ratio is less than or equal to the preset threshold, no scene switch occurred, the image store is updated so that the third frame is replaced by the fourth frame, the fourth frame's bitstream is sent to the terminal device, and i becomes 2. When the fifth frame is captured, since i = 2 the encoder encodes it as a non-IDR frame, obtaining a bitstream and its intra-predicted pixel ratio, and adds the frame to the reference frame list; if the ratio is less than or equal to the preset threshold, no scene switch occurred, the image store is updated so that the fourth frame is replaced by the fifth frame, the fifth frame's bitstream is sent to the terminal device, and i becomes 3. All five frames have then been processed, and the procedure ends.
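The loop of FIG. 4 and the five-frame walkthrough above can be sketched as follows. This is a hypothetical illustration with the codec stubbed out: `encode_idr` and `encode_non_idr` stand in for a real encoder, and the intra-predicted pixel ratio of each non-IDR encode is supplied as an input rather than computed.

```python
THRESHOLD = 0.8  # example value; the text allows 60%-90%


def run_encoder(frames, intra_ratios, encode_idr, encode_non_idr):
    """frames: captured images in order; intra_ratios[i]: the intra-predicted
    pixel ratio reported when frames[i] is encoded as a non-IDR frame.
    Returns the list of packets transmitted to the terminal device."""
    i = 0                                      # S401: counter
    store = None                               # S402: one-frame image store
    ref_list = []                              # encoder reference frame list
    out = []
    for idx, frame in enumerate(frames):       # S403: newest captured image
        if i == 0:                             # S404
            out.append(encode_idr(frame))      # S405 + S408
            store = frame                      # S406
            i += 1                             # S407
            continue
        pkt = encode_non_idr(frame)            # S409
        ref_list.append(frame)                 # S409: add to reference list
        if intra_ratios[idx] > THRESHOLD:      # S410: scene switch detected
            # S411: pkt is discarded (never appended to out)
            ref_list.remove(frame)             # S412
            out.append(encode_non_idr(store))  # S413: re-encode stored frame
            i = 0                              # S414
        else:
            store = frame                      # S406
            i += 1                             # S407
            out.append(pkt)                    # S408
    return out
```

Replaying the five-frame example with a scene switch detected at the second frame yields the packet sequence IDR(f0), P(f0), IDR(f2), P(f3), P(f4): the second frame's packet is dropped, the first frame is repeated as a non-IDR frame, and the third frame becomes an IDR frame.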
In summary, in the embodiments of this application, when a scene switch is determined, besides encoding the following image frame as an IDR frame, the bitstream of the first non-IDR frame is discarded and the preceding image frame is encoded as a second non-IDR frame, obtaining its bitstream. With this repeat-frame encoding, since the repeated frames are identical, inter coding greatly reduces the bitrate; with a fixed total encoding bitrate, the encoder can give more bitrate to subsequent frames and improve the overall video quality.
It should be noted that the embodiments of this application are not limited to inserting IDR frames only at scene switches; they can also be combined with a fixed IDR insertion scheme to make IDR insertion more flexible.

For example, FIG. 5 is a flowchart of yet another image encoding method provided by an embodiment of this application. The method may be executed by a cloud server; for example, in a cloud gaming scenario the cloud server may be the cloud server 110 in FIG. 1. This application does not limit the executing entity of the method. As shown in FIG. 5, the method includes:

S510: acquire a first image frame;

S520: encode the first image frame into a first non-IDR frame;

S530: decide from the first non-IDR frame whether a scene switch occurred;

S540: if a scene switch is determined to have occurred, encode the image frame following the first image frame as an IDR frame, discard the bitstream corresponding to the first non-IDR frame, and encode the image frame preceding the first image frame as a second non-IDR frame;

S550: acquire a second image frame, where the second image frame is separated from the first image frame by a preset distance in the video stream and there is no other IDR frame between the second image frame and the first image frame;

S560: encode the second image frame as an IDR frame.

It should be understood that this example combines the embodiment of FIG. 3 with the fixed IDR insertion scheme; the embodiment of FIG. 2 can be combined with it in the same way, which is not repeated here.

The fixed IDR insertion scheme inserts an IDR frame every fixed number of image frames. For example, for a certain video stream the encoder may encode frame 1, frame 11 and frame 21 as IDR frames, i.e., insert one IDR frame every 10 frames, all other frames being non-IDR frames.

The second image frame above is the IDR frame inserted under the fixed scheme, namely the IDR frame following the IDR frame corresponding to the first image frame. The preset distance is the fixed interval at which IDR frames are inserted.

For example, suppose the target sequence contains 100 image frames and an IDR frame is inserted every 10 frames. If analysis of the non-IDR frame for frame 80 determines that a scene switch occurred, frame 81 can be encoded as an IDR frame, and under the fixed scheme frame 91 should then be an IDR frame.

In summary, the embodiments of this application are not limited to inserting IDR frames only at scene switches; combining them with fixed IDR insertion increases the flexibility of IDR insertion.
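The interaction between the fixed schedule and scene-switch insertion can be sketched as follows; the 10-frame interval is the example value from the text, and the function name is illustrative.

```python
def next_fixed_idr(last_idr_index: int, interval: int = 10) -> int:
    """Frame index (1-based) of the next scheduled IDR frame, counted from
    the most recent IDR frame, whether that IDR was on the fixed schedule
    or was triggered by a scene switch."""
    return last_idr_index + interval
```

With frame 1 as an IDR frame, the schedule gives IDR frames at 11, 21, and so on; if a scene switch makes frame 81 an IDR frame, the next scheduled IDR moves to frame 91, as in the 100-frame example above.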
FIG. 6 is a flowchart of another image encoding method provided by an embodiment of this application. The method may be executed by a cloud server; for example, in a cloud gaming scenario the cloud server may be the cloud server 110 in FIG. 1. The embodiments of this application do not limit the executing entity of the method. As shown in FIG. 6, the method includes:

S610: acquire a first image frame in a video stream;

S620: encode the first image frame into a first non-IDR frame;

S630: check whether, within the preset image frame sequence to which the first image frame belongs, an image frame has already been encoded as an IDR frame because of a scene switch;

S640: if no frame in the preset image frame sequence has been encoded as an IDR frame because of a scene switch, decide from the first non-IDR frame whether a scene switch occurred;

S650: if a scene switch is determined to have occurred, encode the image frame following the first image frame as an IDR frame, discard the bitstream corresponding to the first non-IDR frame, and encode the image frame preceding the first image frame as a second non-IDR frame.

It should be understood that this embodiment refines the embodiment of FIG. 3; a similar refinement can be applied to the embodiment of FIG. 2 and is not repeated here.

In some embodiments, the preset image frame sequence to which the first image frame belongs may be negotiated between the cloud server and the terminal device, predefined, specified by the cloud server, or specified by the terminal device; the embodiments of this application do not limit this.

For example, suppose the cloud server and the terminal device agree that the scene-switch decision is made only once per 10 image frames. Then the cloud server makes the decision once for frames 1-10, once for frames 11-20, once for frames 21-30, and so on. Accordingly, for any frame among frames 1-10, the preset sequence it belongs to is the sequence formed by frames 1-10; for any frame among frames 11-20, the sequence formed by frames 11-20; for any frame among frames 21-30, the sequence formed by frames 21-30; and so on.

It should be noted that once an IDR frame has been inserted because of a scene switch within the preset sequence, the scene-switch decision is not made for the first non-IDR frame; conversely, if no such insertion has occurred within the preset sequence, the decision is made for the first non-IDR frame.

For example, with the once-per-10-frames agreement above, suppose that within frames 1-10 no decision was made for frame 1 but the scene-switch decision was made for frame 2; then no decision needs to be made for frames 3-10.

In summary, the technical solution of this embodiment reduces the insertion frequency of IDR frames. Since an IDR frame generally consumes a large bitrate, reducing the insertion frequency reduces bitrate consumption and thus improves video quality.
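The per-sequence gating described above can be sketched as follows, assuming the 10-frame sequences of the example; the names are illustrative.

```python
GROUP_SIZE = 10  # frames per preset image frame sequence (example value)


def sequence_id(frame_index: int) -> int:
    """Sequence a 1-based frame index belongs to: frames 1-10 -> 0,
    frames 11-20 -> 1, frames 21-30 -> 2, and so on."""
    return (frame_index - 1) // GROUP_SIZE


def should_check_scene_switch(frame_index: int, switched_sequences: set) -> bool:
    """Skip the scene-switch decision once a scene-switch IDR has already
    been inserted in this frame's preset sequence."""
    return sequence_id(frame_index) not in switched_sequences
```

In the example above, once frame 2 triggers a scene-switch IDR, the set of switched sequences contains sequence 0, so frames 3-10 skip the check.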
FIG. 7 is an interaction flowchart of a real-time communication method provided by an embodiment of this application. The method may be executed by a remote server and a terminal device, and includes:

S710: the terminal device sends user operation information to the cloud server;

S720: the cloud server generates a video stream in real time according to the user operation information;

S730: the cloud server acquires a first image frame in the video stream;

S740: the cloud server encodes the first image frame into a first non-IDR frame;

S750: the cloud server decides from the first non-IDR frame whether a scene switch occurred;

S760: if a scene switch is determined to have occurred, the cloud server encodes the image frame following the first image frame as an IDR frame, obtaining the encoded bitstream corresponding to that frame;

S770: the cloud server sends the encoded bitstream corresponding to that frame to the terminal device.

In some embodiments, in a real-time communication scenario the video stream is generated in real time; for example, in a cloud gaming scenario, once the cloud server obtains user operation information, such as joystick or button operations indicating the user's actions in the cloud game, it can generate the video stream in real time accordingly.

In some embodiments, operations on a joystick or button include up/down/left/right operations, used to control the movement of the virtual object displayed on the terminal or to make the virtual object perform corresponding actions, but are not limited thereto.

In some embodiments, user operation information corresponds to video data; that is, the user's control operations correspond to game pictures. For example, pressing a button may mean the controlled game character picks up a gun, which corresponds to certain video data; based on this correspondence, the cloud server obtains the video data and renders it into a video stream in real time.

In some embodiments, if a scene switch is determined, the cloud server discards the bitstream corresponding to the first non-IDR frame, encodes the image frame preceding the first image frame as a second non-IDR frame to obtain the encoded bitstream of that preceding frame, and sends that bitstream to the terminal device.

In some embodiments, the cloud server decides whether a scene switch occurred according to the intra-predicted pixel ratio of the first non-IDR frame. How the cloud server makes this decision, and how repeat-frame encoding works, are described above and not repeated here.

In some embodiments, the cloud server may further acquire a second image frame and encode it as an IDR frame, where the second image frame is separated from the first image frame by a preset distance in the video stream and there is no other IDR frame between the second image frame and the first image frame. This implementation is described above and not repeated here.

In some embodiments, before deciding from the first non-IDR frame whether a scene switch occurred, the cloud server further checks whether, within the preset image frame sequence to which the first image frame belongs, an image frame has already been encoded as an IDR frame because of a scene switch; correspondingly, if no frame in the preset sequence has been encoded as an IDR frame because of a scene switch, the cloud server may decide from the first non-IDR frame whether a scene switch occurred. This implementation is also described above and not repeated here.

With the real-time communication method provided by the embodiments of this application, no existing preprocessing step is needed; the scene-switch decision is made during encoding, and when a switch occurs the next image frame is encoded as an IDR frame. This reduces the latency of the whole image processing pipeline and thus the communication latency, improving the user experience.

Moreover, in this real-time communication method the encoding side uses repeat-frame encoding. Since the repeated frames are identical, inter coding greatly reduces the bitrate; with a fixed total encoding bitrate, the encoder can give more bitrate to subsequent frames and improve the overall video quality.

It should be understood that in a cloud gaming scenario the image encoding method above is practically meaningful only when the decoding side, i.e., the terminal device, is able to decode the bitstream of the video stream. A method for acquiring a target decoding configuration is therefore provided below.
FIG. 8 is a flowchart of a method for acquiring a target decoding configuration provided by an embodiment of this application. As shown in FIG. 8, the method includes:

S810: the cloud server sends a decoding capability request to the terminal device;

S820: the cloud server receives a decoding capability response from the terminal device, which includes the terminal device's decoding capability;

S830: the cloud server determines the target decoding configuration according to the terminal device's decoding capability, the cloud game type and the current network state;

S840: the cloud server sends the target decoding configuration to the terminal device;

S850: the terminal device decodes the bitstream of the video stream using the target decoding configuration.

Here the target decoding configuration may be the optimal decoding configuration. As shown in FIG. 9, the cloud server may send the decoding capability request to the terminal device through a client installed on the terminal device, and the terminal device may return the decoding capability response to the cloud server through the same client; correspondingly, the cloud server receives the decoding capability response, which includes the terminal device's decoding capability, and then determines the target decoding configuration (i.e., the optimal decoding configuration) according to that capability, the cloud game type and the current network state. In a cloud gaming scenario, the client may be a cloud gaming client.

In some embodiments, the decoding capability request is used to request the terminal device's decoding capability. It may include, but is not limited to, at least one of: a protocol version number and a decoding protocol query. The protocol version number is the minimum protocol version supported by the cloud server, and the protocol may be a decoding protocol. The decoding protocol query indicates the decoding protocol the cloud server wants to query, e.g., the video decoding protocol H264 or H265.

For example, the decoding capability request may be implemented as:

    [codec_ability]      ; codec capability
    version=1.0          ; minimum protocol version supported by the cloud server
    type=16,17           ; query H264 and H265 capability

The data structures in this code are explained in Table 1 and not repeated here. The data structure of the terminal device's decoding capability is shown in Table 1:

Table 1

[Table 1 is rendered as an image in the source document; its contents are not recoverable from this text.]
其中,在各解码协议定义见表2:
表2
Figure PCTCN2022135614-appb-000003
终端设备在各解码协议支持的Profile定义见表3:
表3
Figure PCTCN2022135614-appb-000004
终端设备在各解码协议支持的Level定义见表4:
表4
Figure PCTCN2022135614-appb-000005
Figure PCTCN2022135614-appb-000006
Figure PCTCN2022135614-appb-000007
终端设备所支持的Profile和Level以二元组的方式列出,如设备A支持H264能力:(Baseline,Level51),(Main,Level51),(High,Level51)。
在一些实施例中,解码能力响应除包括终端设备的解码能力之外,还可以包括:对于云端服务器所要查询的解码协议是否查询成功的标识、终端设备支持的协议版本号。
在一些实施例中,若云端服务器所要查询的解码协议查询成功,则该"是否查询成功"的标识可以用0表示;若查询失败,则该标识可以用错误码表示,如001等。
在一些实施例中,协议版本号指的是终端设备支持的最低协议版本,该协议可以是解码协议。
在一些实施例中,终端设备的解码能力包括以下至少一项,但不限于此:终端设备支持的解码协议类型、该解码协议支持的Profile、Level以及性能等。
示例1,解码能力响应的代码实现可以如下:
Figure PCTCN2022135614-appb-000008
Figure PCTCN2022135614-appb-000009
关于该代码中各个数据结构的解释可参考表1,本申请对此不再赘述。
示例2,若终端设备只支持部分解码协议,则返回支持的解码协议信息,这种情况的解码能力响应的代码实现可以如下:
Figure PCTCN2022135614-appb-000010
关于该代码中各个数据结构的解释可参考表1,本申请实施例对此不再赘述。
示例3,若终端设备不支持解码协议,则返回codecs=0,这种情况的解码能力响应的代码实现可以如下:
Figure PCTCN2022135614-appb-000011
关于该代码中各个数据结构的解释可参考表1,本申请对此不再赘述。
示例4,若对终端设备的解码能力请求失败,则返回错误码,这种情况的解码能力响应的代码实现可以如下:
[codec_ability]                ;编解码能力
state=-1                            ;查询失败返回状态码-1
version=0.9                      ;终端设备协议版本
关于该代码中各个数据结构的解释可参考表1,本申请对此不再赘述。
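综合上述示例1至示例4,解码端响应可按state字段区分查询成功与失败,并按codecs字段区分是否支持解码协议。下面是一个解析响应并分支处理的Python草图;字段名与取值沿用上文示例(state非0表示失败并携带错误码,codecs=0表示不支持任何解码协议),这些约定均为按示例推断的示意性约定:

```python
import configparser

def parse_codec_ability_response(text):
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    sec = parser["codec_ability"]
    state = int(sec.get("state", "0"))
    if state != 0:
        # 查询失败:仅返回错误码与终端设备协议版本(对应示例4)
        return {"ok": False, "error": state, "version": sec.get("version")}
    if sec.get("codecs") == "0":
        # 终端设备不支持任何所查询的解码协议(对应示例3)
        return {"ok": True, "codecs": []}
    # 查询成功,返回支持的解码协议列表(对应示例1、示例2)
    return {"ok": True, "codecs": sec.get("codecs", "").split(",")}

FAIL_TEXT = """
[codec_ability]   ; 编解码能力
state=-1          ; 查询失败返回状态码-1
version=0.9       ; 终端设备协议版本
"""
```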
在一些实施例中,云游戏类型越复杂,云端服务器在终端设备的解码能力范围之内所选择的解码能力越高,如在上述示例1中,选择profile3以及performances3。其中,云端服务器可以按照云游戏类型与终端设备的解码能力之间的映射关系选择目标解码配置,也可以按照其他选择规则来选择目标解码配置。
在一些实施例中,网络状态越差,云端服务器可以在终端设备的解码能力范围之内选择越高的解码能力,如在上述示例1中,选择profile3以及performances3。其中,云端服务器可以按照网络状态与终端设备的解码能力之间的映射关系选择目标解码配置,也可以按照其他选择规则来选择目标解码配置。
在一些实施例中,云端服务器可以按照云游戏类型、网络状态与终端设备的解码能力三者之间的映射关系选择目标解码配置,也可以按照其他选择规则来选择目标解码配置。
总之,本申请对如何确定目标解码配置不做限制。
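作为上述"在终端设备解码能力范围之内,按云游戏类型与网络状态选取目标解码配置"思路的一个示意,下面给出一个Python草图。其中能力档位列表、游戏类型与网络状态的映射表均为假设数据,专利并未限定具体的映射关系或选择规则:

```python
# 终端设备上报的解码能力档位,按能力从低到高排列(示意数据)
DEVICE_PROFILES = ["profile1", "profile2", "profile3"]

# 云游戏类型复杂度与网络状态到"能力档位索引"的示意映射
GAME_COMPLEXITY = {"casual": 0, "moba": 1, "fps": 2}
NETWORK_PENALTY = {"good": 0, "fair": 1, "poor": 2}

def choose_target_config(device_profiles, game_type, network_state):
    # 取云游戏复杂度与网络状态两者要求中的较高档位,
    # 并限制在终端设备解码能力范围之内
    want = max(GAME_COMPLEXITY[game_type], NETWORK_PENALTY[network_state])
    index = min(want, len(device_profiles) - 1)
    return device_profiles[index]
```

例如,复杂的fps类游戏在能力充足的设备上选到最高档,而只支持一个档位的设备无论如何只能选到该档位。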
综上,通过本实施例提供的技术方案,使得终端设备通过目标解码配置对视频流的码流进行解码,从而可以提高解码效果。
在一些实施例中,当云端服务器基于上述图像编码的实施例实现视频流的编码后,可以将编码得到的编码码流发送给终端设备,相应的,终端设备对接收到的编码码流进行解码。本申请实施例还提供一种图像解码方法:终端设备获取视频流对应的编码码流,该视频流包括第一图像帧及第一图像帧的下一图像帧;其中,编码码流包括:对第一图像帧编码得到的第一非IDR帧、以及根据第一非IDR帧确定发生了场景切换时,对第一图像帧的下一图像帧编码得到的IDR帧;终端设备对该编码码流进行解码,得到视频流并播放。
图10为本申请实施例提供的一种图像编码装置的示意图,如图10所示,该装置包括:获取模块1010和编码模块1020,其中,
获取模块1010,配置为获取视频流中的第一图像帧;
编码模块1020,配置为将第一图像帧编码为第一非IDR帧;
编码模块1020,还配置为若根据第一非IDR帧确定发生了场景切换,则将第一图像帧的下一图像帧编码为IDR帧。
在一些实施例中,该装置还包括:判断模块1030,配置为根据第一非IDR帧,判断是否发生了场景切换。
在一些实施例中,该装置还包括:丢弃模块1040,配置为若确定发生了场景切换,则丢弃第一非IDR帧对应的码流,编码模块1020还配置为将第一图像帧的上一图像帧编码为第二非IDR帧。
在一些实施例中,判断模块1030还配置为:确定第一非IDR帧的帧内预测像素比例;根据第一非IDR帧的帧内预测像素比例,判断是否发生了场景切换。
在一些实施例中,判断模块1030,还配置为:若第一非IDR帧的帧内预测像素比例大于预设阈值,则确定发生了场景切换;若第一非IDR帧的帧内预测像素比例小于或等于预设阈值,则确定未发生场景切换。
在一些实施例中,该装置还包括:加入模块1050和删除模块1060,其中,在编码模块1020将第一图像帧编码为第一非IDR帧之后,加入模块1050,配置为将第一图像帧加入参考帧列表中;若确定发生了场景切换,则删除模块1060,配置为将第一图像帧从参考帧列表中删除。
在一些实施例中,获取模块1010还配置为获取第二图像帧;编码模块1020还配置为将第二图像帧编码为IDR帧;其中,第二图像帧是视频流中与第一图像帧间隔预设距离,且第二图像帧与第一图像帧之间无其他IDR帧的图像帧。
在一些实施例中,判断模块1030还配置为:在根据第一非IDR帧判断是否发生了场景切换之前,获取所述第一图像帧所属的预设图像帧序列;判断在该预设图像帧序列内是否已存在基于场景切换将图像帧编码为IDR帧的情况;相应的,判断模块1030还配置为:若在预设图像帧序列内不存在该情况,则根据第一非IDR帧判断是否发生了场景切换。
在一些实施例中,第一图像帧是视频流中除第一个图像帧以及已被确定编码为IDR帧以外的任一图像帧。
在一些实施例中,该装置还包括:通信模块1070和确定模块1080,其中,通信模块1070配置为向终端设备发送解码能力请求;接收终端设备的解码能力响应,解码能力响应包括:终端设备的解码能力;确定模块1080,配置为根据终端设备的解码能力、云游戏类型和当前网络状态确定目标解码配置;通信模块1070,还配置为向终端设备发送目标解码配置,以使终端设备通过目标解码配置对视频流的码流进行解码。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。图10所示的装置可以执行上述图像编码方法实施例,并且装置中的各个模块的前述和其它操作和/或功能分别为了实现上述各个图像编码方法中的相应流程,为了简洁,在此不再赘述。
上文中结合附图从功能模块的角度描述了本申请实施例的装置。应理解,该功能模块可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件模块组合实现。本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。在一些实施例中,软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图11为本申请实施例提供的一种实时通信装置的示意图,如图11所示,该装置包括:处理模块1110和通信模块1120,其中,通信模块1120,配置为获取终端设备发送的用户操作信息;处理模块1110配置为:根据用户操作信息实时生成视频流,获取视频流中的第一图像帧,将第一图像帧编码为第一非IDR帧,若根据第一非IDR帧确定发生了场景切换,则将第一图像帧的下一图像帧编码为IDR帧,得到下一图像帧对应的编码码流;通信模块1120,还配置为将下一图像帧对应的编码码流发送给终端设备。
在一些实施例中,处理模块1110还配置为:若确定发生了场景切换,则丢弃第一非IDR帧对应的码流,并将第一图像帧的上一图像帧编码为第二非IDR帧,得到上一图像帧对应的编码码流;通信模块1120还配置为将上一图像帧对应的编码码流发送给终端设备。
在一些实施例中,处理模块1110还配置为:根据第一非IDR帧的帧内预测像素比例,确定是否发生了场景切换。
在一些实施例中,处理模块1110还配置为:若第一非IDR帧的帧内预测像素比例大于预设阈值,则确定发生了场景切换;若第一非IDR帧的帧内预测像素比例小于或等于预设阈值,则确定未发生场景切换。
在一些实施例中,处理模块1110还配置为获取用户操作信息与视频数据之间的映射关系;相应的,处理模块1110还配置为:根据用户操作信息和映射关系,实时获取用户操作信息对应的视频数据;对视频数据进行实时渲染,以得到视频流。
在一些实施例中,处理模块1110还配置为将第一图像帧加入参考帧列表中;若确定发生了场景切换,则将第一图像帧从参考帧列表中删除。
在一些实施例中,处理模块1110还配置为获取第二图像帧;将第二图像帧编码为IDR帧;其中,第二图像帧是视频流中与第一图像帧间隔预设距离,且第二图像帧与第一图像帧之间无其他IDR帧的图像帧。
在一些实施例中,处理模块1110还配置为判断在第一图像帧所属的预设图像帧序列内是否已存在基于场景切换将图像帧编码为IDR帧的情况;相应的,处理模块1110还配置为:若在预设图像帧序列内不存在该情况,则根据第一非IDR帧判断是否发生了场景切换。
在一些实施例中,第一图像帧是视频流中除第一个图像帧以及已被确定编码为IDR帧以外的任一图像帧。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。图11所示的装置可以执行上述实时通信方法实施例,并且装置中的各个模块的前述和其它操作和/或功能分别为了实现上述各个实时通信方法中的相应流程,为了简洁,在此不再赘述。
上文中结合附图从功能模块的角度描述了本申请实施例的装置。应理解,该功能模块可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件模块组合实现。本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。在实际应用中,软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图12是本申请实施例提供的电子设备的示意性框图。如图12所示,该电子设备可包括:存储器1210和处理器1220,该存储器1210,配置为存储计算机程序,并将该程序代码传输给该处理器1220。换言之,该处理器1220可以从存储器1210中调用并运行计算机程序,以实现本申请实施例中的方法。
例如,该处理器1220可配置为根据该计算机程序中的指令执行上述方法实施例。
在本申请的一些实施例中,该处理器1220可以包括但不限于:
通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
在本申请的一些实施例中,该存储器1210包括但不限于:
易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在本申请的一些实施例中,该计算机程序可以被分割成一个或多个模块,该一个或者多个模块被存储在该存储器1210中,并由该处理器1220执行,以完成本申请提供的方法。该一个或多个模块可以是能够完成特定功能的一系列计算机程序指令段,该指令段用于描述该计算机程序在该电子设备中的执行过程。
如图12所示,该电子设备还可包括:
收发器1230,该收发器1230可连接至该处理器1220或存储器1210。
其中,处理器1220可以控制该收发器1230与其他设备进行通信,具体地,可以向其他设备发送信息或数据,或接收其他设备发送的信息或数据。收发器1230可以包括发射机和接收机。收发器1230还可以包括天线,天线的数量可以为一个或多个。
应当理解,该电子设备中的各个组件通过总线系统相连,其中,总线系统除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。
本申请还提供了一种计算机存储介质,其上存储有计算机程序,该计算机程序被计算机执行时使得该计算机能够执行上述方法实施例的方法。或者说,本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得计算机执行上述方法实施例的方法。
当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。该计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的模块及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个模块或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。例如,在本申请各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所附权利要求的保护范围为准。

Claims (19)

  1. 一种图像编码方法,所述方法包括:
    获取视频流中的第一图像帧;
    将所述第一图像帧编码为第一非即时解码刷新(IDR)帧;
    若根据所述第一非IDR帧确定发生了场景切换,则将所述第一图像帧的下一图像帧编码为IDR帧。
  2. 根据权利要求1所述的方法,其中,所述方法还包括:
    若根据所述第一非IDR帧确定发生了场景切换,则丢弃所述第一非IDR帧对应的码流,并将所述第一图像帧的上一图像帧编码为第二非IDR帧。
  3. 根据权利要求1或2所述的方法,其中,所述方法还包括:
    获取所述第一非IDR帧的帧内预测像素比例;
    根据所述第一非IDR帧的帧内预测像素比例,确定是否发生了场景切换。
  4. 根据权利要求3所述的方法,其中,所述根据所述第一非IDR帧的帧内预测像素比例,确定是否发生了场景切换,包括:
    若所述第一非IDR帧的帧内预测像素比例大于预设阈值,则确定发生了场景切换;
    若所述第一非IDR帧的帧内预测像素比例小于或等于所述预设阈值,则确定未发生场景切换。
  5. 根据权利要求1或2所述的方法,其中,所述将所述第一图像帧编码为第一非即时解码刷新(IDR)帧之后,所述方法还包括:
    将所述第一图像帧加入参考帧列表中;
    若根据所述第一非IDR帧确定发生了场景切换,将所述第一图像帧从所述参考帧列表中删除。
  6. 根据权利要求1或2所述的方法,其中,所述方法还包括:
    获取所述视频流中的第二图像帧;
    将所述第二图像帧编码为IDR帧;
    其中,所述第二图像帧与所述第一图像帧间隔预设距离,且所述第二图像帧与所述第一图像帧之间无其他IDR帧。
  7. 根据权利要求1或2所述的方法,其中,所述将所述第一图像帧的下一图像帧编码为IDR帧之前,所述方法还包括:
    获取所述第一图像帧所属的预设图像帧序列;
    确定所述预设图像帧序列内是否存在基于场景切换,被编码为IDR的图像帧;
    若所述预设图像帧序列内不存在基于场景切换,被编码为IDR的图像帧,则根据所述第一非IDR帧,确定是否发生了场景切换。
  8. 根据权利要求1或2所述的方法,其中,所述第一图像帧是所述视频流中除第一个图像帧及已编码图像帧以外的任一图像帧;
    其中,所述已编码图像帧为已被确定编码为IDR帧的图像帧。
  9. 根据权利要求1或2所述的方法,其中,所述方法应用于云游戏场景,所述方法还包括:
    向终端设备发送解码能力请求;
    接收所述终端设备的解码能力响应,所述解码能力响应包括:所述终端设备的解码能力;
    根据所述终端设备的解码能力、云游戏类型和当前网络状态,确定目标解码配置;
    向所述终端设备发送所述目标解码配置,以使所述终端设备通过所述目标解码配置,对所述视频流的码流进行解码。
  10. 一种实时通信方法,所述方法包括:
    获取终端设备发送的用户操作信息;
    根据所述用户操作信息实时生成视频流;
    获取所述视频流中的第一图像帧;
    将所述第一图像帧编码为第一非IDR帧;
    若根据所述第一非IDR帧确定发生了场景切换,则将所述第一图像帧的下一图像帧编码为IDR帧,得到所述下一图像帧对应的编码码流;
    将所述下一图像帧对应的编码码流发送给所述终端设备。
  11. 根据权利要求10所述的方法,其中,所述方法还包括:
    若根据所述第一非IDR帧确定发生了场景切换,则丢弃所述第一非IDR帧对应的码流,并将所述第一图像帧的上一图像帧编码为第二非IDR帧,得到所述上一图像帧对应的编码码流;
    将所述上一图像帧对应的编码码流,发送给所述终端设备。
  12. 根据权利要求10或11所述的方法,其中,所述方法还包括:
    获取所述第一非IDR帧的帧内预测像素比例;
    根据所述第一非IDR帧的帧内预测像素比例,确定是否发生了场景切换。
  13. 根据权利要求10或11所述的方法,其中,所述根据所述用户操作信息实时生成视频流之前,所述方法还包括:
    获取所述用户操作信息与视频数据之间的映射关系;
    所述根据所述用户操作信息实时生成视频流,包括:
    根据所述用户操作信息和所述映射关系,实时获取所述用户操作信息对应的所述视频数据;
    对所述视频数据进行实时渲染,得到所述视频流。
  14. 一种图像解码方法,所述方法包括:
    获取视频流对应的编码码流,所述视频流包括第一图像帧及所述第一图像帧的下一图像帧;
    其中,所述编码码流包括:对所述第一图像帧编码得到的第一非IDR帧、以及根据所述第一非IDR帧确定发生了场景切换时,对所述第一图像帧的下一图像帧编码得到的IDR帧;
    对所述编码码流进行解码,得到所述视频流。
  15. 一种图像编码装置,所述装置包括:获取模块、编码模块;
    所述获取模块,配置为获取视频流中的第一图像帧;
    所述编码模块,配置为将所述第一图像帧编码为第一非IDR帧;
    所述编码模块,还配置为若根据所述第一非IDR帧确定发生了场景切换,则将所述第一图像帧的下一图像帧编码为IDR帧。
  16. 一种实时通信装置,包括:处理模块和通信模块;
    所述通信模块,配置为获取终端设备发送的用户操作信息;
    所述处理模块,配置为根据所述用户操作信息实时生成视频流,获取所述视频流中的第一图像帧,将所述第一图像帧编码为第一非IDR帧,若根据所述第一非IDR帧确定发生了场景切换,则将所述第一图像帧的下一图像帧编码为IDR帧,得到所 述下一图像帧对应的编码码流;
    所述通信模块,还配置为将所述下一图像帧对应的编码码流,发送给所述终端设备。
  17. 一种电子设备,包括:处理器和存储器;
    所述存储器,配置为存储计算机程序;
    所述处理器,配置为调用并运行所述存储器中存储的计算机程序,以执行权利要求1至14中任一项所述的方法。
  18. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序由处理器执行时,实现如权利要求1至14中任一项所述的方法。
  19. 一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机指令,所述计算机指令存储在计算机可读存储介质中,处理器从所述计算机可读存储介质读取并执行所述计算机指令,以实现如权利要求1至14中任一项所述的方法。
PCT/CN2022/135614 2022-01-27 2022-11-30 图像编码方法、实时通信方法、设备、存储介质及程序产品 WO2023142662A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/334,441 US20230328259A1 (en) 2022-01-27 2023-06-14 Image encoding method, real-time communication method, device, storage medium, and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210103019.3A CN116567243A (zh) 2022-01-27 2022-01-27 图像编码方法、实时通信方法、设备及存储介质
CN202210103019.3 2022-01-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/334,441 Continuation US20230328259A1 (en) 2022-01-27 2023-06-14 Image encoding method, real-time communication method, device, storage medium, and program product

Publications (1)

Publication Number Publication Date
WO2023142662A1 true WO2023142662A1 (zh) 2023-08-03

Family

ID=87470344

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135614 WO2023142662A1 (zh) 2022-01-27 2022-11-30 图像编码方法、实时通信方法、设备、存储介质及程序产品

Country Status (3)

Country Link
US (1) US20230328259A1 (zh)
CN (1) CN116567243A (zh)
WO (1) WO2023142662A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100054329A1 (en) * 2008-08-27 2010-03-04 Novafora, Inc. Method and System for Encoding Order and Frame Type Selection Optimization
CN102630013A (zh) * 2012-04-01 2012-08-08 北京捷成世纪科技股份有限公司 基于场景切换的码率控制视频压缩方法和装置
US9788077B1 (en) * 2016-03-18 2017-10-10 Amazon Technologies, Inc. Rendition switching
CN111970510A (zh) * 2020-07-14 2020-11-20 浙江大华技术股份有限公司 视频处理方法、存储介质及计算装置
CN113766226A (zh) * 2020-06-05 2021-12-07 深圳市中兴微电子技术有限公司 图像编码方法、装置、设备及存储介质


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Master Thesis", 15 March 2011, UNIVERSITY OF ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA, CN, article XIONG, RONG: "A H.264 Rate Control Algorithm and Its Application and Research in Scene Switching", pages: 1 - 69, XP009547655 *

Also Published As

Publication number Publication date
CN116567243A (zh) 2023-08-08
US20230328259A1 (en) 2023-10-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923453

Country of ref document: EP

Kind code of ref document: A1