WO2023061129A1 - Video encoding method and apparatus, device, and storage medium - Google Patents

Info

Publication number
WO2023061129A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
target image
image frame
encoding
frame type
Application number
PCT/CN2022/118265
Other languages
French (fr)
Chinese (zh)
Inventor
包佳晶
张樱凡
Original Assignee
百果园技术(新加坡)有限公司
包佳晶
Application filed by 百果园技术(新加坡)有限公司 and 包佳晶

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Definitions

  • The present application relates to the technical field of digital signal processing, and in particular to a video encoding method, apparatus, device, and storage medium.
  • Before video data is transmitted, the encoding device needs to encode the video image frames, compressing their data volume in order to reduce the amount of transmitted data and shorten the transmission time.
  • the encoding device encodes the target image frame according to the determined frame type and encoding parameters to obtain a code stream corresponding to the target image frame.
  • the frame type and encoding parameters of the target image frame are predicted, and the encoding is performed according to the prediction result, so that stable video playback quality can be obtained.
  • Embodiments of the present application provide a video encoding method, device, equipment, and storage medium.
  • the technical solution is as follows:
  • a video coding method is provided, the method is executed by a computer device, and the method includes:
  • the first method refers to a method of determining the frame type and encoding parameters without reference to the reference image frames.
  • a video encoding device includes:
  • An image frame acquisition module configured to acquire a target image frame to be encoded
  • A first encoding determination module configured to determine the frame type and encoding parameters of the target image frame according to a first method when the number of reference image frames of the target image frame does not meet the analysis condition; wherein the first method refers to a method of determining the frame type and encoding parameters without reference to the reference image frames;
  • An image frame encoding module configured to encode the target image frame according to the frame type and encoding parameters of the target image frame.
  • a computer device, including a processor and a memory, wherein a computer program is stored in the memory, and the processor executes the computer program to implement the above video encoding method.
  • a computer-readable storage medium where a computer program is stored in the storage medium, and the computer program is used to be executed by a processor, so as to implement the above video encoding method.
  • a computer program product which, when the computer program product is run on a computer device, causes the computer device to execute the above video encoding method.
  • When the number of reference image frames of the target image frame to be encoded does not meet the analysis condition, the target image frame is encoded directly, without waiting for the number of reference image frames to meet the analysis requirement. This eliminates the time spent waiting for multiple other reference image frames of the target image frame and thereby reduces the encoding delay.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of a video coding method provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a video encoding method provided by another embodiment of the present application.
  • FIG. 4 is a schematic diagram of a video encoding method provided by another embodiment of the present application.
  • FIG. 5 is a schematic diagram of a video encoding method provided by another embodiment of the present application.
  • FIG. 6 is a schematic diagram of a video encoding process provided by another embodiment of the present application.
  • FIG. 7 is a block diagram of a video encoding device provided by an embodiment of the present application.
  • Fig. 8 is a block diagram of a video encoding device provided by another embodiment of the present application.
  • a frame refers to a still picture in a video.
  • FPS: Frames Per Second, frame rate
  • frame rate refers to the number of frames to refresh the picture per second, or the number of times the graphics processor refreshes per second. The higher the frame rate, the more pictures are refreshed in one second, the more realistic the video picture, and the smoother the action in the video.
  • DR: Data Rate, code stream
  • Video coding refers to the process of removing redundant information in video images through specific compression techniques.
  • GOP: Group of Pictures, key frame period
  • I frame: Intra Picture, intra-coded picture
  • QP: Quantizer Parameter, quantization parameter
  • The Y (Luminance, brightness), U (Chrominance, chroma), and V (Chroma, chroma) signals are introduced and explained below.
  • YUV is a digital representation of color
  • Y is used to represent the grayscale value of the pixel
  • U and V are used to represent the color of the pixel, including its hue and saturation. Since the human eye is less sensitive to changes in hue than to changes in brightness, greater compression can be applied in the chrominance dimension during video encoding, so that the code stream obtained after video encoding is smaller.
  • sampling formats of YUV images include:
  • the three components of each pixel are complete (each component is usually 8 bits), and each uncompressed pixel takes 3 bytes.
  • each macropixel consisting of two horizontally adjacent pixels requires 4 bytes of memory (2 bytes for brightness and 1 byte for each of the two chroma).
  • each scan line includes a chrominance component stored at a sampling rate of 2:1.
  • Adjacent scan lines store different chroma components, for example, if the first line is 4:2:0, the second line is 4:0:2, the third line is 4:2:0, and so on.
  • the sampling rate in the horizontal and vertical directions is 2:1, which is equivalent to a chroma sampling rate of 4:1.
  • each macropixel consisting of 2x2 adjacent pixels in 2 rows and 2 columns requires 6 bytes of memory (4 bytes for brightness, 1 byte for each of the two chroma ).
  • the code stream after sampling is: Y0 U0; Y1; Y2 U2; Y3; Y5 V5; Y6; Y7 V7; Y8.
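  • The byte counts given above for the three sampling formats can be checked with a short calculation. The sketch below is illustrative only: it assumes 8-bit samples and assumes the three descriptions correspond to the formats commonly labeled 4:4:4, 4:2:2, and 4:2:0 (these labels, the function name, and the 1280x720 frame size are not taken from the text).

```python
from fractions import Fraction

# Bytes per pixel for 8-bit samples, derived from the macropixel sizes in the text.
# The format labels are assumptions; the text gives only the byte counts.
BYTES_PER_PIXEL = {
    "4:4:4": Fraction(3, 1),  # 3 bytes for every pixel
    "4:2:2": Fraction(4, 2),  # 4 bytes per 2 horizontally adjacent pixels
    "4:2:0": Fraction(6, 4),  # 6 bytes per 2x2 block of pixels
}


def frame_size_bytes(width: int, height: int, sampling: str) -> int:
    """Return the uncompressed size of one frame (even width and height assumed)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for chroma subsampling")
    return int(width * height * BYTES_PER_PIXEL[sampling])


for name in BYTES_PER_PIXEL:
    print(name, frame_size_bytes(1280, 720, name))
```

  • Under these assumptions, the 2x2 subsampled format needs half the memory of the fully sampled one before any further compression is applied.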
  • Frame types mainly include the following categories:
  • I frame also known as a key frame
  • the I frame is used to represent the details of the image background and the moving subject, and the amount of information occupied by the data is relatively large.
  • the compression rate of the I frame is the smallest.
  • the I frame does not depend on other frames for decoding, and can be used as a reference image frame in the decoding process of other frame types.
  • the I frame is the basic frame of the GOP (the first frame in the GOP), and each GOP contains one I frame.
  • An IDR frame is a kind of I frame. Before decoding an IDR frame, the decoding device clears the forward and backward reference buffers (a reference buffer stores image frames that serve as references during decoding), so that no frame after the IDR frame can be decoded with reference to any frame before the IDR frame.
  • the IDR frame can prevent the transmission of wrong image frames from affecting the decoding results of other frames for a long time.
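  • The effect of an IDR frame on the reference buffer can be illustrated with a small model. This is a simplified sketch rather than a real decoder: the class name, the single shared buffer, and the rule that only IDR, I, and P frames are kept as references are assumptions made for illustration.

```python
class ReferenceBuffer:
    """Toy model of a decoder-side reference buffer (illustrative only)."""

    def __init__(self):
        self.frames = []  # sequence numbers of frames usable as references

    def decode(self, sequence_number: int, frame_type: str) -> list:
        """Return the references available to this frame, then update the buffer."""
        if frame_type == "IDR":
            # An IDR frame clears the buffer before it is decoded.
            self.frames.clear()
        available = list(self.frames)
        if frame_type in ("IDR", "I", "P"):
            # Assumption: only these frame types are stored as references.
            self.frames.append(sequence_number)
        return available


buffer = ReferenceBuffer()
stream = [(1, "IDR"), (2, "P"), (3, "P"), (4, "IDR"), (5, "P")]
for number, kind in stream:
    print(number, kind, buffer.decode(number, kind))
```

  • In this model, frame 5 can only see frame 4, so an error in frames 1 to 3 cannot propagate past the second IDR frame.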
  • P frame: Predictive-coded Picture, forward predictive coded image
  • the difference information between the current image frame and the previous frames is recorded in the P frame.
  • the compression rate of the P frame is relatively high.
  • A B frame records the differences between the current frame and both the preceding and following frames, and has the highest compression rate in the encoding process.
  • When decoding a B frame, it is necessary to refer to a forward and/or backward I frame or P frame, where forward refers to an image frame that appears earlier than the current frame on the time axis.
  • Video coding includes predictive coding, transform coding, quantization and entropy coding, wherein predictive coding includes intra-frame predictive coding and inter-frame predictive coding.
  • Intra-frame predictive coding uses the correlation of the video space domain to compare adjacent macroblocks in the same frame.
  • Inter-frame predictive coding exploits temporal-spatial correlation of video.
  • the inter-frame predictive coding uses the pixels of the adjacent coded image frame to predict the corresponding pixel of the current image frame after coding, and obtains the residual signal.
  • Inter-frame predictive coding includes: motion estimation, MV prediction and weighted prediction, motion compensation, etc.
  • Motion estimation matches each macroblock of the current image frame with the closest macroblock in the reference image frame: it searches the macroblocks of the reference image frame and uses a matching criterion, such as minimum mean square error or minimum mean absolute error, to determine the one most similar to the macroblock in the current image frame.
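  • Macroblock matching with the two criteria mentioned above can be sketched as an exhaustive search over a small window. The sketch is a minimal illustration, not the method of the application: frames are plain lists of rows, and the function names, block size, and search range are hypothetical.

```python
def block(frame, top, left, size):
    """Cut a size x size macroblock out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def mean_squared_error(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def mean_absolute_error(a, b):
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def estimate_motion(current, reference, top, left, size, search,
                    cost=mean_squared_error):
    """Return (error, dy, dx) of the best-matching reference block in the window."""
    target = block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= y <= height - size and 0 <= x <= width - size:
                error = cost(target, block(reference, y, x, size))
                if best is None or error < best[0]:
                    best = (error, dy, dx)
    return best


# Reference frame with unique pixel values; the current block at (2, 2)
# is a copy of the reference block at (3, 4).
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
current = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        current[2 + r][2 + c] = reference[3 + r][4 + c]

error, dy, dx = estimate_motion(current, reference, top=2, left=2, size=2, search=2)
print(dy, dx, error)
```

  • The returned offset (dy, dx) plays the role of the motion vector; passing cost=mean_absolute_error switches the matching criterion.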
  • Quantization refers to the processing of time-discrete signals to make them discrete in amplitude. Quantization can be divided into uniform quantization and non-uniform quantization, among which non-uniform quantization is better for small signals.
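  • As a minimal illustration of uniform quantization, the sketch below maps each sample to the nearest multiple of a fixed step and back. The function names and the step size are assumptions chosen for illustration; non-uniform quantization would use unequal steps instead.

```python
import math


def quantize(value: float, step: float) -> int:
    """Map a continuous amplitude to the index of the nearest quantization level."""
    return math.floor(value / step + 0.5)


def dequantize(level: int, step: float) -> float:
    """Reconstruct the amplitude represented by a quantization level."""
    return level * step


step = 2.0
for sample in (0.4, 3.1, 7.3, -2.6):
    level = quantize(sample, step)
    print(sample, level, dequantize(level, step))
```

  • With a uniform step, the reconstruction error is at most half a step for every sample, regardless of the sample's magnitude.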
  • FIG. 1 shows a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • the implementation environment may include an encoding device 10 and a decoding device 20 .
  • the coding device 10 is a device having functions such as video coding, video data storage, and video data transceiving.
  • the encoding device 10 may be a device such as a computer, a mobile phone, a tablet computer, a smart TV, a video camera, or a vehicle system.
  • the encoding device 10 encodes a certain target image frame to be encoded.
  • Optionally, the encoding device 10 directly specifies the encoding parameters of the target image frame and encodes the target image frame according to the specified encoding parameters; optionally, the encoding device 10 first predicts the frame type and encoding parameters of the target image frame according to other reference image frames, and then encodes the target image frame according to the encoding parameters.
  • the encoding device 10 encodes the target image frame and sends the code stream to the decoding device 20 in the form of a video data stream.
  • the decoding device 20 is a device having functions such as video decoding and video data sending and receiving.
  • the decoding device may be a server or a terminal device, and the server may be a background server running a target application program.
  • the target application program includes a video application program, a live broadcast application program, a social application program with a video communication function, etc., and the type of the target application program is not limited here.
  • the decoding device 20 may be a server, or a server cluster composed of multiple servers, or a cloud computing service center.
  • FIG. 2 shows a flow chart of a video coding method provided by an embodiment of the present application.
  • the execution subject of each step of the method may be an encoding device, and the method may include at least one of the following steps (210-230):
  • Step 210 acquiring a target image frame to be encoded.
  • An image frame is a still picture, and multiple image frames are switched continuously to form a video.
  • the image frame is a signal in YUV format. Coding refers to the process in which a coding device compresses a picture in one format to obtain a picture in another format. The data volume of the encoded picture is smaller than the data volume of the original picture before encoding.
  • the encoding device acquires the target image frame to be encoded, and encodes the acquired target image frame.
  • the encoding device acquires the target image frame from a certain video file in the video database; optionally, the encoding device acquires the target image frame through a device with a video frame acquisition function such as a camera.
  • the target image frame can be a frame in a video file, or a frame of a scene captured by a screen capture device during a live video broadcast or a video call, which is not limited here.
  • In the application scenario of video playback, the encoding device can be deployed on the background server of the target application program, and the encoding device obtains a target image frame to be encoded from the server's database, where the target image frame comes from the video file corresponding to the received play instruction.
  • the camera captures a frame of live images every 25 ms, and the encoding device obtains the target image frame to be encoded through the camera.
  • Step 220, in the case that the number of reference image frames of the target image frame does not meet the analysis condition, determine the frame type and encoding parameters of the target image frame according to the first method; wherein the first method refers to a method of determining the frame type and encoding parameters without referring to the reference image frames.
  • the reference image frame and the target image frame are image frames in the same video file, or image frames collected by the same image acquisition device within a certain time interval.
  • the reference image frame may be an image frame that has been encoded, or an image frame that has not been encoded.
  • the reference image frame and the target image frame have part of the same features, and the target image frame can be encoded according to the information in the reference image frame.
  • the reference image frame is stored in the encoding device.
  • the analysis condition refers to the condition for judging the frame type of the target image frame and the way of determining the coding parameters.
  • the frame type is used to represent the compressibility of the target image frame.
  • the frame type includes I frame, P frame and B frame.
  • the coding parameters include code rate control parameters, and the coding parameters are used to keep the code rate stable during the coding process and reduce the distortion rate of the image frame during the coding process.
  • the analysis condition includes that the number of reference image frames of the target image frame is greater than or equal to a first threshold.
  • the encoding device determines how to encode the acquired target image frame according to the number of reference image frames in the encoding device. If the number of reference image frames of the target image frame does not meet the analysis condition, the encoding device adopts the first method to determine the frame type and encoding parameters of the target image frame; otherwise, if the number of reference image frames of the target image frame satisfies the analysis condition, the encoding device uses a method other than the first method to determine the frame type and encoding parameters of the target image frame.
  • after the target image frame to be encoded is obtained, the method further includes: adding the target image frame to the pre-coding queue; when the length of the pre-coding queue is less than the rated length, determining that the number of reference image frames of the target image frame does not meet the analysis condition; when the length of the pre-coding queue is equal to the rated length, determining that the number of reference image frames of the target image frame meets the analysis condition.
  • the length of the pre-coding queue refers to the number of image frames stored in the pre-coding queue.
  • the rated length is equal to the size of the pre-coding queue, where the size of the pre-coding queue refers to the maximum number of image frames that the predefined queue can hold.
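  • The queue-length check described above can be sketched as follows. This is a minimal illustration with hypothetical class and function names; how frames leave the queue is not described in this passage, so the sketch only grows the queue up to its rated length.

```python
class PreCodingQueue:
    """Holds frames awaiting encoding; they serve as reference image frames."""

    def __init__(self, rated_length: int):
        self.rated_length = rated_length  # maximum number of frames the queue holds
        self.frames = []

    def add(self, frame) -> None:
        if len(self.frames) >= self.rated_length:
            raise OverflowError("pre-coding queue is already at its rated length")
        self.frames.append(frame)

    def meets_analysis_condition(self) -> bool:
        # The condition is met only when the queue length equals the rated length.
        return len(self.frames) == self.rated_length


def choose_method(queue: PreCodingQueue) -> str:
    """Pick the first method while the queue is still shorter than its rated length."""
    return "other" if queue.meets_analysis_condition() else "first"


queue = PreCodingQueue(rated_length=3)
decisions = []
for frame in ("frame-1", "frame-2", "frame-3"):
    queue.add(frame)
    decisions.append(choose_method(queue))
print(decisions)
```

  • The first frames are therefore handled by the first method immediately instead of waiting for the queue to fill.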
  • the encoding device adopts a first method to determine the frame type and encoding parameters of the acquired target image frame.
  • after the encoding device acquires the target image frame to be encoded, it adds the target image frame to a pre-encoding queue, and the multiple image frames stored in the pre-encoding queue are used as reference image frames of the target image frame.
  • the encoding device stores the frame type and encoding parameters of the target image frame to be encoded in the pre-encoding queue. For example, the encoding device stores the target image frame together with its frame type and encoding parameters in the same unit of the pre-encoding queue; for another example, the encoding device stores the frame type and encoding parameters of a target image frame in a separate unit of the pre-encoding queue that has a mapping relationship with the unit storing the target image frame, and the encoding device determines the frame type and encoding parameters corresponding to the target image frame through the mapping relationship.
  • Step 230 Encode the target image frame according to the frame type and encoding parameters of the target image frame.
  • the encoding device encodes the target image frame according to the frame type and encoding parameters of the target image frame, obtains a code stream corresponding to the target image frame, and outputs the code stream corresponding to the target image frame.
  • For how the encoding device encodes a target image frame according to its frame type and encoding parameters, please refer to the relevant background introduction; details are not repeated here.
  • For a target image frame to be encoded whose number of reference image frames does not meet the analysis condition, this method determines the frame type and encoding parameters through the first method. Because the first method does not need to refer to the reference image frames of the target image frame, it eliminates the time the encoding device would otherwise spend waiting to acquire a sufficient number of reference image frames. The method does not introduce many parameters or complex calculations, and it can effectively reduce the encoding delay.
  • FIG. 3 shows a flowchart of a video encoding method provided by another embodiment of the present application.
  • Step 310 acquiring a target image frame to be encoded.
  • after the encoding device obtains the target image frame to be encoded, the method further includes: adding the target image frame to the pre-encoding queue; when the length of the pre-encoding queue is less than the rated length, the encoding device determines that the number of reference image frames of the target image frame does not meet the analysis condition; when the length of the pre-encoding queue is equal to the rated length, the encoding device determines that the number of reference image frames of the target image frame meets the analysis condition.
  • Step 320 if the number of reference image frames of the target image frame does not satisfy the analysis condition, determine a predefined frame type as the frame type of the target image frame.
  • the predefined frame type is stored in the encoding device, or the predefined frame type is determined according to a picture capture tool or the video to be encoded.
  • the predefined frame type can be determined according to actual application conditions such as network transmission delay, encoding capability of encoding equipment, and video playback quality requirements, and is not limited here in this application.
  • the encoding device determines the frame type of the target image frame according to a predefined frame type.
  • in a possible implementation manner, step 320 includes step 322: obtaining a predefined frame type arrangement, where the predefined frame type arrangement refers to a predetermined arrangement of the frame types respectively corresponding to a plurality of image frames; and determining the frame type of the target image frame from the predefined frame type arrangement based on the sequence number of the target image frame.
  • in another possible implementation manner, step 320 includes step 324: obtaining predefined frame type configuration information, where the predefined frame type configuration information includes the configured quantity of at least one frame type; and determining the frame type of the target image frame based on the frame types of the encoded image frames and the predefined frame type configuration information.
  • a predefined arrangement of frame types is stored in the encoding device.
  • at least one of I frame, P frame and B frame is included in the predefined frame type configuration information, for example, the predefined frame type arrangement may include: IIIIIIIIII, IPPPIPPPIP, IBPBPIBPP, etc.
  • the arrangement of predefined frame types can be set according to experience, actual video playback requirements and other factors, which is not limited in this application.
  • the sequence number of the target image frame is determined according to the order in which the encoding device acquires the target image frames. For example, if a certain target image frame is the ninth image frame acquired by the encoding device, the sequence number of the target image frame is 9. In some embodiments, after the encoding device acquires the target image frame, it acquires a predefined frame type arrangement, and the encoding device determines the frame type of the target image according to the predefined frame type arrangement and the sequence number of the target image.
  • For example, the encoding device acquires a target image frame to be encoded, and the target image frame is the first target image frame acquired by the encoding device, so the sequence number of the target image frame is 1 and the number of reference image frames of the target image frame does not meet the analysis condition. The predefined frame type arrangement stored in the encoding device is IPPBPP, and the encoding device determines, according to the sequence number of the target image frame and the predefined frame type arrangement, that the frame type of the target image frame is an I frame.
  • the length of the predefined frame type arrangement is determined according to the length of the precoding queue. For example, if the size of the precoding queue is 10 frames, the length of the predefined frame type arrangement is greater than or equal to 10.
  • the sequence number of the target image is determined according to the number of image frames stored in the pre-encoding queue.
  • the encoding device adds the target image frame to the pre-encoding queue after acquiring a target image frame to be encoded; upon judging that the length of the pre-encoding queue is less than the rated length, the encoding device determines the frame type and encoding parameters of the target image frame through the predefined frame type arrangement and the sequence number of the target image frame.
  • For example, the predefined frame type arrangement obtained by the encoding device is IBPBPIBPP. After the encoding device acquires a target image frame to be encoded, the number of image frames stored in the pre-encoding queue is 4, so the sequence number of the target image frame is 4 and its frame type is B frame.
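  • Looking up the frame type from a predefined arrangement by sequence number can be sketched as a simple index operation. The function name is hypothetical, and the sketch assumes sequence numbers start at 1, as in the examples above.

```python
def frame_type_from_arrangement(arrangement: str, sequence_number: int) -> str:
    """Return the frame type at a 1-based sequence number in the arrangement."""
    if not 1 <= sequence_number <= len(arrangement):
        raise ValueError("sequence number is outside the predefined arrangement")
    return arrangement[sequence_number - 1]


print(frame_type_from_arrangement("IBPBPIBPP", 4))
print(frame_type_from_arrangement("IPPBPP", 1))
```

  • Because the lookup uses only the sequence number, no reference image frame has to be inspected before the frame type is known.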
  • the frame type of the encoded image frame refers to the frame type of the image frame whose encoding process is completed by the encoding device.
  • the encoding device counts the frame types of encoded image frames through variables. For example, a first variable, a second variable, and a third variable are set in the encoding device, where the first variable records the number of encoded image frames whose frame type is I frame, the second variable records the number of encoded image frames whose frame type is P frame, and the third variable records the number of encoded image frames whose frame type is B frame. After the encoding device encodes the target image frame, it updates the value of the variable corresponding to the frame type. For example, the current value of the second variable is 3, and the value of the second variable is updated to 4 after the encoding device encodes a target image frame whose frame type is P frame.
  • the predefined frame type configuration information is determined by the encoding device according to requirements such as encoding quality and video playback quality. The method of determining the predefined frame type configuration information depends on the actual situation and is not limited in this application.
  • For example, the predefined frame type configuration information includes 5 I frames, and the encoded image frames include 1 I frame. The encoding device acquires a target image frame to be encoded and stores it in the pre-coding queue. When the number of image frames in the pre-coding queue is smaller than the size of the pre-coding queue, the encoding device determines that the frame type of the target image frame is an I frame based on the preconfigured frame type configuration information.
  • when the encoding device stores an acquired target image frame in the pre-encoding queue, the encoding device determines the frame type of the target image frame according to the preconfigured frame type configuration information, the frame types of the encoded image frames, and the length of the pre-encoding queue. For example, the preconfigured frame type configuration information includes 3 I frames, the encoded image frames include 2 I frames, the length of the pre-coding queue is 8, and the size of the pre-coding queue is 10; the encoding device determines, according to the above information, that the frame type of the target image frame is an I frame.
  • the predefined frame type configuration information includes the quantity of at least one frame type and the ratio between the quantities of two different frame types. For example, a certain predefined frame type configuration information specifies 3 I frames and that the number of I frames divided by the number of P frames is less than or equal to 4.
  • For example, in consideration of factors such as the compression rate of video encoding, the encoding device configures the predefined frame type configuration information to include 3 I frames, with the number of I frames divided by the number of P frames less than or equal to 4. The encoded image frames include 3 I frames and 6 P frames. The encoding device acquires a target image frame to be encoded, and when the number of reference image frames of the target image frame does not meet the analysis condition, it determines, based on the above information, that the frame type of the target image frame is a P frame.
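  • One possible reading of the configuration-based examples is a quota rule: keep choosing I frames until the configured number of I frames has been encoded, then choose P frames. The sketch below implements only that reading; the rule itself, the function names, and the omission of the ratio constraint and of B frames are assumptions, not statements of the application.

```python
from collections import Counter


def next_frame_type(i_frame_quota: int, encoded_counts: Counter) -> str:
    """Choose I until the configured I-frame quantity is reached, then P (assumed rule)."""
    return "I" if encoded_counts["I"] < i_frame_quota else "P"


def record_encoded(encoded_counts: Counter, frame_type: str) -> None:
    """Update the per-type counter after a frame has been encoded."""
    encoded_counts[frame_type] += 1


print(next_frame_type(5, Counter(I=1)))        # 5 configured, 1 encoded
print(next_frame_type(3, Counter(I=2)))        # 3 configured, 2 encoded
print(next_frame_type(3, Counter(I=3, P=6)))   # 3 configured, 3 encoded

counts = Counter(P=3)
record_encoded(counts, "P")
print(counts["P"])
```

  • The three calls reproduce the outcomes of the three examples above, and the counter update mirrors the second variable changing from 3 to 4.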
  • Step 330, when the number of reference image frames of the target image frame does not satisfy the analysis condition, determine predefined encoding parameters as the encoding parameters of the target image frame.
  • step 330 includes the following sub-steps: step 332, obtaining predefined coding parameter configuration information, where the coding parameter configuration information includes at least one set of correspondences between frame types and coding parameters; step 334, determining, from the coding parameter configuration information, the encoding parameters corresponding to the frame type of the target image frame as the encoding parameters of the target image frame.
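  • Sub-steps 332 and 334 amount to a table lookup keyed by frame type. The sketch below is illustrative: the dictionary layout, the function name, and all QP and bit rate values are hypothetical placeholders rather than values from the application.

```python
# Hypothetical coding parameter configuration: frame type -> coding parameters.
CODING_PARAMETER_CONFIG = {
    "I": {"qp": 24, "bitrate_kbps": 2000},
    "P": {"qp": 28, "bitrate_kbps": 1200},
    "B": {"qp": 30, "bitrate_kbps": 800},
}


def coding_parameters_for(frame_type: str, config=CODING_PARAMETER_CONFIG) -> dict:
    """Return a copy of the predefined coding parameters for the given frame type."""
    if frame_type not in config:
        raise KeyError(f"no predefined coding parameters for frame type {frame_type!r}")
    return dict(config[frame_type])


print(coding_parameters_for("P"))
```

  • Returning a copy keeps the predefined configuration unchanged if a caller later adjusts the parameters of one frame.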
  • Encoding parameters include bit rate, QP, etc.
  • the encoding device acquires a target image frame to be encoded and stores it in a pre-encoding queue. The encoding device determines the frame type of the target image frame through the predefined frame type arrangement IPPIPP and the sequence number 5 of the target image frame, and determines the encoding parameters of the target image frame through predefined encoding parameters.
  • The encoding device acquires a target image frame to be encoded and stores it in a pre-encoding queue.
  • The encoding device judges whether the reference image frames of the target image frame satisfy the encoding condition, that is, whether the number of image frames stored in the pre-encoding queue has reached the size of the pre-encoding queue.
  • The encoding device determines the predefined encoding parameters as the encoding parameters of the target image frame.
  • The encoding device acquires the encoding parameters of the target image frame through a rate control model.
  • For details on how the encoding device determines the encoding parameters of the target image frame through the rate control model, refer to the embodiment below in which the target image frame is encoded in the second manner.
  • Step 340: when the frame type of the target image frame is a key frame, encode the target image frame in an intra-frame prediction mode according to the encoding parameters of the target image frame; or, when the frame type of the target image frame is a non-key frame, encode the target image frame in an inter-frame prediction mode according to the encoding parameters of the target image frame.
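  • The branch in step 340 can be sketched as a simple dispatch. This is an illustration only: the function names are hypothetical, the two encoder callables are placeholders supplied by the caller rather than real codec APIs, and the I frame is treated as the key frame, following the example elsewhere in the text.

```python
def select_prediction_mode(frame_type):
    """Return "intra" for key frames (I frames) and "inter" otherwise."""
    return "intra" if frame_type == "I" else "inter"


def encode_frame(frame, frame_type, params, intra_encoder, inter_encoder):
    """Dispatch to the caller-supplied encoder that matches the frame type.

    intra_encoder and inter_encoder are hypothetical callables taking
    (frame, params) and returning the encoded code stream.
    """
    mode = select_prediction_mode(frame_type)
    encoder = intra_encoder if mode == "intra" else inter_encoder
    return encoder(frame, params)


# Toy encoders that only label the chosen mode.
result = encode_frame(
    frame=b"raw-pixels",
    frame_type="P",
    params={"qp": 30},
    intra_encoder=lambda f, p: ("intra", len(f), p["qp"]),
    inter_encoder=lambda f, p: ("inter", len(f), p["qp"]),
)
print(result)
```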
  • The encoding device encodes the target image frame according to the frame type and encoding parameters of the target image frame.
  • For the specific encoding process, refer to the detailed introduction in the background section; details are not repeated here.
  • Different encoding methods may be used to encode the target image frame according to the signal type of the target image frame and the encoding characteristics of the encoding device, and the specific encoding method is not limited in this application.
  • FIG. 4 shows a schematic diagram of a video encoding method provided by another embodiment of the present application.
  • As shown in FIG. 4, the encoding device obtains a target image frame to be encoded and determines that the number of reference image frames of the target image frame does not satisfy the analysis condition.
  • The image frame to be encoded acquired by the encoding device is the 3rd image frame, and the number of reference image frames of the 3rd image frame is 2 (the 1st frame and the 2nd frame), while the analysis condition is 30. The encoding device determines through the first manner that the frame type of the 3rd image frame is an I frame (key frame), and obtains the encoding parameters corresponding to the 3rd image frame through the rate control model.
  • The encoding device performs intra-frame predictive encoding on the 3rd image frame to obtain a residual value and, after encoding processes such as quantization and entropy encoding, finally obtains the code stream corresponding to the 3rd image frame. The encoding device outputs the code stream corresponding to the target image frame and prepares to acquire a new target image frame.
  • The target image frame is added to the pre-encoding queue. When the pre-encoding queue has not reached the rated length, the encoding device determines through the first manner that the frame type of the target image frame is a P frame, and determines the encoding parameters corresponding to the target image frame through a code rate prediction model. The encoding device adds the frame type and encoding parameters of the target image frame to the pre-encoding queue. The target image frame is subjected to inter-frame prediction to obtain a residual value, and after encoding processes such as quantization and entropy encoding, the code stream corresponding to the target image frame is finally obtained and output by the encoding device.
  • The encoded target image frame is stored in the encoding queue.
  • In the first manner, the encoding device does not use the reference image frames of the target image frame to predict the frame type and encoding parameters of the target image frame, and instead determines the frame type of the target image frame directly through the predefined frame type arrangement. This saves the time the encoding device would spend determining the frame type, and avoids the wait for a sufficient number of reference image frames when the number of reference image frames of the target image frame does not satisfy the analysis condition.
  • The video encoding method for the case where the number of reference image frames of the target image frame satisfies the analysis condition is introduced below through several embodiments.
  • After the target image frame to be encoded is obtained, the method further includes: when the number of reference image frames satisfies the analysis condition, determining the frame type and encoding parameters of the target image frame in a second manner, where the second manner refers to determining the frame type and encoding parameters according to the reference image frames.
  • The second manner means that, before formally encoding the target image frame, the encoding device needs to predict the frame type and encoding parameters of the target image frame according to the target image frame and the reference image frames, so as to obtain a better compression rate and reduce the degree of distortion caused by video encoding, ensuring that the code stream obtained after video encoding can produce a video picture with a stable bit rate after decoding.
  • After the encoding device acquires the target image frame, it adds the target image frame to the pre-encoding queue, and when the length of the pre-encoding queue reaches the rated length, the encoding device determines the frame type and encoding parameters of the target image frame in the second manner.
  • The other image frames in the pre-encoding queue are already-encoded image frames.
  • The pre-encoding queue includes the frame types and encoding parameters of those other image frames.
  • All image frames in the pre-encoding queue except the target image frame are reference image frames of the target image frame.
  • Determining the frame type and encoding parameters of the target image frame in the second manner includes: determining the frame type of the target image frame according to the reference image frames of the target image frame; and determining the encoding parameters of the target image frame according to configuration parameters and/or the frame type of the target image frame, where the configuration parameters are used to constrain the bit rate of the target image frame.
  • Determining the frame type of the target image frame according to the reference image frames includes: generating multiple frame type arrangements, each including an assumed frame type of the target image frame and at least one frame type following the assumed frame type; calculating the distortion rate corresponding to each of the frame type arrangements, where the distortion rate corresponding to a frame type arrangement is determined by encoding the target image frame and at least one reference image frame according to that arrangement; determining, from the multiple frame type arrangements, the target frame type arrangement with the smallest distortion rate; and determining the assumed frame type in the target frame type arrangement as the frame type of the target image frame.
  • The assumed frame type of the target image frame may be an I frame, a P frame, or a B frame. The at least one frame type following the assumed frame type refers to the assumed frame types of the reference image frames, determined according to the assumed frame type of the target image frame.
  • The encoding device first assumes a frame type for the target image frame, and then determines the frame types of the reference image frames one by one according to the assumed frame type of the target image frame.
  • For example, a target image frame to be encoded has 5 reference image frames. The assumed frame type of the target image frame is set to I frame. According to the picture content of the target image frame and the assumed frame type of the target image frame, the assumed frame type of the first reference image frame is determined to be a B frame. Likewise, the assumed frame type of the second reference image frame is determined, based on the picture content of the target image frame and the assumed frame type of the target image frame, to be a B frame. These steps are repeated until the assumed frame types of all reference image frames of the target image frame are determined, finally yielding the frame type arrangement IBBPBB. The frame type of the target image frame is then assumed to be a B frame, and the above steps are repeated to determine the assumed frame types of the reference image frames.
  • The encoding device calculates the distortion rates corresponding to the multiple frame type arrangements, where the distortion rate corresponding to a frame type arrangement is determined by the encoding device through rough encoding of the target image frame and the reference image frames based on that arrangement. The smaller the distortion rate corresponding to a frame type arrangement, the better the encoding quality of that arrangement.
  • The encoding device selects the frame type arrangement with the smallest distortion rate as the target frame type arrangement, and uses the assumed frame type of the target image frame in the target frame type arrangement as the frame type of the target image frame.
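  • The selection step can be sketched as a minimum search over candidate arrangements. This is a hedged illustration: the function name is hypothetical, the distortion rates are passed in through a caller-supplied function standing in for the rough encoding pass, and the numeric values in the usage example are invented for demonstration. The first character of each arrangement is taken to be the assumed frame type of the target image frame.

```python
def choose_frame_type(arrangements, distortion_of):
    """Pick the arrangement with the smallest distortion rate.

    arrangements: iterable of strings such as "IBBPBB", whose first
    character is the assumed frame type of the target image frame.
    distortion_of: hypothetical callable mapping an arrangement to the
    distortion rate measured by a rough encoding pass.
    Returns (frame_type_of_target, target_arrangement).
    """
    candidates = list(arrangements)
    if not candidates:
        raise ValueError("at least one frame type arrangement is required")
    best = min(candidates, key=distortion_of)
    return best[0], best


# Invented distortion rates for three candidate arrangements.
rough_distortion = {"IBBPBB": 0.42, "PBBPBB": 0.31, "BBPBBP": 0.37}
frame_type, arrangement = choose_frame_type(rough_distortion, rough_distortion.get)
print(frame_type, arrangement)
```

  • Passing the distortion measurement in as a function keeps the sketch independent of any particular encoder; a real implementation would replace the dictionary lookup with the rough encoding described above.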
  • The encoding parameters of the target image frame are determined through a rate control model.
  • The encoding parameters of the target image frame include at least a code rate and a quantization parameter of the target image frame.
  • Determining the encoding parameters of the target image frame includes: determining the code rate of the target image frame according to the configuration parameters, and determining the quantization parameter of the target image frame according to the frame type of the target image frame; or, determining both the code rate and the quantization parameter of the target image frame according to the configuration parameters.
  • The encoding device can set different configuration parameters according to different encoding requirements.
  • The encoding device uses an ABR (Average Bitrate) control model as a configuration parameter.
  • The encoding device can also use rate control models such as CRF (Constant Rate Factor), CQP (Constant Quantizer Parameter), and R-QP (Rate-Quantization Parameter) as configuration parameters.
  • The configuration parameters are determined according to conditions such as network communication quality and encoding requirements, which are not limited in this application.
  • The encoding device determines the configuration parameters based on the R-QP rate control model; the configuration parameters include the initial QP and the quadratic-model formula for calculating the code rate R_i:
  • R_i represents the code rate of the target image frame.
  • a_0, a_1, and a_i represent information related to the content of the target image frame.
  • Q_i refers to the distortion rate obtained after encoding the target image frame according to the initial QP.
  • After obtaining the code rate of the target image frame, the encoding device determines the QP of the target image frame according to the frame type of the target image frame in the RDO (Rate-Distortion Optimization) process.
  • The encoding device determines the configuration parameters based on CQP; according to the CQP method, the encoding device configures a fixed QP in the configuration parameters.
  • The encoding device configures different QPs for different frame types. For example, the encoding device configures the QP of a target image frame whose frame type is P frame as the first fixed QP.
  • The encoding device calculates the code rate of the target image frame according to the fixed QP in the configuration parameters.
  • FIG. 5 shows a schematic diagram of a video encoding method provided by another embodiment of the present application.
  • After the encoding device acquires the target image frame, it adds the target image frame to the pre-encoding queue. The size of the pre-encoding queue is n, that is, the rated length is n, where n is a positive integer.
  • The encoding device uses the first manner to determine the frame type and encoding parameters of the target image frame. In one example, the first frame is input into the pre-encoding queue; the number of frames in the queue is 1, which is less than n, so the analysis condition is not met. The encoding device determines through the first manner that the frame type of the target image frame is an I frame, determines the encoding parameters corresponding to the first frame based on the rate control model, encodes the first frame to obtain the corresponding code stream, and outputs the code stream.
  • After time T1, the encoding device acquires the nth target image frame. At this point the length of the pre-encoding queue is n, which satisfies the analysis condition, and the encoding device uses the second manner to determine the frame type and encoding parameters of the nth target image frame. In one example, the encoding device adds the (n+1)th frame to the pre-encoding queue, which then contains the frames from the second frame to the (n+1)th frame.
  • The encoding device generates m frame type arrangements according to the second frame to the (n+1)th frame, where m is a positive integer, calculates the distortion rates corresponding to the m frame type arrangements, determines the frame type arrangement with the smallest distortion rate as the target frame type arrangement, and determines the frame type of the (n+1)th frame according to the target frame type arrangement. The encoding device determines the encoding parameters of the (n+1)th frame through a rate control model.
  • The encoding device encodes the (n+1)th frame according to the predicted frame type and encoding parameters to obtain the code stream corresponding to the (n+1)th frame, and outputs the code stream.
  • The encoding device removes the second frame from the pre-encoding queue and waits for the (n+2)th frame to be encoded to enter the pre-encoding queue.
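  • The queue behavior described for FIG. 5 can be sketched as a sliding window. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, frames are represented by their sequence numbers, and the oldest frame is dropped only after a frame has been handled in the second manner, as the text describes.

```python
from collections import deque


class PreEncodingQueue:
    """Sliding window of frames with a rated length n (hypothetical helper)."""

    def __init__(self, rated_length):
        if rated_length < 1:
            raise ValueError("rated_length must be a positive integer")
        self.rated_length = rated_length
        self.frames = deque()

    def push(self, frame):
        """Add a frame and return which manner applies to it."""
        self.frames.append(frame)
        # Analysis condition: the queue has reached its rated length.
        return "second" if len(self.frames) >= self.rated_length else "first"

    def finish(self, manner):
        """After encoding, drop the oldest frame if the second manner was used."""
        if manner == "second":
            self.frames.popleft()


def plan_manners(num_frames, rated_length):
    """Return the manner chosen for frames 1..num_frames."""
    queue = PreEncodingQueue(rated_length)
    manners = []
    for index in range(1, num_frames + 1):
        manner = queue.push(index)
        manners.append(manner)
        queue.finish(manner)
    return manners


print(plan_manners(6, 4))
```

  • With n = 4, frames 1 to 3 are handled in the first manner and every later frame in the second manner, so only the start of the stream bypasses the analysis of reference image frames.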
  • FIG. 6 shows a schematic diagram of a video encoding process provided by another embodiment of the present application.
  • The encoding device obtains a target image frame to be encoded through the picture acquisition device and adds the target image frame to the pre-encoding queue. The encoding device then judges whether the length of the pre-encoding queue meets the analysis requirement; in some embodiments, the analysis requirement is that the number of image frames stored in the pre-encoding queue equals the size of the pre-encoding queue. When the length of the pre-encoding queue does not meet the analysis requirement, the encoding device determines the frame type and encoding parameters of the target image frame in the first manner, adds the frame type and encoding parameters of the target image frame to the pre-encoding queue, encodes the target image frame according to the frame type and encoding parameters to obtain the corresponding code stream, and outputs the code stream. When the length of the pre-encoding queue meets the analysis requirement, the encoding device predicts the frame type and encoding parameters of the target image frame in the second manner.
  • The encoding device adds the frame type and encoding parameters of the target image frame to the pre-encoding queue, encodes the target image frame according to the frame type and encoding parameters to obtain the corresponding code stream, and outputs the code stream. After obtaining the frame type and encoding parameters of the target image frame in the second manner, the encoding device removes the earliest-added image frame from the pre-encoding queue, so that the next target image frame obtained from the picture acquisition device can be added to the pre-encoding queue. After the encoding device finishes encoding a target image frame, it needs to judge whether to continue repeating the above encoding process.
  • If the encoding device receives an exit command, such as one triggered by the user exiting the video playback interface on the terminal device or by the picture acquisition device finishing its work, the encoding device stops the above encoding process.
  • The frame type and encoding parameters of the target image frame are determined directly in the first manner, which reduces the waiting time of the encoding device and reduces the encoding delay.
  • Using the second manner to predict the frame type of the target image frame through the reference image frames helps to keep the bit rate of the encoding process stable and avoid inter-frame quality fluctuations.
  • With the second manner, using forward reference image frames to predict the frame type of the target image frame achieves an effect similar to that of using backward reference image frames, while reducing the encoding delay.
  • Directly determining the frame type and encoding parameters of the target image frame in the first manner can cause inter-frame quality fluctuations. However, over the course of encoding a video stream, the number of image frames whose frame type and encoding parameters are determined in the first manner is far smaller than the number determined in the second manner. By establishing a reasonable rate control model, this method reduces inter-frame quality fluctuations within a short period at the start of video playback, keeps the inter-frame quality of the video picture stable afterwards, and has a low encoding delay, which makes it well suited to live streaming and RTC scenarios.
  • FIG. 7 shows a block diagram of a video coding apparatus provided by an embodiment of the present application.
  • The apparatus has the function of implementing the above video encoding method, and the function can be realized by hardware, or by hardware executing corresponding software.
  • The apparatus may be the electronic device described above, or may be provided in the electronic device.
  • The apparatus 700 may include: an image frame acquisition module 710, a first encoding determination module 720, and an image frame encoding module 730.
  • The image frame acquisition module 710 is configured to acquire the target image frame to be encoded.
  • The first encoding determination module 720 is configured to determine the frame type and encoding parameters of the target image frame in a first manner when the number of reference image frames of the target image frame does not satisfy the analysis condition, where the first manner refers to a manner in which the frame type and encoding parameters are determined without relying on the reference image frames.
  • The image frame encoding module 730 is configured to encode the target image frame according to the frame type and encoding parameters of the target image frame.
  • The first encoding determination module 720 includes: a first frame type determining unit 722 configured to determine a predefined frame type as the frame type of the target image frame; and a first parameter determining unit 724 configured to determine a predefined encoding parameter as an encoding parameter of the target image frame.
  • The first frame type determining unit 722 is configured to: acquire a predefined frame type arrangement, which refers to a predetermined arrangement of the frame types corresponding to multiple image frames, and determine the frame type of the target image frame from the predefined frame type arrangement based on the sequence number of the target image frame; or acquire predefined frame type configuration information, which includes configured quantities of at least one frame type, and determine the frame type of the target image frame based on the frame types of the encoded image frames and the predefined frame type configuration information.
  • The first parameter determining unit 724 is configured to: acquire predefined encoding parameter configuration information, where the encoding parameter configuration information includes at least one set of correspondences between frame types and encoding parameters; and, from the encoding parameter configuration information, determine the encoding parameters corresponding to the frame type of the target image frame as the encoding parameters of the target image frame.
  • The image frame encoding module 730 is configured to: when the frame type of the target image frame is a key frame, encode the target image frame in an intra-frame prediction mode according to the encoding parameters of the target image frame; or, when the frame type of the target image frame is a non-key frame, encode the target image frame in an inter-frame prediction mode according to the encoding parameters of the target image frame.
  • The apparatus 700 further includes a second encoding determination module 740 configured to determine the frame type and encoding parameters of the target image frame in a second manner when the number of reference image frames satisfies the analysis condition, where the second manner refers to a manner of determining the frame type and encoding parameters according to the reference image frames.
  • The second encoding determination module 740 includes: a second frame type determining unit 742 configured to determine the frame type of the target image frame according to the reference image frames of the target image frame; and a second parameter determining unit 744 configured to determine the encoding parameters of the target image frame according to configuration parameters and/or the frame type of the target image frame, where the configuration parameters are used to constrain the code rate of the target image frame.
  • The second frame type determining unit 742 is configured to: generate multiple frame type arrangements, each including an assumed frame type of the target image frame and at least one frame type following the assumed frame type; calculate the distortion rates corresponding to the multiple frame type arrangements, where the distortion rate corresponding to a frame type arrangement is determined by encoding the target image frame and at least one reference image frame according to that arrangement; determine, from the multiple frame type arrangements, the target frame type arrangement with the smallest distortion rate; and determine the assumed frame type in the target frame type arrangement as the frame type of the target image frame.
  • The apparatus 700 further includes an image frame adding module 750 configured to: add the target image frame to a pre-encoding queue; when the length of the pre-encoding queue is less than the rated length, determine that the number of reference image frames of the target image frame does not satisfy the analysis condition; and when the length of the pre-encoding queue is equal to the rated length, determine that the number of reference image frames of the target image frame satisfies the analysis condition.
  • The division into the above functional modules is used only as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed.
  • That is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • The apparatus provided by the above embodiment and the method embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
  • A computer device is provided, comprising a processor and a memory in which a computer program is stored.
  • The computer program is configured to be executed by one or more processors to implement the above video encoding method.
  • The computer device may be referred to as a video encoding device, and is used for encoding a video file so as to reduce the resources consumed in transmitting the video file.
  • A computer-readable storage medium is provided, in which a computer program is stored, and the computer program implements the above video encoding method when executed by a processor of a computer device.
  • The above computer-readable storage medium may be a storage device such as a ROM (Read-Only Memory) or a RAM (Random Access Memory).
  • A computer program product is also provided which, when run on a computer device, causes the computer device to execute the above video encoding method.
  • The term "plurality" mentioned herein refers to two or more.
  • "And/or" describes the association relationship of associated objects, indicating that three types of relationships are possible; for example, A and/or B may indicate that A exists alone, that A and B exist simultaneously, or that B exists alone.
  • The character "/" generally indicates an "or" relationship between the objects before and after it.
  • The numbering of the steps described herein only shows, by way of example, one possible order of execution. In some other embodiments, the steps may be executed out of numerical order; for example, two steps with different numbers may be executed at the same time, or in the reverse of the order shown, which is not limited in the embodiments of the present application.

Abstract

A video encoding method and apparatus, a device, and a storage medium, relating to the technical field of digital signal processing. The method comprises: obtaining a target image frame to be encoded (210); in the case that the number of reference image frames of the target image frame does not meet an analysis condition, determining a frame type and an encoding parameter of the target image frame according to a first mode, wherein the first mode is a mode in which the frame type and the encoding parameter are determined without relying on the reference image frames (220); and encoding the target image frame according to the frame type and the encoding parameter of the target image frame (230). According to the present method, the wait for an encoding device to obtain a sufficient number of reference image frames, in the case that the number of reference image frames of an image frame to be encoded does not meet the analysis condition, is eliminated; the waiting time of the encoding device is shortened, and excessive parameters and complex calculation modes need not be introduced, so that the encoding delay can be effectively reduced.

Description

Video encoding method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202111188246.2, filed on October 12, 2021 and entitled "Video Encoding Method, Apparatus, Device, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of digital signal processing, and in particular to a video encoding method, apparatus, device, and storage medium.
Background
Before video data is transmitted, in order to reduce the amount of transmitted data and shorten the transmission time, the encoding device needs to encode the video image frames, compressing their data volume and increasing the transmission speed.
In the related art, before a target image frame to be encoded is encoded, its frame type and encoding parameters need to be predicted from multiple other reference image frames: the encoding loss that different encoding modes would incur for the target image frame is calculated in order to determine the most suitable frame type and encoding parameters for the target image frame. The encoding device encodes the target image frame according to the determined frame type and encoding parameters to obtain the code stream corresponding to the target image frame. By predicting the frame type and encoding parameters of the target image frame and encoding according to the prediction result, stable video playback quality can be obtained.
However, in some cases, when an image frame to be encoded is encoded according to multiple other reference image frames, it is necessary to wait for these reference image frames to be input into the encoding device, which causes a large encoding delay.
Summary
Embodiments of the present application provide a video encoding method, apparatus, device, and storage medium. The technical solution is as follows:
According to one aspect of the embodiments of the present application, a video encoding method is provided. The method is executed by a computer device and includes:
obtaining a target image frame to be encoded;
in the case that the number of reference image frames of the target image frame does not satisfy the analysis condition, determining the frame type and encoding parameters of the target image frame in a first manner, where the first manner refers to a manner in which the frame type and encoding parameters are determined without relying on the reference image frames;
encoding the target image frame according to the frame type and encoding parameters of the target image frame.
According to one aspect of the embodiments of the present application, a video encoding apparatus is provided. The apparatus includes:
an image frame acquisition module, configured to acquire a target image frame to be encoded;
a first encoding determination module, configured to determine the frame type and encoding parameters of the target image frame in a first manner when the number of reference image frames of the target image frame does not satisfy an analysis condition, where the first manner is a manner of determining the frame type and encoding parameters without relying on the reference image frames;
an image frame encoding module, configured to encode the target image frame according to the frame type and encoding parameters of the target image frame.
According to one aspect of the embodiments of the present application, a computer device is provided. The computer device includes a processor and a memory, the memory stores a computer program, and the processor executes the computer program to implement the above video encoding method.
According to one aspect of the embodiments of the present application, a computer-readable storage medium is provided. The storage medium stores a computer program, and the computer program is executed by a processor to implement the above video encoding method.
According to one aspect of the present application, a computer program product is provided. When the computer program product runs on a computer device, it causes the computer device to execute the above video encoding method.
The technical solutions provided by the embodiments of the present application can bring the following beneficial effects:
When the target image frame to be encoded does not satisfy the analysis condition, the target image frame is encoded directly, without waiting for the number of its reference image frames to meet the analysis requirement. This eliminates the time spent waiting for multiple other reference image frames of the target image frame and reduces the encoding delay.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of a video encoding method provided by an embodiment of the present application;
FIG. 3 is a flowchart of a video encoding method provided by another embodiment of the present application;
FIG. 4 is a schematic diagram of a video encoding method provided by another embodiment of the present application;
FIG. 5 is a schematic diagram of a video encoding method provided by another embodiment of the present application;
FIG. 6 is a schematic diagram of a video encoding process provided by another embodiment of the present application;
FIG. 7 is a block diagram of a video encoding apparatus provided by an embodiment of the present application;
FIG. 8 is a block diagram of a video encoding apparatus provided by another embodiment of the present application.
Detailed Description
Before the technical solutions of the present application are introduced, some background knowledge involved in the application is described. The following related technologies may be combined arbitrarily, as optional solutions, with the technical solutions of the embodiments of the present application, and all such combinations fall within the protection scope of the embodiments of the present application. The embodiments of the present application include at least part of the following content.
First, terms that may appear in the present application are introduced.
A frame is a still picture in a video.
FPS (Frames Per Second, frame rate) is the number of picture frames refreshed per second, or the number of times the graphics processor refreshes per second. The higher the frame rate, the more pictures are refreshed within one second, the more realistic the video picture, and the smoother the motion in the video.
DR (Data Rate, bitrate) is the amount of data a video file uses per unit time. The higher the bitrate of a video file, the lower the compression ratio and the higher the picture quality. A higher bitrate means a higher sampling rate per unit time and higher precision of the data stream, so the processed file is closer to the original file, the image quality is better and the picture is clearer, and the decoding capability required of the playback device is also higher.
Video encoding is the process of removing redundant information from video pictures by means of specific compression techniques.
GOP (Group of Pictures, key frame period) is the distance between two adjacent I frames (Intra Picture), that is, the maximum number of frames in one frame group. Decoding is performed in units of GOPs.
QP (Quantization Parameter) reflects how strongly spatial detail is compressed. The smaller the QP, the finer the quantization, the more detail is retained in the image, the higher the picture quality, and the longer the generated bitstream, that is, the larger the data volume. The larger the QP, the more detail is lost from the picture, the lower the bitrate, and the greater the image distortion.
The Y (Luminance), U (Chrominance), and V (Chroma) signals are introduced and described below.
YUV is a digital representation of color. Y represents the grayscale value of a pixel, and U and V represent the color of the pixel, including its hue and saturation. Because the human eye is less sensitive to changes in hue than to changes in brightness, the chroma dimension can be compressed more heavily during video encoding, so that the bitstream obtained after encoding is smaller.
The sampling formats of YUV pictures include:
(1) 4:4:4: the three YUV channels have the same sampling rate. In the generated image, the three components of every pixel are complete (each component is usually 8 bits), so each uncompressed pixel occupies 3 bytes. For example, for the four pixels [Y0 U0 V0], [Y1 U1 V1], [Y2 U2 V2], and [Y3 U3 V3], the sampled stream is: Y0 U0 V0; Y1 U1 V1; Y2 U2 V2; Y3 U3 V3.
(2) 4:2:2: each chroma channel is sampled at half the rate of the luma channel, so the horizontal chroma sampling rate is half that of 4:4:4. For an uncompressed image with 8-bit quantization, each macropixel consisting of two horizontally adjacent pixels occupies 4 bytes of memory (2 bytes of luma and 1 byte for each of the two chroma components). For example, for the four pixels [Y0 U0 V0], [Y1 U1 V1], [Y2 U2 V2], and [Y3 U3 V3], the sampled stream is: Y0 U0; Y1 V1; Y2 U2; Y3 V3.
(3) 4:2:0: each scan line stores only one chroma component, at a 2:1 sampling rate, and adjacent scan lines store different chroma components. For example, if the first line is 4:2:0, the second line is 4:0:2, the third line is 4:2:0, and so on. For each chroma component, the sampling rate is 2:1 in both the horizontal and vertical directions, which is equivalent to an overall chroma sampling rate of 4:1. For uncompressed video with 8-bit quantization, each macropixel consisting of a 2x2 block of adjacent pixels (2 rows, 2 columns) occupies 6 bytes of memory (4 bytes of luma and 1 byte for each of the two chroma components). For example, for the eight pixels [Y0 U0 V0], [Y1 U1 V1], [Y2 U2 V2], [Y3 U3 V3], [Y5 U5 V5], [Y6 U6 V6], [Y7 U7 V7], and [Y8 U8 V8], the sampled stream is: Y0 U0; Y1; Y2 U2; Y3; Y5 V5; Y6; Y7 V7; Y8.
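The macropixel sizes listed above translate directly into uncompressed frame sizes. The following is a minimal sketch, assuming 8-bit samples and even frame dimensions; the function name `frame_size_bytes` and the table `BYTES_PER_MACROPIXEL` are illustrative and do not come from the application.

```python
# Uncompressed storage per macropixel for 8-bit samples, taken from the
# three sampling formats described above: (bytes, pixels covered).
BYTES_PER_MACROPIXEL = {
    "4:4:4": (3, 1),  # 3 bytes for every single pixel
    "4:2:2": (4, 2),  # 4 bytes for 2 horizontally adjacent pixels
    "4:2:0": (6, 4),  # 6 bytes for a 2x2 block of pixels
}


def frame_size_bytes(width: int, height: int, sampling: str) -> int:
    """Return the uncompressed size in bytes of one 8-bit YUV frame.

    Assumes width and height are even so that macropixels tile the frame.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even")
    nbytes, npixels = BYTES_PER_MACROPIXEL[sampling]
    return width * height * nbytes // npixels
```

For a 1920x1080 frame this gives 6,220,800 bytes for 4:4:4, 4,147,200 bytes for 4:2:2, and 3,110,400 bytes for 4:2:0, so 4:2:0 halves the raw data volume relative to 4:4:4 before any encoding takes place.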
The frame types are introduced and described below.
The frame types mainly include the following:
(1) I frame: also called a key frame. An I frame describes the image background and the details of the moving subject and carries a relatively large amount of data. During encoding, the I frame has the lowest compression ratio. During decoding, an I frame does not depend on other frames and can serve as a reference image frame when frames of other types are decoded. The I frame is the base frame of a GOP (the first frame in the GOP), and each GOP contains one I frame.
(2) IDR (Instantaneous Decoding Refresh) frame: an IDR frame is a kind of I frame. Before decoding an IDR frame, the decoding device clears the forward and backward reference buffers (the reference buffers store image frames that serve as references during decoding), so that no frame after the IDR frame can be decoded with reference to any frame before the IDR frame. As a safe random access point in the video, the IDR frame prevents an erroneously transmitted image frame from affecting the decoding results of other frames over a long period.
(3) P (Predictive-coded Picture, forward predictive coded picture) frame: a P frame records the difference between the current image frame and several preceding frames, and its compression ratio during encoding is relatively high. Decoding a P frame requires referring to part of the information in previously buffered pictures in order to generate the picture.
(4) B (Bidirectionally Predicted Picture, bidirectional predictive coded picture) frame: a B frame records the differences between the current frame and both the preceding and following frames, and it achieves the highest compression during encoding. Decoding a B frame requires referring to forward and/or backward I frames or P frames, where forward refers to image frames that appear earlier than the current frame on the time axis.
The video encoding method is introduced and described below.
Video encoding includes predictive coding, transform coding, quantization, and entropy coding, where predictive coding comprises intra-frame predictive coding and inter-frame predictive coding. Intra-frame predictive coding exploits the spatial correlation of video by comparing adjacent macroblocks within the same frame. Inter-frame predictive coding exploits the temporal correlation of video: it uses pixels of neighboring, already encoded image frames to predict the pixels of the current image frame and obtains a residual signal. Inter-frame predictive coding includes motion estimation, MV (motion vector) prediction, weighted prediction, motion compensation, and so on. Motion estimation finds, for each macroblock of the current image frame, the closest matching macroblock in a reference image frame: a search algorithm scans the macroblocks in the reference image frame, and a matching criterion such as minimum mean square error or minimum mean absolute error determines the macroblock in the reference image frame that is most similar to the macroblock of the current image frame.
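The motion estimation step described above can be sketched as an exhaustive block search. This is a minimal illustration, not the search used by any particular encoder: it assumes frames are plain lists of luma rows, uses the sum of absolute differences (which ranks candidates the same way as mean absolute error for a fixed block size), and the names `block_sad` and `full_search` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of luma samples


def block_sad(cur: Frame, ref: Frame, cx: int, cy: int,
              rx: int, ry: int, size: int) -> int:
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def full_search(cur: Frame, ref: Frame, cx: int, cy: int,
                size: int, search_range: int) -> Tuple[int, int]:
    """Return the motion vector (dx, dy) with the lowest SAD in the window."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference frame
            cost = block_sad(cur, ref, cx, cy, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv
```

Practical encoders replace the exhaustive scan with faster search patterns, but the matching criterion plays the same role.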
Quantization processes a signal that is discrete in time so that it also becomes discrete in amplitude. Quantization can be divided into uniform quantization and non-uniform quantization, and non-uniform quantization works better for small signals.
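As a small illustration of uniform quantization, the sketch below maps a sample to the nearest multiple of a fixed step size and back. The helper names `quantize` and `dequantize` are assumptions made for this example; real codecs quantize transform coefficients with step sizes derived from the QP.

```python
def quantize(value: float, step: float) -> int:
    """Map a sample to the index of the nearest uniform quantization level."""
    if step <= 0:
        raise ValueError("step must be positive")
    return round(value / step)


def dequantize(index: int, step: float) -> float:
    """Reconstruct the sample value represented by a quantization index."""
    return index * step
```

The reconstruction error of such a quantizer never exceeds half the step size, which is why a larger step (a larger QP) discards more detail.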
Please refer to FIG. 1, which shows a schematic diagram of an implementation environment provided by an embodiment of the present application. The implementation environment may include an encoding device 10 and a decoding device 20.
The encoding device 10 is a device with functions such as video encoding, video data storage, and video data transmission and reception. The encoding device 10 may be a computer, mobile phone, tablet computer, smart TV, video camera, in-vehicle system, or similar device. The encoding device 10 encodes a target image frame to be encoded. Optionally, the encoding device 10 directly specifies the encoding parameters of the target image frame and encodes the target image frame according to the specified encoding parameters; optionally, the encoding device 10 first predicts the frame type and encoding parameters of the target image frame from other reference image frames and then encodes the target image frame according to the encoding parameters. The encoding device 10 sends the bitstream obtained by encoding the target image frame to the decoding device 20 in the form of a video data stream.
The decoding device 20 is a device with functions such as video decoding and video data transmission and reception. The decoding device may be a server or a terminal device, and the server may be a background server running a target application. Target applications include video applications, live streaming applications, social applications with a video communication function, and so on; the type of target application is not limited here. The decoding device 20 may be a single server, a server cluster composed of multiple servers, or a cloud computing service center. The target application runs on the terminal device, and in addition to decoding the video data stream, the terminal device has at least one other function, such as video playback or message sending.
Please refer to FIG. 2, which shows a flowchart of a video encoding method provided by an embodiment of the present application. Each step of the method may be executed by an encoding device, and the method may include at least one of the following steps (210-230):
Step 210: acquire a target image frame to be encoded.
An image frame is a still picture, and multiple image frames switched in succession form a video. In some embodiments, an image frame is a signal in YUV format. Encoding is the process in which the encoding device compresses a picture in one format to obtain a picture in another format. The data volume of the encoded picture is smaller than that of the original picture before encoding.
The encoding device acquires the target image frame to be encoded and encodes it. Optionally, the encoding device acquires the target image frame from a video file in a video database; optionally, the encoding device acquires the target image frame through a device with a video capture function, such as a camera. The target image frame may be one frame of a video file, or one frame of a scene captured by a picture capture device during a live video broadcast or a video call. The source and type of the image frame are determined by the actual situation and are not limited here. In one example, in a video playback scenario, the encoding device may be deployed on the background server of the target application and acquires a target image frame to be encoded from the server's database, where the target image frame comes from the video file for which a play instruction was received. In another example, in a live video scenario, the camera captures one frame of the live scene every 25 ms, and the encoding device acquires the target image frame to be encoded through the camera.
Step 220: when the number of reference image frames of the target image frame does not satisfy the analysis condition, determine the frame type and encoding parameters of the target image frame in a first manner, where the first manner is a manner of determining the frame type and encoding parameters without relying on the reference image frames.
The reference image frames and the target image frame are image frames in the same video file, or image frames captured by the same picture capture device within a certain time interval. A reference image frame may be an image frame that has already been encoded or an image frame that has not yet been encoded. In some embodiments, the reference image frames and the target image frame share some features, and the target image frame can be encoded according to the information in the reference image frames. In some embodiments, the reference image frames are stored in the encoding device.
The analysis condition is the condition used to decide how the frame type and encoding parameters of the target image frame are determined. The frame type characterizes how compressible the target image frame is and includes I frame, P frame, and B frame; the frame types are described in detail above and are not repeated here. The encoding parameters include rate control parameters, which keep the bitrate stable during encoding and reduce the distortion of the image frame during encoding.
In some embodiments, the analysis condition includes the number of reference image frames of the target image frame being greater than or equal to a first threshold. In some embodiments, after acquiring the target image frame to be encoded, the encoding device determines how to encode the target image frame according to the number of reference image frames in the encoding device. If the number of reference image frames of the target image frame does not satisfy the analysis condition, the encoding device determines the frame type and encoding parameters of the target image frame in the first manner. Conversely, if the number of reference image frames satisfies the analysis condition, the encoding device determines the frame type and encoding parameters of the target image frame by a method other than the first manner.
In some embodiments, after the target image frame to be encoded is acquired, the method further includes: adding the target image frame to a pre-encoding queue; when the length of the pre-encoding queue is less than a rated length, determining that the number of reference image frames of the target image frame does not satisfy the analysis condition; and when the length of the pre-encoding queue is equal to the rated length, determining that the number of reference image frames of the target image frame satisfies the analysis condition.
The length of the pre-encoding queue is the number of image frames stored in it. In some embodiments, the rated length equals the size of the pre-encoding queue, that is, the maximum number of image frames the queue can hold. In this case, until the number of image frames stored in the pre-encoding queue reaches the size of the queue, the encoding device determines the frame type and encoding parameters of each acquired target image frame in the first manner.
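The queue-length check described above can be sketched as follows. This is an illustrative sketch only: the function name `decision_modes` and the mode labels are hypothetical, and the use of a bounded `deque` that drops its oldest frame once full is an assumption, since the text does not say how frames leave the pre-encoding queue.

```python
from collections import deque
from typing import List


def decision_modes(num_frames: int, rated_length: int) -> List[str]:
    """Return, per acquired frame, which decision path the check selects."""
    # Assumption: a full queue discards its oldest frame when a new one arrives.
    queue = deque(maxlen=rated_length)
    modes = []
    for frame_id in range(1, num_frames + 1):
        queue.append(frame_id)  # add the target image frame to the queue
        if len(queue) < rated_length:
            # Too few reference frames: analysis condition not satisfied.
            modes.append("first_manner")
        else:
            # Queue holds the rated length: analysis condition satisfied.
            modes.append("reference_analysis")
    return modes
```

With a rated length of 3, the first two frames take the first manner and every later frame takes the reference-based path.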
In some embodiments, after acquiring the target image frame to be encoded, the encoding device adds it to the pre-encoding queue, and the image frames stored in the pre-encoding queue serve as reference image frames of the target image frame. In some embodiments, the encoding device stores the frame type and encoding parameters of the target image frame in the pre-encoding queue. For example, the encoding device stores the target image frame together with its frame type and encoding parameters in the same unit of the pre-encoding queue. As another example, the encoding device stores the frame type and encoding parameters of a target image frame in a separate unit of the pre-encoding queue that has a mapping relationship with the unit storing the target image frame, and the encoding device uses this mapping to determine the frame type and encoding parameters of the target image frame.
Step 230: encode the target image frame according to the frame type and encoding parameters of the target image frame.
In some embodiments, the encoding device encodes the target image frame according to its frame type and encoding parameters, obtains the bitstream corresponding to the target image frame, and outputs that bitstream. For details of how the encoding device encodes a target image frame according to its frame type and encoding parameters, refer to the background introduction above; they are not repeated here.
In summary, for a target image frame whose number of reference image frames does not satisfy the analysis condition, the method determines the frame type and encoding parameters in the first manner. Because the first manner does not rely on the reference image frames of the target image frame, it eliminates the time the encoding device would otherwise spend waiting to acquire enough reference image frames when their number does not satisfy the analysis condition. The method effectively reduces the encoding delay without introducing many additional parameters or complex calculations.
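A rough back-of-the-envelope estimate shows the size of the delay being avoided. The sketch below assumes, purely for illustration, that a reference-based encoder cannot process the first frame until the pre-encoding queue has filled to its rated length and that frames arrive at a fixed interval; the function name `startup_wait_ms` is hypothetical.

```python
def startup_wait_ms(rated_length: int, frame_interval_ms: float,
                    wait_for_full_queue: bool) -> float:
    """Estimate how long the first frame waits before it can be encoded.

    Assumes frames arrive every `frame_interval_ms` and that waiting means
    holding the first frame until `rated_length` frames have arrived.
    """
    if rated_length < 1:
        raise ValueError("rated_length must be at least 1")
    if not wait_for_full_queue:
        return 0.0  # first manner: encode as soon as the frame is acquired
    return (rated_length - 1) * frame_interval_ms
```

Under these assumptions, a queue of 10 frames fed by a camera that captures a frame every 25 ms would hold the first frame for 225 ms, whereas the first manner lets encoding start immediately.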
The video encoding process provided by the method is described below through several embodiments.
Please refer to FIG. 3, which shows a flowchart of a video encoding method provided by another embodiment of the present application.
Step 310: acquire a target image frame to be encoded.
In some embodiments, after the target image frame to be encoded is acquired, the method further includes: adding the target image frame to the pre-encoding queue; when the length of the pre-encoding queue is less than the rated length, the encoding device determines that the number of reference image frames of the target image frame does not satisfy the analysis condition; and when the length of the pre-encoding queue is equal to the rated length, the encoding device determines that the number of reference image frames of the target image frame satisfies the analysis condition.
Step 320: when the number of reference image frames of the target image frame does not satisfy the analysis condition, determine a predefined frame type as the frame type of the target image frame.
In some embodiments, the predefined frame type is stored in the encoding device, or it is determined according to the picture capture tool or the video to be encoded. The predefined frame type may be chosen according to practical conditions such as network transmission delay, the encoding capability of the encoding device, and video playback quality requirements, and is not limited in the present application. In some embodiments, when the number of reference image frames of the target image frame does not satisfy the analysis condition, the encoding device determines the frame type of the target image frame according to the predefined frame type.
In one possible implementation, step 320 includes step 322: acquire a predefined frame type arrangement, which is a predetermined layout of the frame types corresponding to multiple image frames; and determine the frame type of the target image frame from the predefined frame type arrangement based on the sequence number of the target image frame.
In another possible implementation, step 320 includes step 324: acquire predefined frame type configuration information, which includes the configured quantity of at least one frame type; and determine the frame type of the target image frame based on the frame types of the already encoded image frames and the predefined frame type configuration information.
In some embodiments, the predefined frame type arrangement is stored in the encoding device. In some embodiments, the predefined frame type arrangement includes at least one of I frames, P frames, and B frames; for example, the predefined frame type arrangement may be IIIIIIIIII, IPPPIPPPIP, IBPBPIBPP, and so on. The predefined frame type arrangement may be set according to experience, actual video playback requirements, and other factors, and is not limited in the present application.
在一些实施例中,目标图像帧的序号根据编码设备获取目标图像帧的次序确定,例如,某个目标图像帧是编码设备第9个获取的图像帧,则该目标图像帧的序号为9。在一些实施例中,编码设备获取目标图像帧后,获取预定义的帧类型排列,编码设备根据预定义的帧类型排列以及目标图像的序号,确定目标图像的帧类型。例如,编码设备获取某个待编码的目标图像帧,该目标图像帧是编码设备获取的第1个目标图像帧,则该目标图像帧的序号为1,在该目标图像帧的参考图像帧的数量不满足分析条件的情况下,编码设备中存储的预定义的帧类型排列为IPPBPP,编码设备根据该目标图像帧的序号和预定义的帧类型排列,确定该目标图像帧的帧类型为I帧。In some embodiments, the sequence number of the target image frame is determined according to the order in which the encoding device acquires the target image frames. For example, if a certain target image frame is the ninth image frame acquired by the encoding device, the sequence number of the target image frame is 9. In some embodiments, after the encoding device acquires the target image frame, it acquires a predefined frame type arrangement, and the encoding device determines the frame type of the target image according to the predefined frame type arrangement and the sequence number of the target image. For example, the encoding device acquires a certain target image frame to be encoded, and the target image frame is the first target image frame acquired by the encoding device, then the sequence number of the target image frame is 1, and the reference image frame of the target image frame When the quantity does not meet the analysis conditions, the predefined frame type arrangement stored in the encoding device is IPPBPP, and the encoding device determines that the frame type of the target image frame is I according to the sequence number of the target image frame and the predefined frame type arrangement. frame.
In some embodiments, the length of the predefined frame type arrangement is determined according to the length of the pre-encoding queue. For example, if the size of the pre-encoding queue is 10 frames, the length of the predefined frame type arrangement is greater than or equal to 10. The sequence number of the target image frame is determined according to the number of image frames stored in the pre-encoding queue. In some embodiments, after acquiring a target image frame to be encoded, the encoding device adds the target image frame to the pre-encoding queue; upon determining that the length of the pre-encoding queue is less than the rated length, the encoding device determines the frame type and encoding parameters of the target image frame from the predefined frame type arrangement and the sequence number of the target image frame. For example, the predefined frame type arrangement obtained by the encoding device is IBPBPIBPP; after the encoding device acquires a target image frame to be analyzed, the number of image frames stored in the pre-encoding queue is 4, so the sequence number of the target image frame is 4 and its frame type is a B frame.
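The lookup described above can be sketched as follows. This is a minimal illustration, assuming the arrangement is held as a string of the letters I, P, and B and that sequence numbers are 1-based, as in the examples; the function name is hypothetical and does not come from the application.

```python
def frame_type_from_arrangement(arrangement: str, sequence_number: int) -> str:
    """Return the frame type at a 1-based sequence number in a predefined arrangement.

    Assumption: the arrangement is a string such as "IBPBPIBPP".
    """
    if not 1 <= sequence_number <= len(arrangement):
        raise ValueError("sequence number falls outside the predefined arrangement")
    # Sequence numbers in the text start at 1; Python strings are indexed from 0.
    return arrangement[sequence_number - 1]


# The two examples from the text: the 4th frame of IBPBPIBPP and the 1st frame of IPPBPP.
print(frame_type_from_arrangement("IBPBPIBPP", 4))  # B
print(frame_type_from_arrangement("IPPBPP", 1))     # I
```

The range check reflects the statement that the arrangement is at least as long as the pre-encoding queue, so a valid sequence number always has an entry.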
The frame type of an encoded image frame refers to the frame type of an image frame that the encoding device has finished encoding. In some embodiments, the encoding device counts the frame types of encoded image frames using variables. For example, a first variable, a second variable, and a third variable are set in the encoding device, where the first variable records the number of encoded image frames whose frame type is I frame, the second variable records the number of encoded image frames whose frame type is P frame, and the third variable records the number of encoded image frames whose frame type is B frame. After encoding a target image frame, the encoding device updates the value of the variable corresponding to its frame type. For example, if the current value of the second variable is 3, then after the encoding device encodes a target image frame whose frame type is P frame, the value of the second variable is updated to 4.
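As a rough sketch of this bookkeeping, the three variables can be held in a single counter keyed by frame type. The use of `collections.Counter` and the function name are illustrative assumptions, not part of the application.

```python
from collections import Counter


def record_encoded_frame(counts: Counter, frame_type: str) -> None:
    """Increment the counter for the frame type of a frame that was just encoded."""
    if frame_type not in ("I", "P", "B"):
        raise ValueError(f"unknown frame type: {frame_type}")
    counts[frame_type] += 1


# Example from the text: the P-frame count is 3 before encoding another P frame.
encoded_counts = Counter({"I": 1, "P": 3, "B": 0})
record_encoded_frame(encoded_counts, "P")
print(encoded_counts["P"])  # 4
```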
In some embodiments, the predefined frame type configuration information is determined by the encoding device according to requirements such as encoding quality and video playback quality; how it is determined depends on the actual situation and is not limited in this application. In some embodiments, the predefined frame type configuration information includes 5 I frames and the encoded image frames include 1 I frame; the encoding device acquires a target image frame to be encoded and stores the target image frame in the pre-encoding queue, and when the number of image frames in the pre-encoding queue is less than the size of the pre-encoding queue, the encoding device determines, based on the predefined frame type configuration information, that the frame type of the target image frame is an I frame. In some embodiments, when the encoding device stores an acquired target image frame in the pre-encoding queue, the encoding device determines the frame type of the target image frame according to the predefined frame type configuration information, the frame types of the encoded image frames, and the length of the pre-encoding queue. For example, the predefined frame type configuration information includes 3 I frames, the encoded image frames include 2 I frames, the pre-encoding queue currently holds 8 frames, and the size of the pre-encoding queue is 10; based on this information, the encoding device determines that the frame type of the target image frame is an I frame.
In some embodiments, the predefined frame type configuration information includes the number of frames of at least one frame type and the ratio between the numbers of frames of two different frame types. For example, a piece of predefined frame type configuration information includes 3 I frames and the constraint that the number of I frames divided by the number of P frames is less than or equal to 4. In some embodiments, in consideration of factors such as the compression rate of video encoding, the encoding device configures predefined frame type configuration information that includes 3 I frames and the constraint that the number of I frames divided by the number of P frames is less than or equal to 4. The encoded image frames include 3 I frames and 6 P frames. The encoding device acquires a target image frame to be encoded, and when the number of reference image frames of the target image frame does not satisfy the analysis condition, it determines, based on the above information, that the frame type of the target image frame is a P frame.
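The text gives outcomes for these examples but not the exact decision rule, so the following is only one possible reading, sketched under stated assumptions: an I frame is chosen while fewer I frames have been encoded than the configured number, and a P frame is chosen once that number is reached or once another I frame would push the I-to-P ratio above the configured limit. The function name and parameter names are hypothetical.

```python
from typing import Optional


def choose_frame_type(
    configured_i_frames: int,
    encoded_i_frames: int,
    encoded_p_frames: int,
    max_i_to_p_ratio: Optional[float] = None,
) -> str:
    """Pick I or P from predefined configuration info and encoded-frame counts.

    Assumed rule (not stated in the source): prefer I until the configured
    number of I frames is reached, subject to an optional I/P ratio limit.
    """
    if encoded_i_frames >= configured_i_frames:
        return "P"
    if (
        max_i_to_p_ratio is not None
        and encoded_p_frames > 0
        and (encoded_i_frames + 1) / encoded_p_frames > max_i_to_p_ratio
    ):
        return "P"
    return "I"


# The three examples from the text.
print(choose_frame_type(5, 1, 0))        # I
print(choose_frame_type(3, 2, 0))        # I
print(choose_frame_type(3, 3, 6, 4.0))   # P
```

Under this assumed rule the three worked examples in the text produce the stated frame types; other rules consistent with the same examples are possible.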
Step 330: when the number of reference image frames of the target image frame does not satisfy the analysis condition, determine a predefined frame type as the frame type of the target image frame.
Step 330 includes the following sub-steps. Step 332: obtain predefined encoding parameter configuration information, which includes at least one set of correspondences between frame types and encoding parameters. Step 334: from the encoding parameter configuration information, determine the encoding parameters corresponding to the frame type of the target image frame as the encoding parameters of the target image frame.
The encoding parameters include the bit rate, the QP (quantization parameter), and the like.
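Sub-steps 332 and 334 amount to a table lookup keyed by frame type. The sketch below assumes the configuration is a dictionary from frame type to a small record holding bit rate and QP; the record type, the names, and every numeric value are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingParams:
    bitrate_kbps: int  # hypothetical unit and value
    qp: int            # quantization parameter


# Step 332: predefined correspondence between frame types and encoding parameters.
# All values below are made up for the example.
CODING_PARAM_CONFIG = {
    "I": CodingParams(bitrate_kbps=1200, qp=24),
    "P": CodingParams(bitrate_kbps=600, qp=28),
    "B": CodingParams(bitrate_kbps=300, qp=30),
}


def coding_params_for(frame_type: str) -> CodingParams:
    """Step 334: return the encoding parameters that correspond to the frame type."""
    try:
        return CODING_PARAM_CONFIG[frame_type]
    except KeyError:
        raise ValueError(f"no encoding parameters configured for {frame_type!r}") from None


print(coding_params_for("P"))
```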
In some embodiments, the encoding device acquires a target image frame to be encoded and stores it in the pre-encoding queue. When the length of the pre-encoding queue is less than the rated length, the encoding device determines the frame type of the target image frame from the predefined frame type arrangement IPPIPP and the sequence number 5 of the target image frame, and determines the encoding parameters of the target image frame from the predefined encoding parameters.
In some embodiments, the encoding device acquires a target image frame to be encoded and stores it in the pre-encoding queue. The encoding device determines whether the reference image frames of the target image frame satisfy the analysis condition, that is, whether the number of image frames stored in the pre-encoding queue has reached the size of the pre-encoding queue. When the number of reference image frames of the target image frame does not satisfy the analysis condition, that is, when the number of image frames in the pre-encoding queue is less than the size of the pre-encoding queue, the encoding device determines the predefined encoding parameters as the encoding parameters of the target image frame.
In some embodiments, the encoding device obtains the encoding parameters of the target image frame through a rate control model. For details of how the encoding device determines the encoding parameters of the target image frame through the rate control model, refer to the embodiments below in which the target image frame is encoded in the second manner.
Step 340: when the frame type of the target image frame is a key frame, encode the target image frame in an intra-frame prediction mode according to the encoding parameters of the target image frame; or, when the frame type of the target image frame is a non-key frame, encode the target image frame in an inter-frame prediction mode according to the encoding parameters of the target image frame.
In some embodiments, the encoding device encodes the target image frame according to its frame type and encoding parameters. For the specific encoding process, refer to the detailed description in the background section, which is not repeated here. Different encoding methods may be used to encode the target image frame depending on the signal type of the target image frame and the encoding characteristics of the encoding device; the specific encoding method is not limited in this application.
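The branch in step 340 can be sketched as a small dispatch on frame type. This assumes, as the text does elsewhere, that the I frame is the key frame and that P and B frames are non-key frames; the function name and the returned labels are illustrative only.

```python
def prediction_mode_for(frame_type: str) -> str:
    """Choose the prediction mode of step 340 from the frame type."""
    if frame_type == "I":
        # Key frame: encoded from its own content only.
        return "intra"
    if frame_type in ("P", "B"):
        # Non-key frame: encoded with reference to other frames.
        return "inter"
    raise ValueError(f"unknown frame type: {frame_type}")


print(prediction_mode_for("I"))  # intra
print(prediction_mode_for("P"))  # inter
```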
Please refer to FIG. 4, which shows a schematic diagram of a video encoding method provided by another embodiment of this application. In some embodiments, the encoding device acquires a target image frame to be encoded and determines that the number of reference image frames of the target image frame does not satisfy the analysis condition. As shown in FIG. 4, the image frame to be encoded is the third image frame, the number of its reference image frames is 2 (the first frame and the second frame), and the analysis condition is 30. The encoding device determines in the first manner that the frame type of the third image frame is an I frame (key frame), and obtains the encoding parameters of the third image frame through the rate control model. The encoding device performs intra-frame predictive encoding on the third image frame to obtain residual values, applies encoding processing such as quantization and entropy encoding, and finally obtains the code stream corresponding to the third image frame. The encoding device outputs the code stream corresponding to the target image frame and prepares to acquire a new target image frame.
In some embodiments, after acquiring a target image frame to be encoded, the encoding device adds the target image frame to the pre-encoding queue. When the pre-encoding queue has not reached the rated length, the encoding device determines in the first manner that the frame type of the target image frame is a P frame, and determines the encoding parameters of the target image frame through a bit rate prediction model. The encoding device adds the frame type and encoding parameters of the target image frame to the pre-encoding queue. The encoding device performs inter-frame prediction and the like on the target image frame to obtain residual values, applies encoding processing such as quantization and entropy encoding, and finally obtains the code stream corresponding to the target image frame, which it then outputs. Optionally, when the length of the pre-encoding queue is less than the rated length, target image frames that have already been encoded remain stored in the pre-encoding queue.
When the number of reference image frames of the target image frame does not satisfy the analysis condition, the encoding device does not use reference image frames to predict the frame type and encoding parameters of the target image frame, and instead determines the frame type directly from the predefined frame type arrangement. This saves the time the encoding device spends determining the frame type of the target image frame, and avoids the encoding time wasted when the encoding device must wait until it has acquired a sufficient number of reference image frames before it can predict the frame type and encoding parameters of the target image frame.
The following embodiments describe the video encoding method for the case in which the number of reference image frames of the target image frame satisfies the analysis condition.
In some embodiments, after the target image frame to be encoded is acquired, the method further includes: when the number of reference image frames satisfies the analysis condition, determining the frame type and encoding parameters of the target image frame in a second manner, where the second manner refers to determining the frame type and encoding parameters based on the reference image frames.
In the second manner, before formally encoding the target image frame, the encoding device predicts the frame type and encoding parameters of the target image frame from the target image frame and its reference image frames, so as to obtain a better compression rate, reduce the distortion caused by video encoding, and ensure that the code stream obtained from video encoding yields video pictures with a stable bit rate after decoding.
In some embodiments, after acquiring the target image frame, the encoding device adds it to the pre-encoding queue, and when the length of the pre-encoding queue reaches the rated length, the encoding device determines the frame type and encoding parameters of the target image frame in the second manner. In some embodiments, apart from the target image frame added by the encoding device, the other image frames in the pre-encoding queue are image frames that have already been encoded; optionally, the pre-encoding queue also contains the frame types and encoding parameters of those other image frames. In some embodiments, all image frames in the pre-encoding queue other than the target image frame are reference image frames of the target image frame.
In some embodiments, determining the frame type and encoding parameters of the target image frame in the second manner includes: determining the frame type of the target image frame according to the reference image frames of the target image frame; and determining the encoding parameters of the target image frame according to configuration parameters and/or the frame type of the target image frame, where the configuration parameters are used to constrain the bit rate of the target image frame.
In some embodiments, determining the frame type of the target image frame according to the reference image frames includes: generating multiple frame type arrangements, each of which includes one assumed frame type of the target image frame and at least one frame type following the assumed frame type; calculating the distortion rate corresponding to each of the frame type arrangements, where the distortion rate corresponding to a frame type arrangement is determined by encoding the target image frame and at least one reference image frame according to that arrangement; determining, from the multiple frame type arrangements, the target frame type arrangement with the smallest distortion rate; and determining the assumed frame type in the target frame type arrangement as the frame type of the target image frame.
In some embodiments, the assumed frame type of the target image frame may be an I frame, a P frame, or a B frame. The at least one frame type following the assumed frame type refers to the assumed frame types of the reference image frames, determined according to the assumed frame type of the target image frame. When determining the frame type of a target image frame to be encoded according to the reference image frames, the encoding device first assumes a frame type for the target image frame, then determines the assumed frame types of the reference image frames one by one based on that assumption, and finally obtains multiple frame type arrangements. Because the assumed frame type of the target image frame affects the assumed frame types subsequently determined for the reference image frames, assigning different assumed frame types to the target image frame produces different frame type arrangements. For example, a target image frame to be encoded has 5 reference image frames. The assumed frame type of the target image frame is set to I frame; based on the picture content and the assumed frame type of the target image frame, the assumed frame type of the first reference image frame is determined to be a B frame; based on the picture content and assumed frame type of the target image frame and the picture content and assumed frame type of the first reference image frame, the assumed frame type of the second reference image frame is determined to be a B frame; these steps are repeated until the assumed frame types of all reference image frames of the target image frame have been determined, yielding the frame type arrangement IBBPBB. The assumed frame type of the target image frame is then set to B frame and the above steps are repeated to determine the assumed frame types of all reference image frames, yielding the frame type arrangement BIPBPP.
The encoding device calculates the distortion rate corresponding to each of the frame type arrangements, where the distortion rate of a frame type arrangement is determined by the encoding device performing rough encoding on the target image frame and the reference image frames based on that arrangement. The smaller the distortion rate of a frame type arrangement, the better the encoding quality of that arrangement. The encoding device selects the frame type arrangement with the smallest distortion rate as the target frame type arrangement, and takes the assumed frame type of the target image frame in the target frame type arrangement as the frame type of the target image frame. In some embodiments, the encoding parameters of the target image frame are determined through a rate control model.
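The selection step can be sketched as a minimum search over candidate arrangements. The rough encoding that produces each distortion rate is not modeled here; the sketch assumes the distortion rates are already available as numbers, and the values used in the example are invented. The function name is hypothetical, and the first letter of each arrangement is taken to be the assumed frame type of the target image frame, as in the IBBPBB and BIPBPP examples.

```python
from typing import Dict


def frame_type_with_least_distortion(distortion_by_arrangement: Dict[str, float]) -> str:
    """Return the target frame's type from the arrangement with the smallest distortion rate."""
    if not distortion_by_arrangement:
        raise ValueError("at least one frame type arrangement is required")
    # Pick the arrangement whose (precomputed) distortion rate is lowest.
    best_arrangement = min(distortion_by_arrangement, key=distortion_by_arrangement.get)
    # The first entry of the arrangement is the assumed type of the target frame.
    return best_arrangement[0]


# Distortion rates below are hypothetical; only the arrangements come from the text.
candidates = {"IBBPBB": 0.12, "BIPBPP": 0.09}
print(frame_type_with_least_distortion(candidates))  # B
```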
In video files and real-time live streaming scenarios, the content changes infrequently between adjacent video image frames, so predicting the frame type of the target image frame from its forward reference image frames yields a reasonably reliable frame type prediction.
In some embodiments, the encoding parameters of the target image frame include at least the bit rate and the quantization parameter of the target image frame. Determining the encoding parameters of the target image frame according to the configuration parameters and/or the frame type of the target image frame includes: determining the bit rate of the target image frame according to the configuration parameters, and determining the quantization parameter of the target image frame according to the frame type of the target image frame; or, determining both the bit rate and the quantization parameter of the target image frame according to the configuration parameters.
The encoding device may set different configuration parameters according to different encoding requirements. In some embodiments, to keep the bit rate of the encoded video stable, the encoding device uses an ABR (Average Bitrate) control model as the configuration parameters. The encoding device may also use rate control models such as CRF (Constant Rate Factor), CQP (Constant Quantizer Parameter), and R-QP (Rate-Quantization Parameter) as the configuration parameters. The configuration parameters are determined according to conditions such as network communication quality and encoding requirements, which is not limited in this application.
In some embodiments, the encoding device determines the configuration parameters based on the R-QP rate control model. The configuration parameters include an initial QP and the quadratic-model formula for calculating the bit rate R_i:
Figure PCTCN2022118265-appb-000001
Here, R_i denotes the bit rate of the target image frame; a_0, a_1, and a_i represent information related to the content of the target image frame; and Q_i is the distortion rate obtained after encoding the target image frame according to the initial QP.
After obtaining the bit rate of the target image frame, the encoding device determines the QP of the target image frame according to its frame type during RDO (Rate-Distortion Optimization).
In some embodiments, the encoding device determines the configuration parameters based on CQP. Under the CQP method, the encoding device configures a fixed QP in the configuration parameters. Optionally, the encoding device configures different QPs for different frame types; for example, it configures the QP of target image frames whose frame type is P frame as a first fixed QP. The encoding device calculates the bit rate of the target image frame according to the fixed QP in the configuration parameters.
It should be noted that any method used to determine the bit rate and quantization parameter of a target image frame during pre-encoding of a video file may be used in this application.
Please refer to FIG. 5, which shows a schematic diagram of a video encoding method provided by another embodiment of this application.
After acquiring a target image frame, the encoding device adds it to the pre-encoding queue. The length of the pre-encoding queue is n and the rated length is n, where n is a positive integer. During the period from 0 to T1, the number of frames in the pre-encoding queue is less than n, and for each acquired target image frame to be encoded the encoding device uses the first manner to determine the frame type and encoding parameters of that target image frame. In one example, frame 1 enters the pre-encoding queue; the number of frames in the queue is 1, which is less than n, so the analysis condition is not satisfied. The encoding device determines in the first manner that the frame type of the target image frame is an I frame, determines the encoding parameters of frame 1 based on the rate control model, encodes frame 1 to obtain the corresponding code stream, and outputs the code stream. After time T1, the encoding device acquires the nth target image frame; the number of frames in the pre-encoding queue is now n, so the analysis condition is satisfied, and the encoding device uses the second manner to determine the frame type and encoding parameters of the nth target image frame. In another example, the encoding device adds frame n+1 to the pre-encoding queue, which then contains frames 2 through n+1. The encoding device generates m frame type arrangements from frames 2 through n+1, where m is a positive integer, calculates the encoding distortion rate corresponding to each of the m arrangements, determines the arrangement with the smallest distortion rate as the target frame type arrangement, and determines the frame type of frame n+1 according to the target frame type arrangement. The encoding device determines the encoding parameters of frame n+1 through the rate control model, encodes frame n+1 according to the predicted frame type and encoding parameters to obtain the corresponding code stream, and outputs the code stream. The encoding device then removes frame 2 from the pre-encoding queue and waits for frame n+2 to enter the pre-encoding queue.
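The switch between the two manners in FIG. 5 can be sketched as a sliding window over frame numbers. This is a simplified illustration, assuming the queue stores only frame numbers and that the oldest frame is removed after each frame handled in the second manner; frame type decisions and actual encoding are left out, and the function name is hypothetical.

```python
from collections import deque
from typing import List, Tuple


def manner_per_frame(num_frames: int, n: int) -> List[Tuple[int, str]]:
    """Report, for frames 1..num_frames, which manner a queue of rated length n selects."""
    if n < 1:
        raise ValueError("rated length n must be a positive integer")
    queue = deque()
    decisions = []
    for frame_no in range(1, num_frames + 1):
        queue.append(frame_no)
        if len(queue) < n:
            # Analysis condition not satisfied: predefined frame type (first manner).
            decisions.append((frame_no, "first"))
        else:
            # Queue holds n frames: predict from reference frames (second manner).
            decisions.append((frame_no, "second"))
            # Remove the earliest frame so the next frame can enter the queue.
            queue.popleft()
    return decisions


print(manner_per_frame(5, 3))
# [(1, 'first'), (2, 'first'), (3, 'second'), (4, 'second'), (5, 'second')]
```

With n = 3, frames 1 and 2 are handled in the first manner and every frame from frame 3 onward in the second manner, which mirrors the behavior before and after time T1.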
Please refer to FIG. 6, which shows a schematic diagram of a video encoding process provided by another embodiment of this application.
In a real-time live streaming scenario, the encoding device obtains a target image frame to be encoded through a picture capture device and adds the target image frame to the pre-encoding queue. The encoding device determines whether the length of the pre-encoding queue satisfies the analysis requirement; in some embodiments, the analysis requirement is that the number of image frames stored in the pre-encoding queue reaches the size of the pre-encoding queue. When the length of the pre-encoding queue does not satisfy the analysis requirement, the encoding device determines the frame type and encoding parameters of the target image frame in the first manner, adds the frame type and encoding parameters of the target image frame to the pre-encoding queue, encodes the target image frame according to the frame type and encoding parameters to obtain the corresponding code stream, and outputs the code stream. When the length of the pre-encoding queue satisfies the analysis requirement, the encoding device predicts the frame type and encoding parameters of the target image frame in the second manner, so that the predicted frame type produces the least distortion during encoding. The encoding device adds the frame type and encoding parameters of the target image frame to the pre-encoding queue, encodes the target image frame according to the frame type and encoding parameters to obtain the corresponding code stream, and outputs the code stream. After obtaining the frame type and encoding parameters of the target image frame in the second manner, the encoding device removes the image frame that was added to the pre-encoding queue earliest, so that the next target image frame obtained from the picture capture device can be added to the pre-encoding queue. After encoding one target image frame, the encoding device determines whether to repeat the above encoding process. If the encoding device receives an exit instruction, for example an exit instruction triggered by a user leaving the video playback interface on a terminal device, or an exit instruction triggered by the picture capture device ending its work, the encoding device stops the above encoding process.
With the above method, when the number of reference image frames of the target image frame does not satisfy the analysis condition, the first manner is used to determine the frame type and encoding parameters of the target image frame directly, which reduces the waiting time of the encoding device and lowers the encoding delay. When the number of reference image frames of the target image frame satisfies the analysis condition, the second manner is used to predict the frame type of the target image frame from the reference image frames, which helps keep the bit rate stable during encoding and avoids fluctuations in inter-frame quality. Because the content of image frames in a video file or a live streaming scene changes infrequently, that is, scene switches are rare, the target image frame is strongly correlated with its adjacent reference image frames. Therefore, in this method, the second manner, which predicts the frame type of the target image frame from forward reference image frames, achieves an effect close to that of predicting the frame type from backward reference image frames while reducing the encoding delay.
虽然,在使用第一方式确定直接确定目标图像帧的帧类型和编码参数时,会造成帧间质量的波动,但是在视频流进行编码的过程中,按照第一方式确定帧类型和编码参数的图像帧的数量远小于按照第二方式确定帧类型和编码参数的图像帧的数量,因此采用本方法建立了合理的码率控制模型,进行在视频播放的初始的很短一段时间内会因此帧间质量的波动,此后均能保持的视频画面的帧间质量的平稳性,并且编码延时低,在直播和RTC等场景中具有很高的使用性。Although, when using the first method to directly determine the frame type and coding parameters of the target image frame, it will cause fluctuations in the quality between frames, but in the process of encoding the video stream, the first method is used to determine the frame type and coding parameters. The number of image frames is much smaller than the number of image frames that determine the frame type and encoding parameters according to the second method, so this method is used to establish a reasonable code rate control model, and the frame will be reduced in a short period of time during the initial period of video playback. Inter-frame quality fluctuations, and the stability of the inter-frame quality of video images can be maintained afterwards, and the encoding delay is low, which is highly usable in live broadcast and RTC scenarios.
The following are apparatus embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present application.
Refer to FIG. 7, which shows a block diagram of a video encoding apparatus provided by an embodiment of the present application. The apparatus has the function of implementing the above video encoding method; the function may be implemented by hardware, or by hardware executing corresponding software. The apparatus may be the electronic device described above, or may be provided in the electronic device. The apparatus 700 may include an image frame acquisition module 710, a first encoding determination module 720, and an image frame encoding module 730.
The image frame acquisition module 710 is configured to acquire a target image frame to be encoded.
The first encoding determination module 720 is configured to determine the frame type and encoding parameters of the target image frame in a first manner when the number of reference image frames of the target image frame does not satisfy an analysis condition, where the first manner is a manner of determining the frame type and encoding parameters without relying on the reference image frames.
The image frame encoding module 730 is configured to encode the target image frame according to the frame type and encoding parameters of the target image frame.
In some embodiments, as shown in FIG. 8, the first encoding determination module 720 includes: a first frame type determination unit 722, configured to determine a predefined frame type as the frame type of the target image frame; and a first parameter determination unit 724, configured to determine a predefined encoding parameter as the encoding parameter of the target image frame.
In some embodiments, the first frame type determination unit 722 is configured to: acquire a predefined frame type arrangement, which is a predetermined ordering of the frame types corresponding to a plurality of image frames, and determine the frame type of the target image frame from the predefined frame type arrangement based on the sequence number of the target image frame; or acquire predefined frame type configuration information, which includes a configured quantity for at least one frame type, and determine the frame type of the target image frame based on the frame types of already-encoded image frames and the predefined frame type configuration information.
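The two alternatives for unit 722 can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, frame types are represented as the strings "I" and "P", the arrangement is assumed to repeat cyclically once the sequence number runs past its end, and the configured quantities are assumed to describe one repeating group of frames filled in configuration order.

```python
from collections import Counter


def frame_type_from_arrangement(arrangement, sequence_number):
    """Look up the frame type at the frame's position in a predefined arrangement.

    Assumption: the arrangement repeats cyclically.
    """
    return arrangement[sequence_number % len(arrangement)]


def frame_type_from_counts(config, encoded_types):
    """Pick the next frame type from configured per-type quantities.

    config maps frame type -> quantity per group, e.g. {"I": 1, "P": 3}.
    encoded_types lists the frame types of already-encoded frames, in order.
    """
    group_size = sum(config.values())
    # Frames that belong to the group currently being filled.
    filled = len(encoded_types) % group_size
    current_group = encoded_types[len(encoded_types) - filled:]
    used = Counter(current_group)
    for frame_type, quantity in config.items():
        if used[frame_type] < quantity:
            return frame_type
```

For example, with the arrangement "IPPP", sequence numbers 0 and 4 map to "I" and all others to "P"; with the configuration {"I": 1, "P": 3}, the same pattern follows from counting the frames already encoded.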
In some embodiments, the first parameter determination unit 724 is configured to: acquire predefined encoding parameter configuration information, which includes at least one correspondence between a frame type and encoding parameters; and determine, from the encoding parameter configuration information, the encoding parameters corresponding to the frame type of the target image frame as the encoding parameters of the target image frame.
The image frame encoding module 730 is configured to: when the frame type of the target image frame is a key frame, encode the target image frame in an intra-frame prediction mode according to the encoding parameters of the target image frame; or, when the frame type of the target image frame is a non-key frame, encode the target image frame in an inter-frame prediction mode according to the encoding parameters of the target image frame.
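A minimal sketch of the parameter lookup and the prediction-mode selection follows. It is not taken from the application: `EncodingParams`, `PARAM_CONFIG`, `KEY_FRAME_TYPES`, and `plan_encoding` are hypothetical names, the quantization parameter `qp` is merely one plausible example of an encoding parameter, and the numeric values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingParams:
    qp: int  # quantization parameter; an assumed example of an encoding parameter


# Hypothetical configuration: one entry per frame type.
PARAM_CONFIG = {
    "I": EncodingParams(qp=26),
    "P": EncodingParams(qp=30),
}

# Assumption: only "I" frames are treated as key frames.
KEY_FRAME_TYPES = {"I"}


def plan_encoding(frame_type, param_config=PARAM_CONFIG):
    """Return the prediction mode and encoding parameters for a frame type."""
    params = param_config[frame_type]   # correspondence: frame type -> parameters
    mode = "intra" if frame_type in KEY_FRAME_TYPES else "inter"
    return mode, params
```

Keeping the correspondence in a plain mapping means the first manner needs no analysis of other frames: once the frame type is known, the parameters and the prediction mode follow from a lookup.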
In some embodiments, as shown in FIG. 8, the apparatus 700 further includes a second encoding determination module 740, configured to determine the frame type and encoding parameters of the target image frame in a second manner when the number of reference image frames satisfies the analysis condition, where the second manner is a manner of determining the frame type and encoding parameters according to the reference image frames.
In some embodiments, the second encoding determination module 740 includes: a second frame type determination unit 742, configured to determine the frame type of the target image frame according to the reference image frames of the target image frame; and a second parameter determination unit 744, configured to determine the encoding parameters of the target image frame according to configuration parameters and/or the frame type of the target image frame, where the configuration parameters are used to constrain the bit rate of the target image frame.
In some embodiments, the second frame type determination unit 742 is configured to: generate a plurality of frame type arrangements, each of which includes one hypothetical frame type for the target image frame and at least one frame type following the hypothetical frame type; calculate the distortion rate corresponding to each of the frame type arrangements, where the distortion rate corresponding to a frame type arrangement is determined by encoding the target image frame and at least one reference image frame according to that arrangement; determine, from the plurality of frame type arrangements, the target frame type arrangement with the smallest distortion rate; and determine the hypothetical frame type in the target frame type arrangement as the frame type of the target image frame.
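The selection performed by unit 742 can be sketched as an exhaustive search. This is a hedged illustration rather than the application's implementation: `choose_frame_type` is a hypothetical name, the candidate arrangements are assumed to be all combinations of the candidate frame types, and `distortion_of` is a caller-supplied stand-in for trial-encoding the target image frame and its reference image frames under a given arrangement.

```python
from itertools import product


def choose_frame_type(candidate_types, following_count, distortion_of):
    """Pick the target frame's type from the lowest-distortion arrangement.

    candidate_types: frame types to try, e.g. ("I", "P").
    following_count: number of frame types placed after the hypothetical one
                     (at least 1).
    distortion_of:   callable mapping an arrangement (tuple of frame types)
                     to its distortion rate.
    """
    best_arrangement = None
    best_distortion = None
    # Position 0 of each arrangement is the hypothetical type of the target frame.
    for arrangement in product(candidate_types, repeat=following_count + 1):
        distortion = distortion_of(arrangement)
        if best_distortion is None or distortion < best_distortion:
            best_arrangement, best_distortion = arrangement, distortion
    return best_arrangement[0], best_arrangement
```

The number of arrangements grows exponentially with `following_count`, so a practical encoder would likely prune or bound the search; the sketch keeps the full enumeration for clarity.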
In some embodiments, as shown in FIG. 8, the apparatus 700 further includes an image frame adding module 750, configured to: add the target image frame to a pre-encoding queue; when the length of the pre-encoding queue is less than a rated length, determine that the number of reference image frames of the target image frame does not satisfy the analysis condition; and when the length of the pre-encoding queue is equal to the rated length, determine that the number of reference image frames of the target image frame satisfies the analysis condition.
It should be noted that, when the apparatus provided by the above embodiments implements its functions, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided above belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
In an exemplary embodiment, a computer device is also provided. The computer device includes a processor and a memory, and the memory stores a computer program. The computer program is configured to be executed by one or more processors to implement the above video encoding method.
The computer device may be referred to as a video encoding device and is used to encode video files so as to reduce the resources consumed in transmitting them.
In an exemplary embodiment, a computer-readable storage medium is also provided. The storage medium stores a computer program, and the computer program implements the above video encoding method when executed by a processor of a computer device.
Optionally, the computer-readable storage medium may be a storage device such as a ROM (Read-Only Memory) or a RAM (Random Access Memory).
In an exemplary embodiment, a computer program product is also provided. When the computer program product runs on a computer device, it causes the computer device to execute the above video encoding method.
It should be understood that "a plurality of" as used herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B exist together, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. In addition, the step numbers used herein show only one possible order of execution. In some other embodiments, the steps may be executed out of numerical order; for example, two differently numbered steps may be executed at the same time, or in the reverse of the order shown in the figures. The embodiments of the present application do not limit this.

Claims (13)

  1. A video encoding method, executed by a computer device, the method comprising:
    acquiring a target image frame to be encoded;
    when the number of reference image frames of the target image frame does not satisfy an analysis condition, determining the frame type and encoding parameters of the target image frame in a first manner, wherein the first manner is a manner of determining the frame type and encoding parameters without relying on the reference image frames; and
    encoding the target image frame according to the frame type and encoding parameters of the target image frame.
  2. The method according to claim 1, wherein determining the frame type and encoding parameters of the target image frame in the first manner comprises:
    determining a predefined frame type as the frame type of the target image frame; and
    determining a predefined encoding parameter as the encoding parameter of the target image frame.
  3. The method according to claim 2, wherein determining the predefined frame type as the frame type of the target image frame comprises:
    acquiring a predefined frame type arrangement, wherein the predefined frame type arrangement is a predetermined ordering of the frame types corresponding to a plurality of image frames, and determining the frame type of the target image frame from the predefined frame type arrangement based on the sequence number of the target image frame;
    or,
    acquiring predefined frame type configuration information, wherein the predefined frame type configuration information includes a configured quantity for at least one frame type, and determining the frame type of the target image frame based on the frame types of already-encoded image frames and the predefined frame type configuration information.
  4. The method according to claim 2, wherein determining the predefined encoding parameter as the encoding parameter of the target image frame comprises:
    acquiring predefined encoding parameter configuration information, wherein the encoding parameter configuration information includes at least one correspondence between a frame type and encoding parameters; and
    determining, from the encoding parameter configuration information, the encoding parameters corresponding to the frame type of the target image frame as the encoding parameters of the target image frame.
  5. The method according to claim 1, wherein encoding the target image frame according to the frame type and encoding parameters of the target image frame comprises:
    when the frame type of the target image frame is a key frame, encoding the target image frame in an intra-frame prediction mode according to the encoding parameters of the target image frame;
    or,
    when the frame type of the target image frame is a non-key frame, encoding the target image frame in an inter-frame prediction mode according to the encoding parameters of the target image frame.
  6. The method according to claim 1, further comprising, after acquiring the target image frame to be encoded:
    when the number of reference image frames satisfies the analysis condition, determining the frame type and encoding parameters of the target image frame in a second manner, wherein the second manner is a manner of determining the frame type and encoding parameters according to the reference image frames.
  7. The method according to claim 6, wherein determining the frame type and encoding parameters of the target image frame in the second manner comprises:
    determining the frame type of the target image frame according to the reference image frames; and
    determining the encoding parameters of the target image frame according to configuration parameters and/or the frame type of the target image frame, wherein the configuration parameters are used to constrain the bit rate of the target image frame.
  8. The method according to claim 7, wherein determining the frame type of the target image frame according to the reference image frames comprises:
    generating a plurality of frame type arrangements, each of which includes one hypothetical frame type for the target image frame and at least one frame type following the hypothetical frame type;
    calculating the distortion rate corresponding to each of the frame type arrangements, wherein the distortion rate corresponding to a frame type arrangement is determined by encoding the target image frame and at least one reference image frame according to that arrangement;
    determining, from the plurality of frame type arrangements, the target frame type arrangement with the smallest distortion rate; and
    determining the hypothetical frame type in the target frame type arrangement as the frame type of the target image frame.
  9. The method according to any one of claims 1 to 8, further comprising, after acquiring the target image frame to be encoded:
    adding the target image frame to a pre-encoding queue;
    when the length of the pre-encoding queue is less than a rated length, determining that the number of reference image frames of the target image frame does not satisfy the analysis condition; and
    when the length of the pre-encoding queue is equal to the rated length, determining that the number of reference image frames of the target image frame satisfies the analysis condition.
  10. A video encoding apparatus, comprising:
    an image frame acquisition module, configured to acquire a target image frame to be encoded;
    a first encoding determination module, configured to determine the frame type and encoding parameters of the target image frame in a first manner when the number of reference image frames of the target image frame does not satisfy an analysis condition, wherein the first manner is a manner of determining the frame type and encoding parameters without relying on the reference image frames; and
    an image frame encoding module, configured to encode the target image frame according to the frame type and encoding parameters of the target image frame.
  11. A computer device, comprising a processor and a memory, wherein the memory stores a computer program, and the computer program is loaded and executed by the processor to implement the method according to any one of claims 1 to 9.
  12. A computer-readable storage medium storing a computer program, wherein the computer program is loaded and executed by a processor to implement the method according to any one of claims 1 to 9.
  13. A computer program product which, when run on a computer device, causes the computer device to execute the method according to any one of claims 1 to 9.
PCT/CN2022/118265 2021-10-12 2022-09-09 Video encoding method and apparatus, device, and storage medium WO2023061129A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111188246.2 2021-10-12
CN202111188246.2A CN113973202A (en) 2021-10-12 2021-10-12 Video encoding method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023061129A1 true WO2023061129A1 (en) 2023-04-20

Family

ID=79587246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118265 WO2023061129A1 (en) 2021-10-12 2022-09-09 Video encoding method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN113973202A (en)
WO (1) WO2023061129A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113973202A (en) * 2021-10-12 2022-01-25 百果园技术(新加坡)有限公司 Video encoding method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009284149A (en) * 2008-05-21 2009-12-03 Panasonic Corp Image encoding processing apparatus
CN102413326A (en) * 2010-09-26 2012-04-11 华为技术有限公司 Video coding and decoding method and device
CN104486040A (en) * 2014-12-15 2015-04-01 西安电子科技大学 Efficient coding-aware routing method based on cache management
CN109983775A (en) * 2016-12-30 2019-07-05 深圳市大疆创新科技有限公司 The system and method sent for the data based on feedback
CN112040233A (en) * 2020-11-04 2020-12-04 北京金山云网络技术有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, electronic device, and storage medium
CN113973202A (en) * 2021-10-12 2022-01-25 百果园技术(新加坡)有限公司 Video encoding method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113973202A (en) 2022-01-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22880071

Country of ref document: EP

Kind code of ref document: A1