CN113038179A — Video encoding method, video decoding method, video encoding device, video decoding device and electronic equipment
- Publication number: CN113038179A
- Application number: CN202110217976.4A
- Authority
- CN
- China
- Prior art keywords
- video frame
- video
- frame
- information
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
Abstract
The application discloses a video encoding method, a video decoding method, a video encoding device, a video decoding device and electronic equipment, and belongs to the field of communication technologies. The method comprises the following steps: acquiring a first video frame of a video to be processed; performing luminance-chrominance YUV encoding on the first video frame; acquiring a second video frame of the video to be processed, wherein the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer; and performing only luminance Y encoding on the second video frame. With the video encoding method of the embodiments of the application, when bandwidth is limited there is no need to adapt to the bandwidth by reducing the resolution of each video frame, so the definition of video playing can be ensured.
Description
Technical Field
The present application belongs to the field of communication technologies, and in particular relates to a video encoding method, a video decoding method, a video encoding device, a video decoding device, and an electronic device.
Background
With the rapid development of communication technology, electronic devices such as smart phones and tablet computers are becoming increasingly popular and have become indispensable tools in people's daily lives. Video recording is one of the main functions of such electronic devices: it allows a user to dynamically record everything around them, improving the user experience of the device.
At present, when encoding video frames, the bandwidth available for transmitting the encoded video frame data may be limited; alternatively, real-time performance or fluency of video playing may be required, or storage space may need to be saved. To meet the video bitrate requirement in these situations, the resolution of the video frames is generally reduced during encoding so that the encoded data occupies less space, which in turn reduces the definition of video playing.
Disclosure of Invention
An object of the embodiments of the present application is to provide a video encoding method, a video decoding method, a video encoding device, a video decoding device, and an electronic device, which can solve the problem that bandwidth limitations force current electronic devices to reduce the resolution of video frames, resulting in low definition of video playing.
In order to solve the technical problem, the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a video encoding method, including:
acquiring a first video frame of a video to be processed;
performing luminance-chrominance YUV encoding on the first video frame;
acquiring a second video frame of the video to be processed, wherein the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer;
performing only luminance Y encoding on the second video frame.
In a second aspect, an embodiment of the present application provides a video decoding method, including:
acquiring first video frame data and second video frame data of a video to be processed;
decoding the first video frame data in the video to be processed to obtain first YUV information of a first video frame, wherein the first video frame data is generated after YUV coding is performed on the first video frame;
decoding second video frame data in the video to be processed to obtain Y information of a second video frame, wherein the second video frame data is generated after Y coding is performed on the second video frame;
and carrying out color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame to obtain a second video frame with color.
In a third aspect, an embodiment of the present application provides a video encoding apparatus, including:
the first video frame acquisition module is used for acquiring a first video frame of a video to be processed;
a YUV encoding module for performing luminance-chrominance YUV encoding on the first video frame;
a second video frame obtaining module, configured to obtain a second video frame of the video to be processed, where the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer;
and a Y encoding module, configured to perform only luminance Y encoding on the second video frame.
In a fourth aspect, an embodiment of the present application provides a video decoding apparatus, including:
the video frame data acquisition module is used for acquiring first video frame data and second video frame data of a video to be processed;
the first decoding module is used for decoding the first video frame data in the video to be processed to obtain first YUV information of a first video frame, wherein the first video frame data is generated after YUV coding is performed on the first video frame;
the second decoding module is configured to decode second video frame data in the video to be processed to obtain Y information of a second video frame, where the second video frame data is data generated by Y-coding the second video frame;
and the color transfer module is used for carrying out color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame to obtain a second video frame with color.
In a fifth aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect or the second aspect.
In a sixth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect or the second aspect.
In a seventh aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect or the second aspect.
In the embodiment of the application, YUV encoding is performed on a first video frame of a video to be processed, and only Y encoding is performed on a second video frame associated with the first video frame, so that by using the video encoding method in the embodiment of the application, only Y encoding is performed on the second video frame, and therefore, the occupied space of data of the encoded second video frame can be effectively reduced; under the condition of ensuring video definition, the space can be saved, the coding and decoding speed of video frames can be improved, and real-time playing during video recording can be ensured; under the condition of limited bandwidth, the bandwidth is not required to be adapted by reducing the resolution of each video frame, and the definition of video playing can be further ensured.
Drawings
Fig. 1 is a schematic flowchart of a video encoding method according to an embodiment of the present application;
fig. 2 is a schematic diagram of encoding a video frame based on h.265 according to an embodiment of the present application;
fig. 3 is a schematic diagram of an encoding process of a video frame provided by an embodiment of the present application;
fig. 4 is a flowchart illustrating a video decoding method according to an embodiment of the present application;
fig. 5 is a schematic diagram of a decoding process of a video frame provided by an embodiment of the present application;
fig. 6 is a schematic diagram of a process for calculating RGB according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present application;
fig. 9 is a first schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
fig. 10 is a second schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It should be appreciated that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Moreover, the terms "first", "second", and the like do not limit the number of objects; for example, the first object may be one or more than one. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the preceding and succeeding objects.
The video coding method provided by the embodiments of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Referring to fig. 1, which is a schematic flowchart of a video encoding method provided in an embodiment of the present application and applied to an electronic device, as shown in fig. 1, the video encoding method includes the following steps:
101, acquiring a first video frame of a video to be processed;
102, performing luminance-chrominance YUV encoding on the first video frame;
103, acquiring a second video frame of the video to be processed;
104, performing only luminance Y encoding on the second video frame.
Based on this, YUV encoding is performed on the first video frame of the video to be processed, and only Y encoding is performed on the second video frame associated with it. Because only Y encoding is performed on the second video frame, the space occupied by the encoded second video frame data can be effectively reduced. While video definition is ensured, space is saved, the encoding and decoding speed of video frames is improved, and real-time playing during video recording is guaranteed; the definition of video playing can thus be ensured.
In step 101, in the transmission process of the video to be processed, the electronic device may acquire a first video frame of the video to be processed.
The video to be processed may be a video that has been recorded and stored in the electronic device, and then, the obtaining of the first video frame of the video to be processed may be extracting a frame of video frame in the video to be processed as the first video frame. Or, the to-be-processed video may also be a video being recorded, and then the acquiring of the first video frame of the to-be-processed video may include: in the process of recording a video to be processed, a first video frame is collected.
In addition, the first video frame may be any video frame in the video to be processed. For example, in a case that the video to be processed is a video being recorded, the first video frame may be a current video frame of the recorded video; or any video frame extracted from a video stored in the electronic device.
Of course, the first video frame may be a video frame satisfying a preset condition. Specifically, the first video frame may be the 1st video frame acquired in the Mth video frame acquisition period, where the electronic device acquires N+1 video frames in each video frame acquisition period, and M and N are positive integers.
Alternatively, the first video frame may be a video frame determined according to the chrominance of the video frames. More specifically, acquiring the first video frame may include: determining the kth video frame of the video to be processed as the first video frame in the case that the chrominance difference value between the kth video frame and the (k-1)th video frame exceeds a preset chrominance range, where k is an integer greater than 1. That is, a video frame whose chrominance difference from the previous video frame exceeds the preset chrominance range is determined as a first video frame requiring YUV encoding.
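As a rough illustration of this chrominance check, the following sketch compares successive frames (a minimal sketch; the mean-absolute-difference metric over the U and V planes and the threshold value are assumptions, since the application does not fix a specific chrominance-difference measure):

```python
import numpy as np

def is_first_video_frame(curr_uv: np.ndarray, prev_uv: np.ndarray,
                         preset_chroma_range: float = 8.0) -> bool:
    """Return True if the current frame's chrominance differs from the
    previous frame's by more than the preset range, i.e. the frame should
    be treated as a first video frame and YUV-encoded.

    curr_uv, prev_uv: stacked U and V planes of frames k and k-1,
    arrays of identical shape with 8-bit sample values.
    """
    diff = np.mean(np.abs(curr_uv.astype(np.int16) - prev_uv.astype(np.int16)))
    return diff > preset_chroma_range
```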
It should be noted that the video frame of the video to be processed may be a video frame in any YUV format, and specifically, the YUV format of the video frame of the video to be processed may be any one format of 4:4:4, 4:2:2, 4:2:0, and 4:1:1, and the like.
In step 102, after obtaining the first video frame, the electronic device may perform luminance-chrominance (YUV) encoding on it, where "Y" represents luminance (i.e., the gray-scale value) and "U" and "V" represent chrominance, which describes the color and saturation of an image and specifies the color of each pixel. That is, the color image information of the first video frame is encoded, and the encoded color image information includes both the luminance information and the chrominance information of the image in the first video frame.
The encoding of the color image information of the first video frame may use any video compression technique, for example the H.264 or H.265 video coding standards.
Taking H.265 as an example, H.265 divides an image into coding tree units (CTUs) rather than the 16×16 macroblocks of H.264. The size of a coding tree unit may be set to 64×64, or limited to 32×32 or 16×16, depending on the encoder settings; larger coding tree units can provide higher compression efficiency (at the cost of longer encoding time). Each coding tree unit may be recursively partitioned using a quadtree structure, for example into 32×32, 16×16, and 8×8 sub-regions; fig. 2 shows an example partition of a 64×64 coding tree unit (note that fig. 2 actually represents a color image). Each image is further divided into specific groups of coding tree units, called slices (Slices) and tiles (Tiles). The coding tree unit is the basic coding unit of H.265, analogous to the macroblock of H.264, and may be partitioned downward into coding units (CUs), prediction units (PUs), and transform units (TUs).
Each coding tree unit contains 1 luma coding tree block, 2 chroma coding tree blocks, and syntax elements recording additional information. Assuming the video frame is compressed with YUV 4:2:0 color sampling, a 16×16 coding tree unit contains one 16×16 luma coding tree block and two 8×8 chroma coding tree blocks. When the first video frame is encoded with H.265, both the luma and chroma coding tree blocks in each coding tree unit are retained after encoding (i.e., the encoded color image information includes luminance information and chrominance information).
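The space saving that motivates Y-only encoding is visible directly in the sampling structure. The sketch below compares raw (pre-compression) frame sizes; actual compressed sizes depend on the codec, so the numbers are only indicative:

```python
def raw_plane_bytes(width: int, height: int, y_only: bool) -> int:
    """Raw (uncompressed) size of one frame in bytes, 8 bits per sample.

    YUV 4:2:0 stores a full-resolution Y plane plus U and V planes
    subsampled by 2 in each dimension, i.e. 1.5 bytes per pixel;
    a Y-only frame stores just the luma plane, i.e. 1 byte per pixel.
    """
    y_bytes = width * height
    if y_only:
        return y_bytes
    uv_bytes = 2 * (width // 2) * (height // 2)
    return y_bytes + uv_bytes

# For a 1920x1080 frame: 3,110,400 bytes with YUV 4:2:0 versus
# 2,073,600 bytes with Y only -- a one-third reduction before compression.
print(raw_plane_bytes(1920, 1080, y_only=False))  # 3110400
print(raw_plane_bytes(1920, 1080, y_only=True))   # 2073600
```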
It should be noted that, after the electronic device performs YUV encoding on the first video frame, video frame data of the first video frame may be obtained, and the electronic device may send or buffer the video frame data.
For example, in the process of recording the video, the electronic device may buffer the first video frame data, so that in the case that the electronic device subsequently receives an instruction to play the recorded video, the electronic device decodes the first video frame data to display the first video frame.
In step 103, the electronic device may further obtain a second video frame of the video to be processed.
It should be noted that, the acquiring of the second video frame may be performed after the acquiring of the first video frame, for example, in a video recording process, the second video frame is a video frame acquired after the first video frame; alternatively, the acquiring of the second video frame may be performed simultaneously with the acquiring of the first video frame, for example, in a case that the to-be-processed video is a stored video, the electronic device may extract a plurality of video frames of the to-be-processed video at a time, where the plurality of video frames include the first video frame and the second video frame, which is not limited herein.
In an embodiment of the present application, the second video frame is any one of N video frames, where the N video frames are at least one video frame associated with the first video frame.
The association between the first video frame and the N frames of video frames may be a preset association relationship between the first video frame and the N frames of video frames, and specifically, a time interval between each video frame in the N frames of video frames and the first video frame may be less than or equal to a preset time duration, that is, the N frames of video frames and the first video frame are collected video frames within the preset time duration.
It should be noted that the preset time period may be a time period set according to actual needs. For example, the preset time period may be 0.5s, 1s, or 2s, and so on.
As mentioned above, the first video frame may be a video frame acquired in a video recording process, and then, any one of the N video frames may also be a video frame acquired in a video recording process and associated with the first video frame; as also described above, if the time interval between each of the N frames of video frames and the first video frame is less than or equal to the preset time duration, then the second video frame may also be acquired before or after the first video frame, and the time interval between the second video frame and the first video frame is less than or equal to the preset time duration, that is:
the step 101 may include:
collecting the first video frame in the process of recording the video to be processed;
the second video frame is acquired before or after the first video frame, and the time interval between the second video frame and the first video frame is less than or equal to a preset time length.
Therefore, in the process of recording the video, the image difference between the video frames acquired within the preset time length is small, so that the video frame acquired within the preset time length with the first video frame is used as the second video frame, and the video coding quality can be improved.
In this embodiment, the second video frame may be a video frame captured before the first video frame, for example, the second video frame may be a video frame before the first video frame, or the first video frame is a last video frame within the preset time length, and so on.
Alternatively, the second video frame may also be a video frame captured after the first video frame, for example, the second video frame may be a video frame subsequent to the first video frame, or the first video frame may be a 1 st video frame within the preset time length, and so on.
Specifically, the first video frame is the 1st video frame acquired in the Mth video frame acquisition period, and the electronic device acquires N+1 video frames in each video frame acquisition period, where M is a positive integer;
the acquiring the second video frame of the video to be processed may include:
determining an i +1 frame video frame acquired in the Mth video frame acquisition period as a second video frame, wherein i is an integer less than or equal to N.
Based on this, the number of video frames subjected to YUV coding can be further reduced by taking the 1 st frame of video frame in each acquisition period in the video recording process as the first video frame, and determining other video frames (namely, video frames in the N frames of video frames) in the acquisition period as the second video frame, namely, only the 1 st frame of video frame is subjected to YUV coding in each acquisition period, and the other video frames are subjected to Y coding; under the condition of ensuring video definition, the space can be saved, the coding and decoding speed of video frames can be improved, and real-time playing during video recording can be ensured; under the condition of limited transmission bandwidth, the definition of video playing can be ensured; in addition, the encoding process is periodically carried out on the video frames of the video to be processed, and the video encoding quality can be improved.
It should be noted that the Mth video frame acquisition period may be any video frame acquisition period in the video recording process, and the electronic device may acquire N+1 video frames in each video frame acquisition period.
For example, as shown in fig. 3, in a case where N is 3, that is, the mth video frame capture period includes the 1 st frame video frame, the 2 nd frame video frame, the 3 rd frame video frame, and the 4 th frame video frame, the electronic device may use the 1 st frame video frame as the first video frame, and use the 2 nd frame video frame, the 3 rd frame video frame, and the 4 th frame video frame as the second video frame, respectively.
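A minimal sketch of this periodic classification follows (the encode_yuv and encode_y helpers are hypothetical stand-ins for a real encoder configured to keep or omit the chrominance planes, which the application does not specify):

```python
def encode_yuv(frame: dict) -> bytes:
    # Stand-in for a real YUV encode (e.g., H.265 keeping luma and chroma CTBs).
    return frame["y"].tobytes() + frame["u"].tobytes() + frame["v"].tobytes()

def encode_y(frame: dict) -> bytes:
    # Stand-in for a Y-only encode: the chroma planes are simply not coded.
    return frame["y"].tobytes()

def encode_recording(frames: list, period_len: int) -> list:
    """In each acquisition period of period_len = N + 1 frames, YUV-encode
    the 1st frame (the first video frame) and Y-encode the remaining N
    frames (the second video frames), as in fig. 3 where N = 3."""
    out = []
    for idx, frame in enumerate(frames):
        if idx % period_len == 0:   # 1st frame of the period
            out.append(encode_yuv(frame))
        else:                       # frames 2 to N+1 of the period
            out.append(encode_y(frame))
    return out
```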
Of course, the first video frame may also be any video frame that satisfies a preset condition during video recording, and the second video frame may be a video frame that is collected after the first video frame, does not satisfy the preset condition, and is associated with the first video frame.
Specifically, the YUV encoding of the first video frame may include:
under the condition that the brightness value of the first video frame is detected to be larger than or equal to a preset brightness value, carrying out YUV coding on the first video frame;
the Y-only encoding the second video frame in the case that it is determined that the second video frame is associated with the first video frame includes:
and only performing Y encoding on the second video frame under the condition that the first brightness value of the brightness values of the second video frame is determined to be smaller than the preset brightness value and no video frame with the brightness value larger than or equal to the preset brightness value exists between the second video frame and the first video frame.
Based on this, YUV encoding is performed on the video frame whose brightness value is greater than or equal to the preset brightness value during video recording (namely, the first video frame), and Y encoding is performed on each subsequently collected video frame whose brightness value is smaller than the preset brightness value (namely, the second video frame), so that not only can the space occupied by the recorded video be further reduced, but the encoding modes in the video encoding process are also more flexible and diversified.
For example, during video recording, if the brightness value of the collected ith video frame is greater than or equal to the preset brightness value, where i is a positive integer, YUV encoding is performed on the ith video frame; if the brightness values of the collected (i+1)th and (i+2)th video frames are smaller than the preset brightness value, Y encoding is performed on the (i+1)th and (i+2)th video frames; if the brightness values of the collected (i+3)th and (i+4)th video frames are both greater than or equal to the preset brightness value, YUV encoding is performed on the (i+3)th and (i+4)th video frames; if the brightness value of the collected (i+5)th video frame is smaller than the preset brightness value, Y encoding is performed on the (i+5)th video frame; and so on.
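This luminance-driven variant could be sketched as follows (the frame-average luma metric is an assumption; the application speaks only of "the brightness value" of a video frame):

```python
import numpy as np

def classify_by_luma(frames, preset_luma: float):
    """Yield ('YUV', frame) for frames at or above the preset brightness
    value and ('Y', frame) for frames below it, per the variant above."""
    for frame in frames:
        mean_luma = float(np.mean(frame["y"]))  # assumed brightness metric
        if mean_luma >= preset_luma:
            yield ("YUV", frame)   # encode as a first video frame
        else:
            yield ("Y", frame)     # encode as a second video frame
```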
It should be noted that, after the electronic device encodes the second video frame, video frame data of the second video frame may be obtained, and the electronic device may cache the video frame data of the second video frame, so that the electronic device may play back the second video frame; alternatively, the video frame data of the second video frame may be transmitted to other devices so that other electronic devices can view the second video frame, which is not limited herein.
Note that, the step 103 may be executed after the step 102, before the step 102, or simultaneously with the step 101, or simultaneously with the step 102, and the present application is for explaining the video encoding method, and only a case where the step 103 is executed after the step 102 is shown in fig. 1, and is not limited thereto.
In step 104, after acquiring the second video frame, the electronic device may perform Y-coding on the second video frame, that is, coding on the grayscale image information of the second video frame, where the coded grayscale image information includes luminance information of an image in the second video frame, but does not include chrominance information, so that a space occupied by video frame data of the coded second video frame may be reduced.
Please refer to fig. 4, which is a schematic flowchart of a video decoding method provided in an embodiment of the present application and applied to an electronic device. As shown in fig. 4, the video decoding method includes the following steps:
401, acquiring first video frame data and second video frame data of a video to be processed;
402, decoding the first video frame data to obtain first YUV information of a first video frame;
403, decoding the second video frame data to obtain Y information of a second video frame;
404, performing color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame, to obtain a second video frame with color.
Based on this, when the video to be processed is played, the YUV information of the YUV-encoded first video frame is used during decoding to restore the color image of the second video frame associated with it, so that the second video frame can be displayed as a color image and the video playing quality is ensured.
In step 401, in the process of playing the to-be-processed video by the electronic device, the electronic device may obtain first video frame data and second video frame data of the to-be-processed video.
In this embodiment of the application, the first video frame data is video frame data obtained by YUV coding a first video frame, the second video frame data is video frame data obtained by Y coding a second video frame, and a correlation exists between a first video frame corresponding to the first video frame data and a second video frame corresponding to the second video frame data.
The obtaining of the first video frame data and the second video frame data may be that the electronic device reads the first video frame data and the second video frame data in a cache; alternatively, the first video frame data and the second video frame data transmitted by other electronic devices in real time may be received.
In addition, the electronic device executing the video decoding method may be the same electronic device as the electronic device executing the video encoding method in fig. 1. For example, in the process of recording the to-be-processed video by the electronic device, the recorded video frame is encoded to obtain video frame data, and the video frame data is buffered, and in the process of playing back the to-be-processed video by the electronic device, the electronic device decodes the buffered video frame data.
Alternatively, the electronic device executing the video decoding method and the electronic device executing the video encoding method in fig. 1 may be different electronic devices. For example, the electronic device 1 may encode the recorded video frame to obtain video frame data in the process of recording the to-be-processed video, and transmit the video frame data to the electronic device 2, and the electronic device 2 may decode the video frame data after receiving the video frame data transmitted by the electronic device 1, so as to play the to-be-processed video.
In step 402, after the electronic device acquires the first video frame data, it may decode the first video frame data to obtain the YUV information of the first video frame (i.e., the first YUV information); similarly, in step 403, after the electronic device acquires the second video frame data, it may decode the second video frame data to obtain the Y information of the second video frame.
It should be noted that the decoding of the first video frame data and the decoding of the second video frame data are each performed after the corresponding video frame data is acquired. The decoding of the first video frame data may be performed before the decoding of the second video frame data, as shown in fig. 4, after it, or simultaneously with it.
In other words, the electronic device may start decoding the first video frame data as soon as it is acquired, without waiting for the second video frame data to be acquired; the decoding of the second video frame data is performed after the second video frame data is acquired, which may occur before or after the decoding of the first video frame data is completed. This is not limited herein.
Each piece of video frame data may be decoded by a preset decoding method. That is, the first video frame data contains chrominance coding units, so the color image information of the first video frame can be obtained after decoding; whereas the second video frame data contains only luminance coding units and no chrominance coding units, so only the gray-scale image information of the second video frame can be obtained after decoding.
In addition, the first YUV information may include luminance information and chrominance information of the first video frame; and the Y information may be luminance information including only the second video frame.
In step 404, once the electronic device has obtained the first YUV information of the first video frame and the Y information of the second video frame, it may calculate the YUV information of the second video frame, or the color display information associated with that YUV information, by a preset method.
For example, the electronic device may directly use chrominance information in the first YUV information of the first video frame as chrominance information of a second video frame associated with the first video frame, so that the YUV information of the color image restored by the second video frame includes Y information (i.e., luminance information) of the second video frame and the chrominance information of the first video frame.
In the process of displaying an image, the YUV information of the image is usually converted into red-green-blue (RGB) information for each pixel, so as to display a color image.
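For reference, a common per-pixel YUV-to-RGB conversion is the BT.601 full-range formula shown below (an assumption; the application does not mandate a particular color matrix):

```python
def yuv_to_rgb(y: float, u: float, v: float) -> tuple:
    """Convert one 8-bit YUV sample to RGB using BT.601 full-range
    coefficients (an assumption; other standards use other matrices)."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, round(x)))
    return clamp(r), clamp(g), clamp(b)
```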
Specifically, the performing the color transfer processing on the second video frame may include:
and calculating to obtain the red, green and blue (RGB) information of the second video frame based on a preset deep neural network model.
Based on the above, the RGB information of the second video frame is obtained through calculation of the neural network model, so that the color image of the second video frame after color transmission is closer to the actually acquired color image, and the image quality in the video playing process is further improved.
It should be noted that the deep neural network model may be preset in the electronic device before the color transfer processing is performed on the second video frame, and may be obtained by training on a large amount of video frame data. The training process may include: inputting the samples of a training set into an initial deep neural network model and iteratively updating the weights of the model until its loss no longer changes, finally obtaining the deep neural network model. Each sample in the training set may include at least one third video frame with YUV information, a fourth video frame that has only Y information and is associated with each third video frame, and label data for each fourth video frame, where the label data may be the actual YUV information of that fourth video frame.
In this embodiment of the application, the calculating to obtain the RGB information of the second video frame based on the preset deep neural network model may include:
inputting the first YUV information and the Y information of the second video frame into the deep neural network model, and calculating to obtain RGB information of the second video frame; or,
inputting second YUV information and the Y information of the second video frame into the deep neural network model, and calculating the RGB information of the second video frame, where the second YUV information is the YUV information of the preceding frame among the N-1 video frames collected after the first video frame, and the YUV information of the 1st of those video frames is calculated based on the first YUV information and the Y information of that 1st video frame.
Based on the above, in the process of color transfer processing of the second video frame, the color transfer can be directly performed by using the YUV information of the first video frame; or, under the condition that at least one frame of video frame is spaced between the second video frame and the first video frame, the YUV information of the previous frame of video frame can be adopted for color transfer, so that the color transfer processing is more flexible and diversified.
Taking the decoding process shown in fig. 5 as an example, suppose the electronic device decodes the video frames acquired in the Mth video frame acquisition period, that this period includes the 1st, 2nd, 3rd, and 4th video frames, and that the 1st video frame is YUV-encoded while the 2nd, 3rd, and 4th video frames are Y-encoded. When displaying the 2nd, 3rd, and 4th video frames, the electronic device performs Y decoding on each of them. The YUV information of the 1st video frame and the Y information of any one of the 2nd to 4th video frames may then be input to the deep neural network model, and the RGB information of the corresponding video frame calculated, thereby realizing color transfer processing for each of the 2nd to 4th video frames and displaying the video frames after the color transfer processing;
alternatively, the YUV information of the 1st video frame and the Y information of the 2nd video frame may be input to the deep neural network model to calculate the RGB information of the 2nd video frame; the YUV information corresponding to the calculated RGB information of the 2nd video frame, together with the Y information of the 3rd video frame, is then input to the model to calculate the RGB information of the 3rd video frame; and the YUV information corresponding to the calculated RGB information of the 3rd video frame, together with the Y information of the 4th video frame, is input to calculate the RGB information of the 4th video frame, thereby realizing color transfer processing for every Y-encoded video frame.
In addition, calculating the RGB information of the second video frame based on the preset deep neural network model may proceed as follows: the decoded YUV information of the first video frame (or of the previous video frame) is converted into RGB components; the Y channel of the second video frame is concatenated with them (for example, using a Concat operation) to form a 4-channel input; the input is passed through the neural network model; and an image of RGB components is output, where the output image corresponds to the colored second video frame (i.e., the second video frame with color), as shown in fig. 6.
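A minimal sketch of such a network in PyTorch (the layer count, channel widths, and kernel sizes are assumptions; the application specifies only the 4-channel RGB-plus-Y input and the 3-channel RGB output shown in fig. 6):

```python
import torch
import torch.nn as nn

class ColorTransferNet(nn.Module):
    """Takes the reference frame's RGB (3 channels) concatenated with the
    current frame's Y plane (1 channel) and predicts the current frame's RGB."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, kernel_size=3, padding=1), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, ref_rgb: torch.Tensor, cur_y: torch.Tensor) -> torch.Tensor:
        x = torch.cat([ref_rgb, cur_y], dim=1)  # the 4-channel Concat input
        return self.body(x)

# Usage: ref_rgb is (B, 3, H, W), cur_y is (B, 1, H, W), both scaled to [0, 1].
net = ColorTransferNet()
out_rgb = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
```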
It should be noted that, in the video encoding method provided in the embodiment of the present application, the execution subject may be a video encoding apparatus, or a control module for the video encoding method in the video encoding apparatus. In the embodiment of the present application, a video encoding apparatus executing a video encoding method is taken as an example to describe the video encoding apparatus provided in the embodiment of the present application.
Referring to fig. 7, an embodiment of the present application provides a video encoding apparatus, as shown in fig. 7, the video encoding apparatus 700 includes:
a first video frame obtaining module 701, configured to obtain a first video frame of a video to be processed;
a YUV encoding module 702, configured to perform luminance-chrominance YUV encoding on the first video frame;
a second video frame obtaining module 703, configured to obtain a second video frame of the video to be processed, where the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer;
a Y-coding module 704, configured to only perform luma Y-coding on the second video frame.
Therefore, by performing YUV encoding on the first video frame of the video to be processed and performing Y encoding only on the second video frame associated with the first video frame, the video encoding method in the embodiment of the application can effectively reduce the occupied space of the data of the encoded second video frame due to the fact that only the Y encoding is performed on the second video frame, so that under the condition that the bandwidth is limited, the bandwidth does not need to be adapted by reducing the resolution of each video frame, and the definition of the video frame can be further ensured.
Optionally, the first video frame obtaining module 701 is specifically configured to:
collecting the first video frame in the process of recording the video to be processed;
the second video frame is acquired before or after the first video frame, and the time interval between the second video frame and the first video frame is less than or equal to a preset time length.
Therefore, in the process of recording the video, the image difference between the video frames acquired within the preset time length is small, so that the video frame acquired within the preset time length with the first video frame is used as the second video frame, and the video coding quality can be improved.
Optionally, the first video frame is the 1st video frame acquired in the Mth video frame acquisition period, and the electronic device acquires N+1 video frames in each video frame acquisition period;
the second video frame obtaining module 703 is specifically configured to:
determining an i +1 frame video frame acquired in the Mth video frame acquisition period as a second video frame, wherein i is an integer less than or equal to N.
Based on this, the number of video frames subjected to YUV coding can be further reduced by taking the 1 st frame of video frame in each acquisition period in the video recording process as the first video frame and determining other video frames in the acquisition period as the second video frame, namely only the 1 st frame of video frame is subjected to YUV coding in each acquisition period, and the other video frames are subjected to Y coding; under the condition of ensuring video definition, the space can be saved, the coding and decoding speed of video frames can be improved, and real-time playing during video recording can be ensured; under the condition of limited transmission bandwidth, the definition of video playing can be ensured; in addition, the encoding process is periodically carried out on the video frames of the video to be processed, and the video encoding quality can be improved.
Optionally, the first video frame obtaining module 701 is specifically configured to:
determining a kth frame video frame of the video to be processed as a first video frame under the condition that a chromaticity difference value between the kth frame video frame and a kth-1 frame video frame of the video to be processed exceeds a preset chromaticity range, wherein k is an integer greater than 1;
the second video frame obtaining module 703 is specifically configured to:
determining the ith video frame of the video to be processed as a second video frame when detecting that the chrominance difference value between the ith video frame and the kth video frame is within the preset chrominance range and no target video frame exists between the ith video frame and the kth video frame, where the target video frame is a video frame whose chrominance difference value from the kth video frame exceeds the preset chrominance range, and i is an integer greater than k.
Based on this, YUV encoding is performed on the video frame whose chrominance difference from the previous frame exceeds the preset chrominance range (namely, the first video frame), and Y encoding is performed on each subsequent associated video frame whose chrominance difference stays within the preset chrominance range (namely, the second video frame), so that not only can the space occupied by the recorded video be further reduced, but the encoding modes in the video encoding process are also more flexible and diversified.
It should be noted that, in the video decoding method provided in the embodiment of the present application, the execution subject may be a video decoding apparatus, or a control module for the video decoding method in the video decoding apparatus. In the embodiment of the present application, a video decoding apparatus executing a video decoding method is taken as an example to describe the video decoding apparatus provided in the embodiment of the present application.
Referring to fig. 8, an embodiment of the present application provides a video decoding apparatus, as shown in fig. 8, the video decoding apparatus 800 includes:
a video frame data obtaining module 801, configured to obtain first video frame data and second video frame data of a video to be processed;
a first decoding module 802, configured to decode the first video frame data in the video to be processed to obtain first YUV information of a first video frame, where the first video frame data is generated after YUV coding is performed on the first video frame;
a second decoding module 803, configured to decode the second video frame data in the video to be processed to obtain Y information of a second video frame, where the second video frame data is generated after Y encoding the second video frame;
and a color transfer module 804, configured to perform color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame, so as to obtain a second video frame with a color.
Based on the method, under the condition that the video to be processed is played, the YUV information of the first video frame subjected to YUV coding is used for restoring the color image of the second video frame associated with the first video frame in the decoding process, so that the second video frame can be displayed in the color image, and the video playing quality is ensured.
Optionally, the color transfer module 804 is specifically configured to:
and calculating to obtain the red, green and blue (RGB) information of the second video frame based on a preset deep neural network model.
Based on the above, the RGB information of the second video frame is obtained through calculation of the neural network model, so that the color image of the second video frame after color transmission is closer to the actually acquired color image, and the image quality in the video playing process is further improved.
Optionally, N is greater than 1;
the color transfer module 804 is specifically configured to:
inputting the first YUV information and the Y information of the second video frame into the deep neural network model, and calculating to obtain RGB information of the second video frame; or,
inputting second YUV information and the Y information of the second video frame into the deep neural network model, and calculating the RGB information of the second video frame, where the second YUV information is the YUV information of the preceding frame among the N-1 video frames collected after the first video frame, and the YUV information of the 1st of those video frames is calculated based on the first YUV information and the Y information of that 1st video frame.
Based on the above, in the process of color transfer processing of the second video frame, the color transfer can be directly performed by using the YUV information of the first video frame; or, under the condition that at least one frame of video frame is spaced between the second video frame and the first video frame, the YUV information of the previous frame of video frame can be adopted for color transfer, so that the color transfer processing is more flexible and diversified.
The video encoding apparatus and the video decoding apparatus in the embodiments of the present application may be apparatuses, or may be components, integrated circuits, or chips in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The video encoding apparatus and the video decoding apparatus in the embodiments of the present application may be apparatuses having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The video encoding device provided in this embodiment of the present application can implement each process implemented in the method embodiment of fig. 1, and the video decoding device provided in this embodiment of the present application can implement each process implemented in the method embodiment of fig. 4, and for avoiding repetition, details are not repeated here.
Optionally, as shown in fig. 9, an electronic device 900 is further provided in this embodiment of the present application, where the electronic device 900 includes a processor 901, a memory 902, and a program or an instruction stored in the memory 902 and executable on the processor 901, and when the program or the instruction is executed by the processor 901, the program or the instruction implements each process of the above-mentioned video encoding method embodiment or the above-mentioned video decoding method embodiment, and can achieve the same technical effect, and is not described here again to avoid repetition.
It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 10 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and a processor 1010.
Those skilled in the art will appreciate that the electronic device 1000 may further comprise a power source (e.g., a battery) for supplying power to the various components, and the power source may be logically connected to the processor 1010 through a power management system, so as to manage charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 10 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than shown, combine some components, or arrange the components differently, and the description is not repeated here.
Wherein, the processor 1010 is configured to:
acquiring a first video frame of a video to be processed;
performing luminance-chrominance YUV encoding on the first video frame;
acquiring a second video frame of the video to be processed, wherein the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer;
performing only luminance Y encoding on the second video frame.
Therefore, by performing YUV encoding on the first video frame of the video to be processed and performing Y encoding only on the second video frame associated with the first video frame, the video encoding method in the embodiment of the application can effectively reduce the occupied space of the data of the encoded second video frame due to the fact that only the Y encoding is performed on the second video frame, so that under the condition that the bandwidth is limited, the bandwidth does not need to be adapted by reducing the resolution of each video frame, and the definition of the video frame can be further ensured.
Optionally, the processor 1010 is specifically configured to:
collect the first video frame in the process of recording the video to be processed, where the second video frame is acquired before or after the first video frame and the time interval between the second video frame and the first video frame is less than or equal to a preset time length.
During recording, video frames captured within the preset time length of one another differ little in image content; therefore, taking a frame captured within the preset time length of the first video frame as the second video frame improves the video encoding quality.
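A minimal sketch of this time-interval rule, assuming millisecond timestamps and a hypothetical preset length of 100 ms (both values are illustrative, not from the patent):

```python
PRESET_INTERVAL_MS = 100  # hypothetical "preset time length"

def is_second_frame(t_first_ms: int, t_candidate_ms: int) -> bool:
    """A frame captured within the preset interval of the first video
    frame, before or after it, qualifies as a second video frame."""
    return abs(t_candidate_ms - t_first_ms) <= PRESET_INTERVAL_MS

print(is_second_frame(1000, 1080))  # True: within 100 ms
print(is_second_frame(1000, 1200))  # False: needs its own YUV frame
```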
Optionally, the first video frame is the 1st video frame acquired in an Mth video frame acquisition period, and the electronic device acquires N+1 video frames in each video frame acquisition period;
the processor 1010 is specifically configured to:
determine the (i+1)th video frame acquired in the Mth video frame acquisition period as a second video frame, where i is an integer less than or equal to N.
On this basis, taking the 1st video frame of each acquisition period during recording as the first video frame and the remaining frames of that period as second video frames further reduces the number of YUV-encoded frames: in each period, only the 1st frame is YUV-encoded and the others are Y-encoded. While video definition is preserved, this saves space, speeds up the encoding and decoding of video frames, and supports real-time playback during recording; under limited transmission bandwidth, playback definition is likewise ensured. Moreover, because the encoding proceeds periodically over the frames of the video to be processed, the video encoding quality is improved.
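A minimal sketch of this periodic scheme, assuming frames arrive as a plain sequence and N is known (the function name is an illustration, not from the patent):

```python
def classify_frame(frame_index: int, n: int) -> str:
    """Classify a frame within its acquisition period of n + 1 frames.

    The 1st frame of each period is the YUV-encoded first video frame;
    the remaining n frames are Y-encoded second video frames.
    """
    return "YUV" if frame_index % (n + 1) == 0 else "Y"

# With n = 3, each period of 4 frames is encoded as YUV, Y, Y, Y.
print([classify_frame(i, 3) for i in range(8)])
# ['YUV', 'Y', 'Y', 'Y', 'YUV', 'Y', 'Y', 'Y']
```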
Optionally, the processor 1010 is specifically configured to:
determine the kth video frame of the video to be processed as a first video frame when the chrominance difference between the kth video frame and the (k-1)th video frame of the video to be processed exceeds a preset chrominance range, where k is an integer greater than 1; and
determine the ith video frame of the video to be processed as a second video frame upon detecting that the chrominance difference between the ith video frame and the kth video frame is within the preset chrominance range and that no target video frame exists between them, where the target video frame is a video frame whose chrominance difference from the kth video frame exceeds the preset chrominance range, and i is an integer greater than k.
On this basis, a frame whose chrominance departs from the preceding frame beyond the preset range during recording is YUV-encoded as a first video frame, and each later frame whose chrominance stays within the preset range of that frame is Y-encoded as a second video frame. This not only further reduces the space occupied by the recorded video but also makes the encoding modes in the video encoding process more flexible and diversified.
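One plausible realization of the chrominance-difference test, sketched here under assumptions the patent does not fix: chrominance is compared as the mean absolute difference of the subsampled U and V planes, and the threshold value is hypothetical:

```python
import numpy as np

CHROMA_THRESHOLD = 8.0  # hypothetical bound of the "preset chrominance range"

def chroma_difference(u1, v1, u2, v2) -> float:
    """Mean absolute chrominance difference between two YUV frames."""
    du = np.abs(u1.astype(np.int16) - u2.astype(np.int16)).mean()
    dv = np.abs(v1.astype(np.int16) - v2.astype(np.int16)).mean()
    return float(du + dv) / 2.0

def starts_new_first_frame(u_prev, v_prev, u_cur, v_cur) -> bool:
    """A frame whose chroma departs from the previous frame beyond the
    preset range becomes a new YUV-encoded first video frame."""
    return chroma_difference(u_prev, v_prev, u_cur, v_cur) > CHROMA_THRESHOLD
```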
Alternatively, the processor 1010 is configured to:
acquire first video frame data and second video frame data of a video to be processed;
decode the first video frame data to obtain first YUV information of a first video frame, where the first video frame data is generated by YUV-encoding the first video frame;
decode the second video frame data to obtain Y information of a second video frame, where the second video frame data is generated by Y-encoding the second video frame; and
perform color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame, to obtain a colored second video frame.
In this way, when the video to be processed is played, the YUV information of the YUV-encoded first video frame is used during decoding to restore the color image of each associated second video frame, so the second video frames can be displayed in color and the video playback quality is ensured.
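The decode-side flow can be sketched as follows; the stand-in decoders and the simplest possible transfer (reusing the reference frame's chroma outright, where the patent's neural-network step is richer) are assumptions for illustration:

```python
def decode_yuv(payload):
    # Placeholder: a real decoder would parse the bitstream here.
    return payload  # assume payload is already a (y, u, v) tuple

def decode_y(payload):
    # Placeholder: yields only the luma plane.
    return payload

def naive_color_transfer(reference_yuv, y_plane):
    # Simplest transfer: the new frame's luma with the reference chroma.
    _, u_ref, v_ref = reference_yuv
    return (y_plane, u_ref, v_ref)

def decode_video(frames):
    """YUV frames refresh the chroma reference; Y-only frames are
    recolored from the most recent first video frame. The stream is
    assumed to open with a YUV-encoded first video frame."""
    reference = None
    output = []
    for kind, payload in frames:
        if kind == "YUV":
            reference = decode_yuv(payload)
            output.append(reference)
        else:
            output.append(naive_color_transfer(reference, decode_y(payload)))
    return output
```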
Optionally, the processor 1010 is specifically configured to:
calculate red-green-blue (RGB) information of the second video frame based on a preset deep neural network model.
Since the RGB information of the second video frame is obtained through the neural network model, the color image of the second video frame after color transfer is closer to the actually captured color image, further improving image quality during video playback.
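The patent does not disclose the network's architecture; the toy model below is purely an assumed stand-in, taking the second frame's Y plane plus the reference frame's chroma (pre-upsampled to luma resolution) and predicting RGB:

```python
import torch
import torch.nn as nn

class ColorTransferNet(nn.Module):
    """Assumed stand-in for the preset deep neural network model:
    channel counts, depth, and activations are all illustrative."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),  # in: Y, U, V
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),  # out: R, G, B
            nn.Sigmoid(),                                # RGB in [0, 1]
        )

    def forward(self, y, u_ref, v_ref):
        # All inputs: (batch, 1, H, W) tensors at luma resolution.
        return self.net(torch.cat([y, u_ref, v_ref], dim=1))

# Shape check on a hypothetical 64x64 patch.
model = ColorTransferNet()
rgb = model(*(torch.rand(1, 1, 64, 64) for _ in range(3)))
print(rgb.shape)  # torch.Size([1, 3, 64, 64])
```

In practice such a model would be trained on pairs of Y-plus-reference inputs against ground-truth color frames; that training setup is an assumption, not a detail disclosed here.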
Optionally, N is greater than 1;
the processor 1010 is specifically configured to:
input the first YUV information and the Y information of the second video frame into the deep neural network model, and calculate the RGB information of the second video frame; or
input second YUV information and the Y information of the second video frame into the deep neural network model, and calculate the RGB information of the second video frame, where the second YUV information is the YUV information of the (N-1)th video frame captured after the first video frame, and the YUV information of the 1st such video frame is calculated based on the first YUV information and the Y information of that 1st video frame.
Thus, in the color transfer processing of a second video frame, color transfer can be performed directly using the YUV information of the first video frame; or, when at least one video frame lies between the second video frame and the first video frame, the YUV information of the immediately preceding video frame can be used for color transfer, making the color transfer processing more flexible and diversified.
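A sketch of the second, chained variant, where each Y-only frame borrows color from the previous frame's reconstruction; `transfer` stands in for the neural-network step and is an assumed callable:

```python
def propagate_color(first_yuv, y_frames, transfer):
    """Chained color transfer: frame 1 is colored from the first video
    frame, frame 2 from frame 1's reconstruction, and so on."""
    prev = first_yuv
    restored = []
    for y in y_frames:
        prev = transfer(prev, y)  # YUV reconstruction of this frame
        restored.append(prev)
    return restored
```

The direct variant simply passes `first_yuv` on every iteration instead of `prev`; the chained form can track gradual chroma drift at the cost of possible error accumulation.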
It should be understood that, in the embodiments of the present application, the input unit 1004 may include a graphics processing unit (GPU) 10041 and a microphone 10042; the graphics processing unit 10041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1006 may include a display panel 10061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1007 includes a touch panel 10071, also referred to as a touch screen, and other input devices 10072. The touch panel 10071 may include two parts: a touch detection device and a touch controller. The other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not detailed here. The memory 1009 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 1010 may integrate an application processor, which mainly handles the operating system, user interface, and applications, and a modem processor, which mainly handles wireless communication; it will be appreciated that the modem processor may alternatively not be integrated into the processor 1010.
An embodiment of the present application further provides a readable storage medium storing a program or instruction which, when executed by a processor, implements each process of the above video encoding method or video decoding method with the same technical effect; to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present application further provides a chip, which includes a processor and a communication interface coupled to the processor; the processor is configured to run a program or instruction to implement each process of the video encoding method or video decoding method embodiments with the same technical effect; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-on-chip, a chip system, or a system-on-a-chip.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus comprising that element. It should further be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed; depending on the functions involved, they may be performed substantially simultaneously or in reverse order. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, though in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present application may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disk), including instructions that enable a terminal (such as a mobile phone, computer, server, air conditioner, or network device) to execute the methods of the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A video encoding method, comprising:
acquiring a first video frame of a video to be processed;
performing luminance-chrominance YUV encoding on the first video frame;
acquiring a second video frame of the video to be processed, wherein the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer;
performing only luminance Y encoding on the second video frame.
2. The method of claim 1, wherein acquiring the first video frame of the video to be processed comprises:
collecting the first video frame in the process of recording the video to be processed, wherein the second video frame is acquired before or after the first video frame and the time interval between the second video frame and the first video frame is less than or equal to a preset time length.
3. The method according to claim 2, wherein the first video frame is the 1st video frame acquired in an Mth video frame acquisition period, and the electronic device acquires N+1 video frames in each video frame acquisition period;
the acquiring a second video frame of the video to be processed includes:
determining the (i+1)th video frame acquired in the Mth video frame acquisition period as a second video frame, wherein i is an integer less than or equal to N.
4. The method of claim 1, wherein acquiring the first video frame of the video to be processed comprises:
determining the kth video frame of the video to be processed as a first video frame when a chrominance difference between the kth video frame and the (k-1)th video frame of the video to be processed exceeds a preset chrominance range, wherein k is an integer greater than 1;
the acquiring a second video frame of the video to be processed includes:
determining the ith video frame of the video to be processed as a second video frame upon detecting that a chrominance difference between the ith video frame and the kth video frame is within the preset chrominance range and that no target video frame exists between the ith video frame and the kth video frame, wherein the target video frame is a video frame whose chrominance difference from the kth video frame exceeds the preset chrominance range, and i is an integer greater than k.
5. A video decoding method, comprising:
acquiring first video frame data and second video frame data of a video to be processed;
decoding the first video frame data in the video to be processed to obtain first YUV information of a first video frame, wherein the first video frame data is generated after YUV coding is performed on the first video frame;
decoding second video frame data in the video to be processed to obtain Y information of a second video frame, wherein the second video frame data is generated after Y coding is performed on the second video frame;
and performing color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame, to obtain a colored second video frame.
6. The method of claim 5, wherein performing the color transfer processing on the second video frame comprises:
calculating red-green-blue (RGB) information of the second video frame based on a preset deep neural network model.
7. The method of claim 6, wherein N is greater than 1;
and the calculating of the RGB information of the second video frame based on the preset deep neural network model comprises:
inputting the first YUV information and the Y information of the second video frame into the deep neural network model, and calculating the RGB information of the second video frame; or
inputting second YUV information and the Y information of the second video frame into the deep neural network model, and calculating the RGB information of the second video frame, wherein the second YUV information is the YUV information of the (N-1)th video frame captured after the first video frame, and the YUV information of the 1st such video frame is calculated based on the first YUV information and the Y information of that 1st video frame.
8. A video encoding apparatus, comprising:
a first video frame acquisition module, configured to acquire a first video frame of a video to be processed;
a YUV encoding module, configured to perform luminance-chrominance YUV encoding on the first video frame;
a second video frame acquisition module, configured to acquire a second video frame of the video to be processed, wherein the second video frame is any one of N video frames associated with the first video frame, and N is a positive integer; and
a Y encoding module, configured to perform only luminance Y encoding on the second video frame.
9. A video decoding apparatus, comprising:
a video frame data acquisition module, configured to acquire first video frame data and second video frame data of a video to be processed;
a first decoding module, configured to decode the first video frame data in the video to be processed to obtain first YUV information of a first video frame, wherein the first video frame data is data generated by YUV-encoding the first video frame;
a second decoding module, configured to decode the second video frame data in the video to be processed to obtain Y information of a second video frame, wherein the second video frame data is data generated by Y-encoding the second video frame; and
a color transfer module, configured to perform color transfer processing on the second video frame based on the first YUV information and the Y information of the second video frame, to obtain a colored second video frame.
10. An electronic device, comprising a processor, a memory, and a program or instruction stored in the memory and executable on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the video encoding method of any one of claims 1 to 4 or of the video decoding method of any one of claims 5 to 7.