CN113422961A

CN113422961A - Video encoding method, video decoding method, encoder, and decoder

Info

Publication number: CN113422961A
Application number: CN202110828115.XA
Authority: CN
Inventors: 谢亚光; 李日; 朱建国; 陈勇; 廖义
Original assignee: Hangzhou Arcvideo Technology Co ltd
Current assignee: Hangzhou Arcvideo Technology Co ltd
Priority date: 2021-07-21
Filing date: 2021-07-21
Publication date: 2021-09-21

Abstract

The invention discloses a video encoding method, a video decoding method, an encoder and a decoder. The video encoding method comprises the following steps: decomposing each 4:2:2 video frame to be encoded into a 4:2:0 sampled frame and two separate chrominance matrices, Cb and Cr; encoding the 4:2:0 sampled frame in the normal 4:2:0 format of AVS3 or AVS2 to obtain an AVS3 or AVS2 compressed frame; compressing the separate chrominance matrices with a video standard that supports the 4:0:0 format to obtain a compressed frame in 4:0:0 format; inserting the 4:0:0 compressed frame as user data into the frame header of the AVS3 or AVS2 compressed frame; and processing the video frame by frame to finally obtain the complete encoded video.

Description

Video encoding method, video decoding method, encoder, and decoder

Technical Field

The present invention relates to the field of video coding technology, and in particular to a video encoding method, a video decoding method, an encoder, and a decoder.

Background

A color video frame is typically divided into three component matrices: a luminance component Y and two chrominance components, denoted Cb and Cr. Although other representations such as R, G, B are possible, video coding is typically based on Y, Cb and Cr. The Y component generally carries more texture information, while Cb and Cr generally carry less, so in video coding the chrominance components are usually subsampled to reduce the amount of data to be encoded.

If the chrominance components are dropped and only the luminance component Y is kept, the result is a grayscale image, i.e. the 4:0:0 format, as shown in FIG. 1. In the 4:2:0 format, the Cb and Cr matrices are half the size of the Y matrix in both the horizontal and vertical directions, as shown in FIG. 2. In the 4:2:2 format, the Cb and Cr matrices are half the size of the Y matrix in the horizontal direction only and the same size as Y in the vertical direction, as shown in FIG. 3. The format without chroma subsampling is 4:4:4, in which the Cb and Cr matrices are the same size as the Y matrix both horizontally and vertically, as shown in FIG. 4.
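The relationship between the luminance resolution and the chrominance resolution in the four formats can be summarized in a short sketch. The function name and the string labels are illustrative only and do not come from the source; even Width and Height are assumed.

```python
# Chroma plane dimensions for the sampling formats described above.
# chroma_plane_size and the string keys are hypothetical names for illustration.

def chroma_plane_size(width, height, fmt):
    """Return (chroma_width, chroma_height) for each of Cb and Cr, or None for 4:0:0."""
    if fmt == "4:0:0":
        return None                       # luma only, no chroma planes
    if fmt == "4:2:0":
        return (width // 2, height // 2)  # halved in both directions
    if fmt == "4:2:2":
        return (width // 2, height)       # halved horizontally only
    if fmt == "4:4:4":
        return (width, height)            # same size as the Y matrix
    raise ValueError(f"unknown chroma format: {fmt}")


for fmt in ("4:0:0", "4:2:0", "4:2:2", "4:4:4"):
    print(fmt, chroma_plane_size(3840, 2160, fmt))
```

For a 3840x2160 luminance plane, a 4:2:2 frame therefore carries twice as many chrominance samples as a 4:2:0 frame, which is the extra data the method below has to transport.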

Human vision is relatively insensitive to chrominance detail. Averaging the Cb and Cr signals of several adjacent pixels (chroma downsampling) before compression therefore reduces the bit rate effectively at relatively little visual cost, and also lowers the computing power required for encoding and decoding. In past years, when transmission bandwidth and computing power were bottlenecks, this approach was widely used in video applications such as cable digital television broadcasting, IPTV, consumer-grade digital video shooting and sharing, low-bandwidth digital video streaming, and digital video surveillance. In the main application fields, such as internet streaming video and cable television broadcast control, video is generally sampled as 4:2:0 to save transmission bandwidth. For this reason, AVS3/AVS2, the standards with independent intellectual property rights established by China, currently support only 4:2:0 encoding and do not support the 4:2:2 format.

However, with the spread of high-bandwidth communication technologies such as 5G and large gains in computing power, 4K and 8K HDR ultra-high-definition video is becoming increasingly common, expectations for color detail keep rising, and demand for 4:2:2 video keeps growing. 4:2:2 is used, or will be used, in scenarios including but not limited to the following:

(1) Video production, compositing and content exchange within television stations. Excessive chroma downsampling reduces color accuracy and degrades the quality of subsequent picture processing. In professional video production, editors find it difficult to avoid the visible edge defects caused by chroma downsampling in 4:2:0 images. For this reason, the video formats and processing software widely adopted for in-station production are based on at least 4:2:2 sampling.

At present, shooting, studio work, non-linear editing, channel playout and other stages within television stations all use 4:2:2. The "Implementation Guide for 4K Ultra-High-Definition Television Program Production Technology" issued by the National Radio and Television Administration states that the sampling format of finished 4K ultra-high-definition programs should be 4:2:2, and that the sampling format of source material should be 4:2:2 or higher (such as 4:4:4). The interim technical requirements of China Media Group for 8K ultra-high-definition television program production and broadcasting specify 4:2:2 as the sampling format in the basic video and audio technical parameters.

The Apple ProRes format family and the Final Cut Pro X editing system use formats of 4:2:2 or higher. Other non-linear editing systems and formats are similar, such as the Avid Media Composer editing software and the Avid DNxHD/DNxHR formats.

Professional cameras used for professional video shooting widely adopt 4:2:2 or 4:4:4 to preserve color fidelity, for example the AVCIntra/AVCUltra formats recorded by Panasonic high-end video cameras. The same is true of Sony XAVC/XAVC-S, which is widely adopted in the professional market in the 4:2:2 or even 4:4:4 format.

(2) Large-scale events, concerts and live internet broadcasts. 4K and even 8K ultra-high-definition HDR is now widely used for major international and domestic concerts and important sports events, which are broadcast live over the internet. For example, for live broadcasts of the CBA, the World Cup and the European Championship currently carried by Migu, the sources are mainly 4:2:2, with some in 4:4:4. Demand for video quality keeps rising, and HDR video implies higher color fidelity in addition to a wide luminance range, high frame rate and high bit depth. Such live broadcasts are becoming more common as household 4K/8K televisions and high-bandwidth internet access spread.

(3) The film industry uses the 4:4:4 format for digital film masters and releases. The digital cinema specification established by DCI (Digital Cinema Initiatives) requires release copies to use the 4:4:4 sampling format with a video quantization depth of no less than 12 bits. For digital film releases to on-demand cinemas, the sampling format is no lower than 4:2:0 and the quantization depth no lower than 8 bits.

(4) International video standards widely support the 4:2:2 and 4:4:4 compression formats. HEVC, VVC, Sony XAVC, Apple ProRes, DNxHR and others currently all support 4:2:2 and 4:4:4, and therefore cover more application scenarios.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a video encoding method, a video decoding method, an encoder and a decoder that extend AVS3 (or AVS2) to 4:2:2 encoding while remaining backward compatible.

In order to solve the technical problems, the invention adopts the following technical scheme:

A first aspect of an embodiment of the present invention provides a video encoding method, including the following steps:

each 4:2:2 video frame to be encoded is decomposed into a 4:2:0 sampled frame and two separate chrominance matrices, Cb and Cr;

the 4:2:0 sampled frame is encoded in the normal 4:2:0 format of AVS3 or AVS2 to obtain an AVS3 or AVS2 compressed frame; the separate chrominance matrices are compressed with a video standard that supports the 4:0:0 format to obtain a compressed frame in 4:0:0 format; the 4:0:0 compressed frame is then inserted as user data into the frame header of the AVS3 or AVS2 compressed frame; and the video is processed frame by frame to finally obtain the complete encoded video.

Preferably, decomposing each 4:2:2 video frame to be encoded into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices further comprises:

Let the pixel bit depth of the 4:2:2 video frame to be encoded be Depth, the horizontal resolution of the luminance component be Width and its vertical resolution be Height; the chrominance matrices Cb and Cr then have a horizontal resolution of Width/2 and a vertical resolution of Height. The whole luminance matrix, the odd-numbered rows of Cb and the odd-numbered rows of Cr together form a new 4:2:0 video frame, denoted F1, whose Y matrix is denoted L1, Cb matrix U1 and Cr matrix V1. The even-numbered rows of Cb and Cr are extracted separately to form two separate chrominance matrices, the Cb matrix denoted U2 and the Cr matrix denoted V2; the horizontal and vertical resolutions of U2 and V2 are Width/2 and Height/2 respectively.
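The decomposition can be sketched as follows. This is a minimal illustration, not the patented implementation: planes are plain lists of rows indexed as plane[row][column], the function name split_422 is hypothetical, and rows are assumed to be counted from 1, so that the "odd-numbered rows" are the 0-based indices 0, 2, 4, and so on.

```python
# Sketch of splitting a 4:2:2 frame into a 4:2:0 frame F1 and the planes U2, V2.
# Assumption: 1-based row numbering, i.e. odd-numbered rows = 0-based rows 0, 2, 4, ...

def split_422(y, cb, cr):
    """Return ((L1, U1, V1), U2, V2) for a 4:2:2 frame given as lists of rows."""
    if len(cb) != len(y) or len(cr) != len(y) or len(y) % 2 != 0:
        raise ValueError("expected full-height chroma and an even number of rows")
    l1 = [row[:] for row in y]          # the whole luminance matrix
    u1 = [row[:] for row in cb[0::2]]   # odd-numbered Cb rows -> Cb of F1
    v1 = [row[:] for row in cr[0::2]]   # odd-numbered Cr rows -> Cr of F1
    u2 = [row[:] for row in cb[1::2]]   # even-numbered Cb rows -> separate plane U2
    v2 = [row[:] for row in cr[1::2]]   # even-numbered Cr rows -> separate plane V2
    return (l1, u1, v1), u2, v2


# Tiny example: Width = 4, Height = 4, so Cb and Cr are 2 columns by 4 rows.
y = [[16 * r + c for c in range(4)] for r in range(4)]
cb = [[10, 11], [20, 21], [30, 31], [40, 41]]
cr = [[50, 51], [60, 61], [70, 71], [80, 81]]

(l1, u1, v1), u2, v2 = split_422(y, cb, cr)
print(u1)  # [[10, 11], [30, 31]]
print(u2)  # [[20, 21], [40, 41]]
```

U1, V1, U2 and V2 each end up with Width/2 columns and Height/2 rows, matching the resolutions stated in the text.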

Preferably, the method further comprises: for the separate Cb matrix U2 and Cr matrix V2, the pixel value at each position is subtracted from the corresponding value of U1 or V1, 2^(Depth-1) is added, and the result is saturated to the range [0, 2^Depth - 1]; U2 and V2 are then merged into a new chrominance matrix.

Preferably, subtracting the pixel value at each position of the separate Cb matrix U2 and Cr matrix V2 from the corresponding value of U1 or V1, adding 2^(Depth-1), saturating the result to the range [0, 2^Depth - 1], and then merging U2 and V2 into a new chrominance matrix further comprises:

updating the values of the U2 and V2 matrices, that is, for each U2(i, j) and V2(i, j), where i ranges from 0 to Width/2 - 1 and j ranges from 0 to Height/2 - 1:

step one, the difference is calculated:

U2(i, j) = U1(i, j) - U2(i, j) + 2^(Depth-1);

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

step two, saturation operation:

if U2(i, j) < 0, let U2(i, j) = 0; if U2(i, j) > 2^Depth - 1, let U2(i, j) = 2^Depth - 1; otherwise leave U2(i, j) unchanged; if V2(i, j) < 0, let V2(i, j) = 0; if V2(i, j) > 2^Depth - 1, let V2(i, j) = 2^Depth - 1; otherwise leave V2(i, j) unchanged;

step three, U2 and V2 are merged row by row into a new chrominance matrix with a horizontal resolution of Width and a vertical resolution of Height/2, denoted UV1; for each pixel value UV1(k, l), where k ranges from 0 to Width - 1 and l ranges from 0 to Height/2 - 1:

if k < Width/2, UV1(k, l) = U2(k, l); otherwise UV1(k, l) = V2(k - Width/2, l).
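The three steps can be sketched in a few lines. This is an illustrative sketch under the reading that the added offset is 2^(Depth-1) and the saturation range is [0, 2^Depth - 1]; the function names are hypothetical, and planes are lists of rows indexed as plane[row][column], so the text's U2(i, j) corresponds to u2[j][i].

```python
# Sketch of the offset difference, saturation, and side-by-side merge into UV1.
# All function names are hypothetical; the formulas follow the three steps above.

def clip(value, depth):
    """Saturate value to the range [0, 2**depth - 1]."""
    return max(0, min(value, (1 << depth) - 1))


def residual_plane(base, extra, depth):
    """Steps one and two: base - extra + 2**(depth - 1), saturated per sample."""
    offset = 1 << (depth - 1)
    return [[clip(b - e + offset, depth) for b, e in zip(base_row, extra_row)]
            for base_row, extra_row in zip(base, extra)]


def merge_side_by_side(u2, v2):
    """Step three: each UV1 row is the U2 row followed by the V2 row."""
    return [u_row + v_row for u_row, v_row in zip(u2, v2)]


depth = 8
u1, u2 = [[100, 0]], [[90, 200]]
v1, v2 = [[255, 50]], [[0, 50]]

u2 = residual_plane(u1, u2, depth)   # 100-90+128 = 138; 0-200+128 = -72 -> 0
v2 = residual_plane(v1, v2, depth)   # 255-0+128 = 383 -> 255; 50-50+128 = 128
uv1 = merge_side_by_side(u2, v2)
print(uv1)  # [[138, 0, 255, 128]]
```

When neighbouring chroma rows are similar, the stored values cluster around 2^(Depth-1), which is what lets UV1 be treated as an ordinary grayscale picture. The second and third samples in the example show that saturation discards information whenever the two rows differ by more than about half the pixel range.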

Preferably, F1 is encoded in the normal 4:2:0 format of AVS3 (or AVS2) to obtain a compressed frame CF1; the values in UV1 are treated as ordinary grayscale pixels and compressed with any other video coding standard that supports the 4:0:0 format to obtain a compressed frame CF2; the bitstream of CF2 is then inserted into CF1 as user data to form the final compressed frame data CF3.

A second aspect of the embodiments of the present invention provides an encoder for performing the above-mentioned video encoding method.

A third aspect of an embodiment of the present invention provides a video decoding method, including the following steps: when a compressed frame is received, the user data is extracted and the compressed frame is decomposed into a standard 4:2:0 AVS3 (or AVS2) compressed frame DF1 and an additional 4:0:0 compressed frame DF2, which are then decoded separately; DF1 is decoded by a standard AVS3 (or AVS2) decoder to obtain a 4:2:0 video frame denoted DF3, whose luminance component is denoted L2, with horizontal resolution Width and vertical resolution Height, and whose Cb and Cr component matrices are denoted U3 and V3, each with horizontal resolution Width/2 and vertical resolution Height/2; DF2 is decoded by another, independent decoder of the corresponding format to obtain DF4, which has horizontal resolution Width and vertical resolution Height/2 and is divided into left and right halves, the left half being extracted as U4 and the right half as V4,

wherein the compressed frame is obtained as follows: each 4:2:2 video frame to be encoded is decomposed into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices, specifically: let the pixel bit depth of the 4:2:2 video frame to be encoded be Depth, the horizontal resolution of the luminance component be Width and its vertical resolution be Height, and the horizontal resolution of the chrominance matrices (Cb, Cr) be Width/2 and their vertical resolution be Height; the whole luminance matrix, the odd-numbered rows of Cb and the odd-numbered rows of Cr together form a new 4:2:0 video frame, denoted F1, whose Y matrix is denoted L1, Cb matrix U1 and Cr matrix V1; the even-numbered rows of Cb and Cr are extracted separately to form two separate chrominance matrices, the Cb matrix denoted U2 and the Cr matrix denoted V2, whose horizontal and vertical resolutions are Width/2 and Height/2 respectively;

the 4:2:0 sampled frame is encoded in the normal 4:2:0 format of AVS3 or AVS2 to obtain an AVS3 or AVS2 compressed frame; the separate chrominance matrices are compressed with a video standard that supports the 4:0:0 format to obtain a compressed frame in 4:0:0 format; the 4:0:0 compressed frame is then inserted as user data into the frame header of the AVS3 (or AVS2) compressed frame; and the video is processed frame by frame to finally obtain the complete encoded video.

Preferably, the encoding further comprises: for the separate Cb matrix U2 and Cr matrix V2, the pixel value at each position is subtracted from the corresponding value of U1 or V1, 2^(Depth-1) is added, and the result is saturated to the range [0, 2^Depth - 1], updating the values of the U2 and V2 matrices, namely:

for each i, where i ranges from 0 to Width/2 - 1, and each j, where j ranges from 0 to Height/2 - 1,

step one, the difference is calculated: U2(i, j) = U1(i, j) - U2(i, j) + 2^(Depth-1);

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

step two, saturation operation: if U2(i, j) < 0, let U2(i, j) = 0; if U2(i, j) > 2^Depth - 1, let U2(i, j) = 2^Depth - 1; otherwise leave U2(i, j) unchanged;

if k < Width/2, UV1(k, l) = U2(k, l); otherwise UV1(k, l) = V2(k - Width/2, l);

then, when decoding, the following processing is performed with U3 and V3:

step one, for all x (x from 0 to Width/2 - 1) and y (y from 0 to Height/2 - 1),

U4(x, y) = U3(x, y) - U4(x, y) + 2^(Depth-1);

step two, saturation operation: if U4(x, y) < 0, let U4(x, y) = 0; if U4(x, y) > 2^Depth - 1, let U4(x, y) = 2^Depth - 1; otherwise leave U4(x, y) unchanged.
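The decoder-side split of DF4 and the inverse of the offset difference can be sketched as below. The sketch assumes the offset is 2^(Depth-1) and the saturation range is [0, 2^Depth - 1], applies the same formula to V4 using V3 (the text writes it out only for U4), and uses hypothetical function names; planes are lists of rows.

```python
# Sketch of splitting the decoded 4:0:0 picture DF4 and restoring the chroma rows.
# Function names are hypothetical; restoring V4 from V3 mirrors the U4 formula
# and is an assumption, since the text spells out only the Cb case.

def split_left_right(df4):
    """Left half of each row is U4, right half is V4."""
    half = len(df4[0]) // 2
    u4 = [row[:half] for row in df4]
    v4 = [row[half:] for row in df4]
    return u4, v4


def restore_plane(base, residual, depth):
    """base - residual + 2**(depth - 1), saturated to [0, 2**depth - 1]."""
    offset = 1 << (depth - 1)
    limit = (1 << depth) - 1
    return [[max(0, min(b - r + offset, limit)) for b, r in zip(b_row, r_row)]
            for b_row, r_row in zip(base, residual)]


depth = 8
df4 = [[138, 0, 255, 128]]           # decoded wide chroma matrix (Width = 4)
u3, v3 = [[100, 0]], [[255, 50]]     # chroma planes of the decoded 4:2:0 frame

u4, v4 = split_left_right(df4)
u4 = restore_plane(u3, u4, depth)    # 100-138+128 = 90; 0-0+128 = 128
v4 = restore_plane(v3, v4, depth)    # 255-255+128 = 128; 50-128+128 = 50
print(u4, v4)  # [[90, 128]] [[128, 50]]
```

Because the encoder stored U1 - U2 + 2^(Depth-1), applying the same expression again returns the original sample (here 90 and 50) as long as the stored value was not saturated. A stored value of 0 or 2^Depth - 1 may have been clipped, in which case the restored sample is only an approximation, and any loss in the two codecs adds further error.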

Preferably, further comprising:

U3 and U4 are combined into a new Cb matrix, denoted U5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from U3 and the even-numbered rows in order from U4; V3 and V4 are combined into a new Cr matrix, denoted V5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from V3 and the even-numbered rows in order from V4; L2, U5 and V5 are merged into a final 4:2:2 video frame, and all frames are processed in sequence to complete the decoding of the entire video.
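The row interleaving that rebuilds the full-height chrominance planes can be sketched as follows. The function names are hypothetical, planes are lists of rows, and rows are assumed to be counted from 1, so that odd-numbered rows land on the 0-based indices 0, 2, 4, and so on.

```python
# Sketch of rebuilding U5 and V5 by interleaving rows, then assembling the frame.
# Assumption: 1-based row numbering, i.e. odd-numbered rows = 0-based rows 0, 2, 4, ...

def interleave_rows(odd_rows, even_rows):
    """Alternate rows: first from odd_rows, then from even_rows, and so on."""
    if len(odd_rows) != len(even_rows):
        raise ValueError("both planes must have the same number of rows")
    merged = []
    for odd_row, even_row in zip(odd_rows, even_rows):
        merged.append(odd_row[:])
        merged.append(even_row[:])
    return merged


def assemble_422(l2, u3, v3, u4, v4):
    """Return the 4:2:2 frame as the tuple (L2, U5, V5)."""
    return l2, interleave_rows(u3, u4), interleave_rows(v3, v4)


l2 = [[16 * r + c for c in range(4)] for r in range(4)]
u3, u4 = [[10, 11], [30, 31]], [[20, 21], [40, 41]]
v3, v4 = [[50, 51], [70, 71]], [[60, 61], [80, 81]]

_, u5, v5 = assemble_422(l2, u3, v3, u4, v4)
print(u5)  # [[10, 11], [20, 21], [30, 31], [40, 41]]
```

U5 and V5 have Width/2 columns and Height rows, which together with L2 is exactly the 4:2:2 plane layout.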

A fourth aspect of the embodiments of the present invention provides a decoder for performing the above-mentioned video decoding method.

The invention has the following beneficial effects:

(1) The AVS3 (or AVS2) standard can be applied in fields that must use 4:2:2 chroma sampling, taking full advantage of the high compression efficiency of AVS3 to reduce storage and transmission pressure while supporting 4:2:2 chroma sampling and preserving high chroma resolution.

(2) Even if the decoding end uses a standard AVS3 (or AVS2) decoder, a 4:2:0 picture can still be obtained; compared with 4:2:2, only some chroma resolution is lost, and the overall picture quality is not otherwise affected.

(3) With the technical solution of the embodiment of the invention, the application fields of AVS3 (or AVS2), such as non-linear editing and content exchange in television stations and ultra-high-definition HDR live broadcasting of large-scale events and concerts, can be greatly expanded at a small cost in computation, storage or transmission bandwidth.

(4) According to experimental statistics, extending the AVS3 (or AVS2) 4:2:0 format to the 4:2:2 format with the technical solution of the embodiment of the invention adds no more than 10% computational complexity and consumes no more than 5% additional storage bandwidth.

Drawings

FIG. 1 is a diagram of luma sample positions in the 4:0:0 format;

FIG. 2 is a diagram of luma and chroma sample positions in the 4:2:0 format;

FIG. 3 is a diagram of luma and chroma sample positions in the 4:2:2 format;

FIG. 4 is a diagram of luma and chroma sample positions in the 4:4:4 format;

FIG. 5 is a flowchart illustrating steps of a video encoding method according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Coding example 1

Referring to fig. 5, a video encoding method according to an embodiment of the present invention is shown, including the following steps:

In a specific application example, decomposing each 4:2:2 video frame to be encoded into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices further includes:

Coding example 2

On the basis of encoding embodiment 1, a video encoding method according to an embodiment of the present invention further includes: for the separate Cb matrix U2 and Cr matrix V2, the pixel value at each position is subtracted from the corresponding value of U1 or V1, 2^(Depth-1) is added, and the result is saturated to the range [0, 2^Depth - 1], updating the values of the U2 and V2 matrices; that is, for each U2(i, j) and V2(i, j), where i ranges from 0 to Width/2 - 1 and j ranges from 0 to Height/2 - 1:

step one, the difference is calculated:

U2(i, j) = U1(i, j) - U2(i, j) + 2^(Depth-1);

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

step two, saturation operation:

if U2(i, j) < 0, let U2(i, j) = 0; if U2(i, j) > 2^Depth - 1, let U2(i, j) = 2^Depth - 1; otherwise leave U2(i, j) unchanged; if V2(i, j) < 0, let V2(i, j) = 0; if V2(i, j) > 2^Depth - 1, let V2(i, j) = 2^Depth - 1; otherwise leave V2(i, j) unchanged;

In a specific application example, F1 is further encoded in the normal 4:2:0 format of AVS3 (or AVS2) to obtain a compressed frame CF1; the values in UV1 are treated as ordinary grayscale pixels and compressed with any other video coding standard that supports the 4:0:0 format to obtain a compressed frame CF2; the bitstream of CF2 is then inserted into CF1 as user data to form the final compressed frame data CF3.

With the video encoding method provided by the embodiment of the invention, the encoder decomposes each 4:2:2 video frame to be encoded into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices. Each separate matrix is then subtracted from the corresponding Cb or Cr matrix of the 4:2:0 frame, half of the pixel value range, 2^(Depth-1), is added to the difference, and the result is saturated to the range from 0 to the maximum pixel value; the two resulting matrices are then spliced horizontally into one wider chrominance matrix. The 4:2:0 frame is encoded in the normal 4:2:0 format of AVS3 (or AVS2) to obtain an AVS3 (or AVS2) compressed frame; the wider chrominance matrix is compressed with a video standard that supports the 4:0:0 format (e.g., H.264), and the resulting compressed frame is inserted as user data into the frame header of the AVS3 (or AVS2) compressed frame. Processing frame by frame finally yields the complete encoded video. With this technical solution, AVS3 encoding can be extended from 4:2:0 to the 4:2:2 format: uncompressed baseband signals with 4:2:2 chroma sampling, or other non-AVS3 video formats, can be converted into AVS3 with 4:2:2 chroma sampling, and a decoding library can be provided to the non-linear editing tools used for in-station production so that the AVS3 signal is restored to an uncompressed format such as YUV with 4:2:2 chroma sampling. In other words, the technical solution of the above embodiment allows the AVS3 (or AVS2) standard to be applied in fields that must use 4:2:2 chroma sampling, such as in-station production, taking full advantage of the high compression efficiency of AVS3 to reduce storage and transmission pressure while supporting 4:2:2 chroma sampling and preserving high chroma resolution.

The embodiment of the invention also provides an encoder which is implemented by adopting the video encoding method described above.

Decoding example 1

Corresponding to the encoding embodiment of the present invention, an embodiment of the present invention provides a video decoding method, including the following steps: when a compressed frame is received, the user data is extracted and the compressed frame is decomposed into a standard 4:2:0 AVS3 (or AVS2) compressed frame DF1 and an additional 4:0:0 compressed frame DF2, which are then decoded separately; DF1 is decoded by a standard AVS3 (or AVS2) decoder to obtain a 4:2:0 video frame denoted DF3, whose luminance component is denoted L2, with horizontal resolution Width and vertical resolution Height, and whose Cb and Cr component matrices are denoted U3 and V3, each with horizontal resolution Width/2 and vertical resolution Height/2; DF2 is decoded by another, independent decoder of the corresponding format to obtain DF4, which has horizontal resolution Width and vertical resolution Height/2 and is divided into left and right halves, the left half being extracted as U4 and the right half as V4,

wherein the compressed frame is obtained as follows: each 4:2:2 video frame to be encoded is decomposed into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices, specifically: let the pixel bit depth of the 4:2:2 video frame to be encoded be Depth, the horizontal resolution of the luminance component be Width and its vertical resolution be Height, and the horizontal resolution of the chrominance matrices (Cb, Cr) be Width/2 and their vertical resolution be Height; the whole luminance matrix, the odd-numbered rows of Cb and the odd-numbered rows of Cr together form a new 4:2:0 video frame, denoted F1, whose Y matrix is denoted L1, Cb matrix U1 and Cr matrix V1; the even-numbered rows of Cb and Cr are extracted separately to form two separate chrominance matrices, the Cb matrix denoted U2 and the Cr matrix denoted V2, whose horizontal and vertical resolutions are Width/2 and Height/2 respectively;

In an embodiment, the encoding further includes: for the separate Cb matrix U2 and Cr matrix V2, the pixel value at each position is subtracted from the corresponding value of U1 or V1, 2^(Depth-1) is added, and the result is saturated to the range [0, 2^Depth - 1], updating the values of the U2 and V2 matrices, namely:

for each i, where i ranges from 0 to Width/2 - 1, and each j, where j ranges from 0 to Height/2 - 1,

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

step three, U2 and V2 are merged row by row into a new chrominance matrix with a horizontal resolution of Width and a vertical resolution of Height/2, denoted UV1; for each pixel value UV1(k, l), k ranges from 0 to Width - 1 and l ranges from 0 to Height/2 - 1;

then, when decoding, the following processing is performed with U3 and V3:

U4(x, y) = U3(x, y) - U4(x, y) + 2^(Depth-1);

In a specific application example, the method further comprises the following steps: U3 and U4 are combined into a new Cb matrix, denoted U5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from U3 and the even-numbered rows in order from U4; V3 and V4 are combined into a new Cr matrix, denoted V5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from V3 and the even-numbered rows in order from V4; L2, U5 and V5 are merged into a final 4:2:2 video frame, and all frames are processed in sequence to complete the decoding of the entire video.

With the video decoding method of the above embodiment, if a standard AVS3 (or AVS2) decoder is used at the decoding end, a 4:2:0 image is decoded, which shows that the extension is backward compatible. If a custom decoder is used, i.e. the decoding method of the embodiment of the present invention, a standard AVS3 (or AVS2) decoder first produces the 4:2:0 frame; the compressed frame in the user data is then extracted and decoded by a decoder of the corresponding encoding format (e.g., H.264) to obtain the wide chrominance matrix; this matrix is split horizontally into two matrices, Cb and Cr, which are combined with the Cb and Cr matrices of the 4:2:0 frame to form the final decoded Cb and Cr matrices, and a 4:2:2 frame is finally assembled. Processing frame by frame finally yields the 4:2:2 decoded video sequence.

The embodiment of the invention also provides a decoder which is implemented by adopting the video decoding method described above.

It is to be understood that the exemplary embodiments described herein are illustrative and not restrictive. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims

1. A video encoding method, comprising the steps of:

decomposing each 4:2:2 video frame to be encoded into a 4:2:0 sample frame and two separate Cb, Cr chrominance matrices;

encoding the 4:2:0 sampled frame in the normal 4:2:0 format of AVS3 or AVS2 to obtain an AVS3 or AVS2 compressed frame, compressing the separate chrominance matrices with a video standard that supports the 4:0:0 format to obtain a compressed frame in 4:0:0 format, inserting the 4:0:0 compressed frame as user data into the frame header of the AVS3 or AVS2 compressed frame, and processing the video frame by frame to finally obtain the complete encoded video.

2. The video encoding method of claim 1, wherein decomposing each 4:2:2 video frame to be encoded into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices further comprises:

letting the pixel bit depth of the 4:2:2 video frame to be encoded be Depth, the horizontal resolution of the luminance component be Width and its vertical resolution be Height, and the horizontal resolution of the chrominance matrices Cb and Cr be Width/2 and their vertical resolution be Height; forming a new 4:2:0 video frame, denoted F1, from the whole luminance matrix, the odd-numbered rows of Cb and the odd-numbered rows of Cr, the Y matrix of F1 being denoted L1, its Cb matrix U1 and its Cr matrix V1; and extracting the even-numbered rows of Cb and Cr separately to form two separate chrominance matrices, the Cb matrix denoted U2 and the Cr matrix denoted V2, the horizontal and vertical resolutions of U2 and V2 being Width/2 and Height/2 respectively.

3. The video encoding method of claim 2, further comprising: for the separate Cb matrix U2 and Cr matrix V2, subtracting the pixel value at each position from the corresponding value of U1 or V1, adding 2^(Depth-1), saturating the result to the range [0, 2^Depth - 1], and then merging U2 and V2 into a new chrominance matrix.

4. The video encoding method of claim 3, wherein subtracting the pixel value at each position of the separate Cb matrix U2 and Cr matrix V2 from the corresponding value of U1 or V1, adding 2^(Depth-1), saturating the result to the range [0, 2^Depth - 1], and then merging U2 and V2 into a new chrominance matrix further comprises:

step one, the difference is calculated:

U2(i, j) = U1(i, j) - U2(i, j) + 2^(Depth-1);

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

step two, saturation operation:

if U2(i, j) < 0, let U2(i, j) = 0; if U2(i, j) > 2^Depth - 1, let U2(i, j) = 2^Depth - 1; otherwise leave U2(i, j) unchanged; if V2(i, j) < 0, let V2(i, j) = 0; if V2(i, j) > 2^Depth - 1, let V2(i, j) = 2^Depth - 1; otherwise leave V2(i, j) unchanged;

step three, U2 and V2 are merged row by row into a new chrominance matrix with a horizontal resolution of Width and a vertical resolution of Height/2, denoted UV1; for each pixel value UV1(k, l), where k ranges from 0 to Width - 1 and l ranges from 0 to Height/2 - 1:

if k < Width/2, UV1(k, l) = U2(k, l); otherwise UV1(k, l) = V2(k - Width/2, l).

5. The video encoding method of claim 4, wherein F1 is encoded in the normal 4:2:0 format of AVS3 (or AVS2) to obtain a compressed frame CF1, the values in UV1 are treated as ordinary grayscale pixels and compressed with any other video coding standard that supports the 4:0:0 format to obtain a compressed frame CF2, and the bitstream of CF2 is then inserted into CF1 as user data to form the final compressed frame data CF3.

6. An encoder for performing the video encoding method of any of claims 1 to 5.

7. A video decoding method, comprising the steps of: when a compressed frame is received, extracting the user data and decomposing the compressed frame into a standard 4:2:0 AVS3 (or AVS2) compressed frame DF1 and an additional 4:0:0 compressed frame DF2, which are then decoded separately; DF1 is decoded by a standard AVS3 (or AVS2) decoder to obtain a 4:2:0 video frame denoted DF3, whose luminance component is denoted L2, with horizontal resolution Width and vertical resolution Height, and whose Cb and Cr component matrices are denoted U3 and V3, each with horizontal resolution Width/2 and vertical resolution Height/2; DF2 is decoded by another, independent decoder of the corresponding format to obtain DF4, which has horizontal resolution Width and vertical resolution Height/2 and is divided into left and right halves, the left half being extracted as U4 and the right half as V4,

wherein the compressed frame is obtained as follows: each 4:2:2 video frame to be encoded is decomposed into a 4:2:0 sampled frame and two separate Cb and Cr chrominance matrices, specifically: let the pixel bit depth of the 4:2:2 video frame to be encoded be Depth, the horizontal resolution of the luminance component be Width and its vertical resolution be Height, and the horizontal resolution of the chrominance matrices (Cb, Cr) be Width/2 and their vertical resolution be Height; the whole luminance matrix, the odd-numbered rows of Cb and the odd-numbered rows of Cr together form a new 4:2:0 video frame, denoted F1, whose Y matrix is denoted L1, Cb matrix U1 and Cr matrix V1; the even-numbered rows of Cb and Cr are extracted separately to form two separate chrominance matrices, the Cb matrix denoted U2 and the Cr matrix denoted V2, whose horizontal and vertical resolutions are Width/2 and Height/2 respectively;

the 4:2:0 sampled frame is encoded in the normal 4:2:0 format of AVS3 or AVS2 to obtain an AVS3 or AVS2 compressed frame; the separate chrominance matrices are compressed with a video standard that supports the 4:0:0 format to obtain a compressed frame in 4:0:0 format; the 4:0:0 compressed frame is then inserted as user data into the frame header of the AVS3 (or AVS2) compressed frame; and the video is processed frame by frame to finally obtain the complete encoded video.

8. The video decoding method of claim 7, wherein the encoding further comprises: for the separate Cb matrix U2 and Cr matrix V2, the pixel value at each position is subtracted from the corresponding value of U1 or V1, 2^(Depth-1) is added, and the result is saturated to the range [0, 2^Depth - 1], updating the values of the U2 and V2 matrices, namely:

for each i, where i ranges from 0 to Width/2 - 1, and each j, where j ranges from 0 to Height/2 - 1,

V2(i, j) = V1(i, j) - V2(i, j) + 2^(Depth-1);

if k < Width/2, UV1(k, l) = U2(k, l); otherwise UV1(k, l) = V2(k - Width/2, l);

then, when decoding, the following processing is performed with U3 and V3:

U4(x, y) = U3(x, y) - U4(x, y) + 2^(Depth-1);

9. The video decoding method of claim 8, further comprising:

combining U3 and U4 into a new Cb matrix, denoted U5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from U3 and the even-numbered rows in order from U4; combining V3 and V4 into a new Cr matrix, denoted V5, with horizontal resolution Width/2 and vertical resolution Height, in which the odd-numbered rows are copied in order from V3 and the even-numbered rows in order from V4; and merging L2, U5 and V5 into a final 4:2:2 video frame, all frames being processed in sequence to complete the decoding of the entire video.

10. A decoder for performing the video decoding method of any of claims 7 to 9.