CN111614960A - Quick coding and decoding method for frame level division according to image content characteristics - Google Patents


Info

Publication number
CN111614960A
CN111614960A (application CN202010446922.0A)
Authority
CN
China
Prior art keywords
prediction mode
string prediction
image
coding
coding unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010446922.0A
Other languages
Chinese (zh)
Other versions
CN111614960B (en)
Inventor
赵利平
周青阳
林涛
周开伦
胡珂立
沈士根
屠昂燕
唐剑平
Current Assignee
Beijing Boya Ruishi Technology Co ltd
Original Assignee
University of Shaoxing
Priority date
Filing date
Publication date
Application filed by University of Shaoxing
Priority to CN202010446922.0A
Publication of CN111614960A
Application granted
Publication of CN111614960B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136 Incoming video signal characteristics or properties
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to the technical field of data compression, and in particular to a fast encoding and decoding method that performs frame-level division according to image content features. In the encoding method, images are classified at the frame level according to their content features; for images in the qualifying category, the string prediction mode is skipped entirely, which reduces total encoding time while keeping coding efficiency essentially unchanged. Moreover, when an image of that category is not encoded with the string prediction mode, the encoder does not need to write the coding-unit-level flag indicating whether the current coding unit adopts the string prediction mode into the code stream, and correspondingly the decoder does not need to parse that flag, which further shortens encoding and decoding time and improves codec efficiency.

Description

Quick coding and decoding method for frame level division according to image content characteristics
Technical field:
The invention relates to the technical field of data compression, and in particular to a fast encoding and decoding method that performs frame-level division according to image content features.
Background art:
During worldwide epidemic prevention, remote online interactive activities such as large-scale remote work, online teaching, and remote meetings have played an irreplaceable role and profoundly influenced the forms of social activity. Screen content coding is a key technology for screen sharing and display in these remote online interactive activities. Screen content is ubiquitous and will become one of the main sources of incremental multimedia streams. As a decisive technology for resolving the data-transmission bottleneck between cloud/edge and end/edge devices and the latency perceived in human-machine interaction, screen content coding will profoundly change the modes of human-human, human-machine, and even machine-machine interaction. Research on screen content coding technology has therefore become urgent. Meanwhile, as mobile devices and intelligent terminal/edge devices continue to penetrate every corner of social life, interaction between devices is becoming ever more important. Besides the two traditional factors of image distortion and coding efficiency, the "codec time" and "battery life" directly related to the computational complexity of image encoding and decoding are becoming increasingly important. At present, the latest international Versatile Video Coding (VVC) standard improves coding efficiency by about 25% (AI configuration) over the High Efficiency Video Coding (HEVC) standard, but its encoding complexity increases by about 26 times. Such high complexity is unaffordable for screen content coding applications on mobile devices and intelligent terminal/edge devices. There is therefore a need for a screen content coding technique with low complexity and high coding efficiency.
Screen content refers to images that may contain natural-image content as a subset and are therefore more general than natural images. Screen content exhibits characteristics different from natural images, mainly including: repetitive patterns of various sizes and shapes; a relatively small number of colors per coding unit in non-continuous-tone regions; and other features (e.g., sharp, strongly varying content and the absence of noise). In the conventional hybrid coding framework, coding blocks with different content characteristics are coded with an increasing variety of prediction modes: each coding unit in an image is pre-coded with each of several prediction modes in turn, the rate-distortion cost of each mode is computed, and the mode with the smallest rate-distortion cost is selected as the best mode for coding that unit.
In the latest VVC standard, China's AVS3 (the third-generation Audio Video coding Standard), and their previous versions or extensions, the prediction modes mainly include the intra prediction mode, the inter prediction mode, Intra Block Copy (IBC), the Palette mode, the string matching (also referred to as string prediction) mode, and so on. For example, the VVC standard has further intra prediction modes beyond the conventional ones. The IBC algorithm performs block matching with a limited number of fixed block sizes (e.g., 4x8, 8x4, 8x8, 16x4) within a search range formed by part of the reconstructed region of the current frame (the current CTU and the CTU to its left). The Palette mode uses a palette to store the representative colors of the current CU or nearby CUs. The string prediction mode searches pixel by pixel for an optimal reference string according to a certain matching order and rule, finally dividing the CU into one or more reference strings and/or one or more unmatched pixels. Different prediction modes play irreplaceable roles for different image content: in general, the intra and inter prediction modes are very effective for camera-captured continuous-tone regions such as natural images, while the IBC algorithm, Palette mode, and string prediction mode are very effective for screen images containing computer-generated non-continuous-tone regions, including text, graphics, icons, charts, and the like. In fact, the IBC algorithm and the Palette mode are special cases of the string prediction mode.
An image is also called a frame; a frame is divided into largest coding units, and each largest coding unit is divided into one or more coding units as determined by a coding tree. Different prediction modes contribute differently to image content with different features. The string prediction mode is very effective for screen images with non-continuous-tone regions, but contributes little for images containing few such regions. Existing coding algorithms that include the string prediction mode have two drawbacks: on one hand, string-prediction pre-coding is attempted for all image content, which does not improve coding efficiency but increases coding time; on the other hand, whether or not a coding unit in the image selects the string prediction mode, a coding-unit-level flag indicating whether the current coding unit adopts the string prediction mode must be written into the code stream, and correspondingly the decoder must also parse that flag. The existing encoding and decoding methods therefore suffer from long codec time, and their efficiency remains to be improved. The present invention is directed to addressing these problems.
Summary of the invention:
The invention discloses an encoding method and a decoding method that classify image content with different features at the image (frame) level using a preset division rule, adaptively encode and decode the image content with string-prediction/non-string-prediction modes according to the resulting categories, and thereby improve codec efficiency.
In order to achieve the above object, the encoding method disclosed by the present invention adopts the following technical scheme:
the fast coding method for dividing the frame level according to the image content characteristics comprises the following steps:
step 1: at the image level, classifying the image content characteristics by adopting a preset division rule, and dividing the image content into a first category and a second category;
step 2: setting string prediction mode frame level identification information of the current image according to the image content classification result, and writing the string prediction mode frame level identification information into a code stream; setting a string prediction mode frame level flag of an image to 'true' if the image is divided into a class one, and setting the string prediction mode frame level flag of the image to 'false' if the image is divided into a class two;
Step 3: judge the setting of the string-prediction-mode frame-level flag of the current image:
if the string-prediction-mode frame-level flag is set to "false", all coding units in the image are pre-coded directly with non-string prediction modes, without string-prediction pre-coding; the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are then written into the code stream;
if the string-prediction-mode frame-level flag is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code with both the string prediction mode and the non-string prediction modes; then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode. If it is the string prediction mode, set the coding unit's string-prediction-mode flag to "true", write it into the code stream, and write the coding information of the string prediction mode into the code stream; if it is a non-string prediction mode, set the string-prediction-mode flag to "false", write it into the code stream, and write the coding information of the non-string prediction mode into the code stream.
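The per-coding-unit decision in step 3 can be sketched as a small function. This is an illustrative sketch, not the patent's reference implementation; `choose_cu_mode` and its arguments are invented names, and the tie-break in favour of the non-string mode follows the remark in the embodiment section that equal rate-distortion costs select the non-string mode:

```python
def choose_cu_mode(pic_use_sp_flag, sp_rd_cost, non_sp_rd_cost):
    """Mode decision for one coding unit, following step 3.

    Returns (cu_sp_flag, use_string_prediction). cu_sp_flag is None when
    the frame-level flag is False, meaning no coding-unit-level flag is
    written into the code stream at all.
    """
    if not pic_use_sp_flag:
        # String-prediction pre-coding skipped entirely; nothing to signal per CU.
        return None, False
    use_sp = sp_rd_cost < non_sp_rd_cost  # equal costs pick the non-string mode
    return use_sp, use_sp
```

Note that `None` models the key saving: when the frame-level flag is "false", the coding-unit-level flag is absent from the code stream rather than written as "false".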
Further, the non-string prediction mode adopts one or more of the following modes in any combination: the intra prediction mode; the inter prediction mode; the IBC mode; the Palette mode.
Further, the preset partition rule in step 1 adopts one of the following rules or a combination thereof: (1) if the total number Num_hashvalue=i of pixels having the same hash value i satisfies a preset condition A, the image is classified into category one; otherwise it is classified into category two; (2) if the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values satisfy a preset condition B, the image is classified into category one; otherwise it is classified into category two.
Further, preset condition A is used in one or more of the following ways in any combination: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold. Preset condition B is used in one or more of the following ways in any combination: (1) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some fixed threshold; (2) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some non-fixed threshold.
Further, when the hash value is calculated, one or more of the following modes are adopted for use in any combination: (1) adopting different hash value calculation modes for pixels with different sampling formats; (2) the same hash value calculation mode is adopted for pixels with different sampling formats; (3) adopting different hash value calculation modes for pixels with different bit numbers; (4) and adopting the same hash value calculation mode for pixels with different bit numbers.
Further, the hash value of the pixel is calculated by one or more of the following methods: (1) taking a hash value formed by connecting a plurality of bits with the same number of each component of the pixel in series; (2) taking a hash value formed by connecting a plurality of bits with different numbers of each component of the pixel in series; (3) taking a hash value formed by connecting a plurality of bits at the same position of each component of a pixel in series; (4) and taking a hash value formed by connecting a plurality of bits at different positions of each component of the pixel in series.
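The bit-concatenation variants above can be illustrated with two short sketches, assuming 8-bit components. The first takes the same number of bits from the same position of each component (the low 4 bits here); the second takes different numbers of bits per component (6 + 3 + 3). The exact bit positions in the second sketch are illustrative assumptions, not values fixed by the text:

```python
def hash_low4(y, u, v):
    # Same number of bits, same position: the low 4 bits of each component,
    # concatenated into a 12-bit value.
    return ((y & 0x0F) << 8) | ((u & 0x0F) << 4) | (v & 0x0F)

def hash_633(y, u, v):
    # Different numbers of bits per component: middle 6 bits of the first
    # component, middle 3 bits of the second and third (assumed positions).
    return (((y >> 1) & 0x3F) << 6) | (((u >> 2) & 0x07) << 3) | ((v >> 2) & 0x07)
```

Both variants produce values in the 0-4095 range, so the frame-level statistics described later apply unchanged.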
The invention also discloses a decoding method for data encoded by the above encoding method, comprising the following steps:
step 1: analyzing the compressed data code stream to obtain the frame-level identification information of the string prediction mode of the current image;
step 2: judging the setting result of the frame level identification of the string prediction mode of the current image:
if the string-prediction-mode frame-level flag is set to "false", parse the compressed data code stream to obtain the non-string-prediction-mode identification information of the current decoding unit, and decode the current decoding unit with the non-string prediction mode;
if the string-prediction-mode frame-level flag is set to "true", parse the compressed data code stream to obtain the coding-unit-level string-prediction flag of the current decoding unit and judge whether it is "true" or "false": if the coding-unit-level flag is "true", decode the current decoding unit with the string prediction mode; if it is "false", parse the compressed data code stream to obtain the non-string-prediction-mode identifier of the current decoding unit, and decode the current decoding unit with the non-string prediction mode.
With the coding and decoding methods disclosed by the invention, images are classified at the frame level according to their features; images in the qualifying category skip the string prediction mode entirely, reducing total coding time while keeping coding efficiency essentially unchanged. Moreover, when an image of that category is not coded with the string prediction mode, the encoder need not write the coding-unit-level string-prediction flag into the code stream, and the decoder correspondingly need not parse it, shortening codec time and improving codec efficiency.
The invention is further illustrated by the accompanying drawings and detailed description.
Description of the drawings:
FIG. 1 is a flow chart of an encoding method employed in an embodiment of the present invention;
fig. 2 is a flowchart of a decoding method adopted in the embodiment of the present invention.
Detailed description of embodiments:
the embodiment first discloses a fast encoding method for performing frame-level division according to image content features, as shown in fig. 1, the content includes:
step 1: at the image level, classifying the image content characteristics by adopting a preset division rule, and dividing the image content into a first category and a second category;
Step 2: according to the image content classification result, set the string-prediction-mode frame-level identification information of the current image, represented by "pic_use_sp_flag", and write it into the code stream: if an image is divided into category one, its "pic_use_sp_flag" is set to "true"; if an image is divided into category two, its "pic_use_sp_flag" is set to "false";
Step 3: judge the setting of the string-prediction-mode frame-level flag "pic_use_sp_flag" of the current image:
if "pic_use_sp_flag" is set to "false", all coding units in the image are pre-coded directly with non-string prediction modes and not with the string prediction mode; the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are then written into the code stream;
if "pic_use_sp_flag" is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code with the string prediction mode and the non-string prediction modes; then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode: if it is the string prediction mode, set the coding unit's string-prediction flag cu_sp_flag to "true", write it into the code stream, and write the coding information of the string prediction mode into the code stream; if it is a non-string prediction mode, set cu_sp_flag to "false", write it into the code stream, and write the coding unit's non-string-prediction-mode identifier and the coding information of that mode into the code stream.
The non-string prediction mode can adopt one or more of the following modes in any combination: the intra prediction mode; the inter prediction mode; the IBC mode; the Palette mode.
The preset partitioning rule in step 1 may adopt one of the following rules or a combination thereof. Rule one: if the total number Num_hashvalue=i of pixels having the same hash value i satisfies preset condition A, the image is classified into category one; otherwise it is classified into category two. Preset condition A is used in one of the following manners or any combination thereof: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold. Rule two: if the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values satisfy preset condition B, the image is classified into category one; otherwise it is classified into category two. Preset condition B is used in one or more of the following ways: (1) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than a certain fixed threshold; (2) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some non-fixed threshold.
The hash value calculation method in the preset partition rule may adopt one of the following methods or any combination of the following methods: (1) adopting different hash value calculation modes for pixels with different sampling formats; (2) the same hash value calculation mode is adopted for pixels with different sampling formats; (3) adopting different hash value calculation modes for pixels with different bit numbers; (4) and adopting the same hash value calculation mode for pixels with different bit numbers.
The hash value of the pixel can be calculated by one or more of the following methods: (1) taking a hash value formed by connecting a plurality of bits with the same number of each component of the pixel in series; (2) taking a hash value formed by connecting a plurality of bits with different numbers of each component of the pixel in series; (3) taking a hash value formed by connecting a plurality of bits at the same position of each component of a pixel in series; (4) and taking a hash value formed by connecting a plurality of bits at different positions of each component of the pixel in series.
For example, consider computing a hash value with a total of 12 bits for a pixel with three components. If the hash is formed by concatenating the same number of bits from the same position of each component, one may take the middle 4 bits of each component (12 bits in total), or the lower 4 bits of each component (12 bits in total). If the hash is formed by concatenating different numbers of bits from each component, one may take the middle 6 bits of the first component, the middle 3 bits of the second component, and the middle 3 bits of the third component.
A specific embodiment is given below regarding the preset partition rule in the encoding method, and the embodiment takes a three-component pixel YUV image as an example, and is described as follows:
1) For each pixel in the image, compute a 12-bit hash value (range 0-4095) by concatenating the middle 4 bits of each of the Y, U, V components, according to the formula: hashValue = ((Y & 0x3c) << 6) | ((U & 0x3c) << 2) | ((V & 0x3c) >> 2). (In the formula, "&" denotes the binary bitwise AND operation; "0x" denotes hexadecimal notation; "<<" and ">>" denote binary left-shift and right-shift operations; "|" denotes the binary bitwise OR operation.)
2) Count the total number of pixels having each hash value i (0 ≤ i ≤ 4095), denoted Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095.
3) Count the number of hash values whose pixel totals satisfy the following condition, denoted RColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095;
if Num_hashvalue=i is greater than or equal to a certain threshold A (e.g., A is a fixed value of 1000, or a non-fixed value such as (image height × image width) >> 11), RColorNO is increased by 1.
4) For the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values, count the number of pairs satisfying the following condition, denoted DColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4094;
let long_chain be the larger of Num_hashvalue=i and Num_hashvalue=i+1;
let short_chain be the smaller of Num_hashvalue=i and Num_hashvalue=i+1;
if ((short_chain + 1) << threshold B) is less than long_chain, DColorNO is increased by 1 (here threshold B is exemplified as 7).
5) Images are divided into two categories according to the values of RColorNO and DColorNO:
if (DColorNO / (RColorNO + 1) > threshold C), the image is classified into category one; otherwise the image is classified into category two (here threshold C is exemplified as 0.2).
To fully illustrate the specific application of the encoding method of the present invention, the following full examples are given:
the embodiment of the coding method comprises the following steps: still taking the three-component image as an example, the encoding method is as follows:
step 1: at an image (frame level) level, the image contents with different characteristics are divided by adopting the following division rules, and the image contents are divided into two preset categories:
Substep 1: for each pixel in the image, compute a 12-bit hash value (range 0-4095) by concatenating the middle 4 bits of each of the Y, U, V components:
hashValue = ((Y & 0x3c) << 6) | ((U & 0x3c) << 2) | ((V & 0x3c) >> 2)
Substep 2: count the total number of pixels having each hash value i (0 ≤ i ≤ 4095), denoted Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095.
Substep 3: count the number of hash values whose pixel totals satisfy the following condition, denoted RColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095;
if Num_hashvalue=i is greater than or equal to threshold A, RColorNO is increased by 1.
Substep 4: for the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values, count the number of pairs satisfying the following condition, denoted DColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4094;
let long_chain be the larger of Num_hashvalue=i and Num_hashvalue=i+1;
let short_chain be the smaller of Num_hashvalue=i and Num_hashvalue=i+1;
if ((short_chain + 1) << threshold B) is less than long_chain, DColorNO is increased by 1.
Substep 5: divide images into two categories according to the values of RColorNO and DColorNO:
if (DColorNO / (RColorNO + 1) > threshold C), the image is classified into category one; otherwise the image is classified into category two.
Step 2: according to the image content division result, setting a flag bit pic _ use _ sp _ flag to represent the frame level identification information of the current image string prediction mode, and writing the flag into a code stream:
if the image is classified as class one, then
pic _ use _ sp _ flag is set to true.
Otherwise
pic _ use _ sp _ flag is set to false.
Step 3: according to the string-prediction-mode frame-level identification information: when the frame-level flag is "false", string-prediction pre-coding is skipped and no string-prediction coding-unit-level flag needs to be written into the code stream; all coding units in the image are pre-coded directly with non-string prediction modes, and the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are written into the code stream.
Otherwise, repeat the following operations for each coding unit in the whole image until all coding units are coded:
if pic_use_sp_flag is "true", then
pre-code with the string prediction mode;
pre-code with the non-string prediction modes;
then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode:
if it is the string prediction mode, write the current coding unit's string-prediction-mode flag cu_sp_flag as "true" into the code stream, and write the coding information of the string prediction mode into the code stream;
if it is a non-string prediction mode, write cu_sp_flag as "false" into the code stream, and write the current coding unit's non-string-prediction-mode identifier and the coding information of that mode into the code stream.
In step 3 of the above embodiment, when pic_use_sp_flag is judged to be "true", the pre-coding order may also be: first pre-code in the non-string prediction modes, then pre-code in the string prediction mode. Alternatively, the order may be: first pre-code with some of the non-string prediction modes, then pre-code with the string prediction mode, and finally pre-code with the remaining non-string prediction modes. The choice of pre-coding order has some influence on coding efficiency. For example, when one of the non-string prediction modes yields the same rate-distortion cost as the string prediction mode, the current coding unit selects the non-string prediction mode; on the other hand, when the non-string prediction mode pre-coding has an early-termination algorithm and its condition is met, the subsequent pre-coding steps can be skipped, which also helps coding efficiency. However, the coding method provided by the present invention is applicable regardless of how the pre-coding order is arranged.
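The order dependence and early termination discussed above can be illustrated with the following sketch. The `modes` list, its (name, precode_fn, early_stop_threshold) shape, and the thresholds are all assumptions for illustration; `precode_fn` returns an RD cost, and a threshold of None disables early termination for that mode.

```python
def precode_in_order(cu, modes):
    best_name, best_cost = None, float('inf')
    for name, precode_fn, early_stop_threshold in modes:
        cost = precode_fn(cu)
        # Strict '<' means a tie keeps the earlier-tested mode,
        # so the pre-coding order affects which mode is chosen.
        if cost < best_cost:
            best_name, best_cost = name, cost
        # Early termination: if this mode is already good enough,
        # skip pre-coding the remaining modes.
        if early_stop_threshold is not None and cost <= early_stop_threshold:
            break
    return best_name, best_cost
```

Running the same modes in a different order can therefore change both the selected mode (on RD-cost ties) and the amount of pre-coding work performed (when early termination fires).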
Corresponding to the coding method, the decoding method comprises the following steps:
step 1: parse the compressed data code stream to obtain the string prediction mode frame-level identifier pic_use_sp_flag of the current image;
step 2: judge the setting result of the string prediction mode frame-level identifier pic_use_sp_flag of the current image:
if pic_use_sp_flag is "false", parse the compressed data code stream to obtain the non-string prediction mode identification information of the current decoding unit, and decode the current decoding unit in the non-string prediction mode;
if pic_use_sp_flag is "true", parse the compressed data code stream to obtain the coding-unit-level string prediction mode identifier cu_sp_flag of the current decoding unit, and judge whether it is "true" or "false": if cu_sp_flag is "true", decode the current decoding unit in the string prediction mode; if cu_sp_flag is "false", parse the compressed data code stream to obtain the non-string prediction mode identifier of the current decoding unit, and decode the current decoding unit in the non-string prediction mode. The cu_sp_flag may be present in the data code stream independently (that is, as a standalone flag bit), or it may be represented jointly with other information by a single flag bit.
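The decoder-side parsing logic of steps 1 and 2 can be sketched as follows, assuming (as an illustration) that `stream` is an iterable of already-entropy-decoded syntax elements in decoding order.

```python
def parse_mode_decisions(stream, num_cus):
    syntax = iter(stream)
    pic_use_sp_flag = next(syntax)      # frame-level flag (step 1)
    modes = []
    for _ in range(num_cus):
        if not pic_use_sp_flag:
            # cu_sp_flag is absent from the code stream in this case:
            # every coding unit uses a non-string prediction mode.
            modes.append('non_string')
            continue
        cu_sp_flag = next(syntax)       # CU-level flag (step 2)
        modes.append('string' if cu_sp_flag else 'non_string')
    return modes
```

The key asymmetry is visible here: when the frame-level flag is "false", the decoder never attempts to read a CU-level flag, mirroring the encoder's decision not to write one.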
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them; modifications or equivalent substitutions made to the technical solutions of the present invention by those of ordinary skill in the art, without departing from the spirit and scope of the technical solutions of the present invention, shall all fall within the scope of the claims of the present invention.

Claims (7)

1. A fast encoding method for frame-level division by image content characteristics, characterized in that it comprises the following steps:
step 1: at the image level, classify the image by its content features using a preset division rule, dividing the image into class one and class two;
step 2: set the string prediction mode frame-level identification information of the current image according to the image content classification result, and write it into the code stream: if the image is classified as class one, set the string prediction mode frame-level flag of the image to "true"; if the image is classified as class two, set the flag to "false";
step 3: judge the setting result of the string prediction mode frame-level flag of the current image:
if the string prediction mode frame-level flag is set to "false", all coding units in the image are pre-coded directly in the non-string prediction modes without the string prediction mode, and the non-string prediction mode identifier of the current coding unit and the coding information of the non-string prediction mode are written into the code stream;
if the string prediction mode frame-level flag is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code in the string prediction mode and the non-string prediction modes; then judge whether the mode with the minimum rate-distortion cost for the current coding unit is the string prediction mode: if so, set the string prediction mode identifier of the current coding unit to "true", write it into the code stream, and write the coding information of the string prediction mode of the current coding unit into the code stream; if the mode with the minimum rate-distortion cost is a non-string prediction mode, set the string prediction mode identifier of the current coding unit to "false", and write that identifier and the coding information of the non-string prediction mode into the code stream.
2. The fast encoding method for frame-level division by image content characteristics according to claim 1, wherein: the non-string prediction mode uses one or more of the following modes in any combination: an intra prediction mode; an inter prediction mode; an IBC mode; a pattern mode.
3. The fast encoding method for frame-level division by image content characteristics according to claim 1, wherein: the preset division rule in step 1 adopts one of the following rules or uses them in combination: (1) if the total number Num(hashvalue=i) of pixels having the same hash value i meets a preset condition A, classify the image as class one, otherwise as class two; (2) if the total numbers Num(hashvalue=i) and Num(hashvalue=i+1) of pixels of adjacent hash values meet a preset condition B, classify the image as class one, otherwise as class two.
4. The fast encoding method for frame-level division by image content characteristics according to claim 3, wherein: the preset condition A uses one or more of the following in any combination: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold; the preset condition B uses one or more of the following in any combination: (1) the absolute value of the difference between the smaller and the larger of the pixel totals of the adjacent hash values is greater than a certain fixed threshold; (2) the absolute value of the difference between the smaller and the larger of the pixel totals of the adjacent hash values is greater than a certain non-fixed threshold.
5. The fast encoding method for frame-level division by image content characteristics according to claim 3, wherein: when the hash value is calculated, one or more of the following approaches are used in any combination: (1) different hash value calculations for pixels with different sampling formats; (2) the same hash value calculation for pixels with different sampling formats; (3) different hash value calculations for pixels with different bit depths; (4) the same hash value calculation for pixels with different bit depths.
6. The fast encoding method for frame-level division by image content characteristics according to claim 5, wherein: the hash value of a pixel is calculated in one or more of the following ways in any combination: (1) taking as the hash value the concatenation of the same number of bits from each component of the pixel; (2) taking as the hash value the concatenation of different numbers of bits from each component of the pixel; (3) taking as the hash value the concatenation of bits at the same positions of each component of the pixel; (4) taking as the hash value the concatenation of bits at different positions of each component of the pixel.
7. A method for decoding data encoded according to the encoding method of any one of claims 1 to 6, characterized in that it comprises the following steps:
step 1: analyzing the compressed data code stream to obtain the frame-level identification information of the string prediction mode of the current image;
step 2: judging the setting result of the frame level identification of the string prediction mode of the current image:
if the string prediction mode frame-level flag is set to "false", parse the compressed data code stream to obtain the non-string prediction mode identification information of the current decoding unit, and decode the current decoding unit in the non-string prediction mode;
if the string prediction mode frame-level flag is set to "true", parse the compressed data code stream to obtain the coding-unit-level string prediction mode identification information of the current decoding unit and judge whether it is "true" or "false": if the coding-unit-level identification information is "true", decode the current decoding unit in the string prediction mode; if it is "false", parse the compressed data code stream to obtain the non-string prediction mode identifier of the current decoding unit, and decode the current decoding unit in the non-string prediction mode.
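The division rule of claims 3–6 might be sketched as follows. This is an illustrative reading, not the patented implementation: each pixel is hashed by concatenating the top 4 bits of each 8-bit component (one of the bit-selection options under claim 6), and condition A is taken as "greater than a fixed threshold"; the function names, the threshold, and the bit choices are all assumptions.

```python
from collections import Counter

def classify_by_hash(pixels, threshold):
    def pixel_hash(p):
        # Concatenate the top 4 bits of each 8-bit component
        # (one of the bit-selection options listed in claim 6).
        r, g, b = p
        return ((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4)

    counts = Counter(pixel_hash(p) for p in pixels)
    # Condition A (illustrative): some hash value covers more than
    # `threshold` pixels, i.e. the content is repetitive enough that
    # string prediction is likely to pay off -> class one.
    return 1 if max(counts.values()) > threshold else 2
```

Screen-like content with many identical pixels concentrates the hash counts and lands in class one, while natural images with diverse pixel values spread the counts thinly and land in class two.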
CN202010446922.0A 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics Active CN111614960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010446922.0A CN111614960B (en) 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics

Publications (2)

Publication Number Publication Date
CN111614960A true CN111614960A (en) 2020-09-01
CN111614960B CN111614960B (en) 2022-09-02

Family

ID=72202058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010446922.0A Active CN111614960B (en) 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics

Country Status (1)

Country Link
CN (1) CN111614960B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277509A (en) * 2017-08-03 2017-10-20 重庆邮电大学 A kind of fast intra-frame predicting method based on screen content
CN108174201A (en) * 2018-03-07 2018-06-15 绍兴文理学院 Terminate the video-frequency compression method of encoding block coding mode search in advance by video content
US20190174138A1 (en) * 2016-06-29 2019-06-06 B<>Com Method for intra-coding of a digital image and corresponding decoding method
CN110087087A (en) * 2019-04-09 2019-08-02 同济大学 VVC interframe encode unit prediction mode shifts to an earlier date decision and block divides and shifts to an earlier date terminating method
CN110868587A (en) * 2018-08-27 2020-03-06 华为技术有限公司 Video image prediction method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵利平,林涛,郭靖,周开伦: "基于通用串预测算法的AVS2屏幕混合内容帧间编码优化", 《计算机学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112073721A (en) * 2020-10-10 2020-12-11 绍兴文理学院 High-efficiency non-I frame image coding and decoding method
CN112073721B (en) * 2020-10-10 2024-02-27 绍兴文理学院 Efficient non-I frame image coding and decoding method
CN113542750A (en) * 2021-05-27 2021-10-22 绍兴市北大信息技术科创中心 Data coding method for searching by two or more sets of hash tables
CN115119046A (en) * 2022-06-02 2022-09-27 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system with reference to pixel set
CN115119046B (en) * 2022-06-02 2024-04-16 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system for reference pixel set
CN116248862A (en) * 2022-12-14 2023-06-09 同济大学 Data encoding method and device for skipping and re-precoding same-position size block
CN116248862B (en) * 2022-12-14 2023-12-08 同济大学 Data encoding method and device for skipping and re-precoding same-position size block

Also Published As

Publication number Publication date
CN111614960B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN111614960B (en) Quick coding and decoding method for frame level division according to image content characteristics
CN102939750B (en) Across planar prediction
CN108495135B (en) Quick coding method for screen content video coding
CN102939618A (en) Inheritance in sample array multitree subdivision
CN102939754A (en) Sample region merging
CN110401833B (en) Image transmission method and device
US20230291909A1 (en) Coding video frame key points to enable reconstruction of video frame
CN109040764B (en) HEVC screen content intra-frame rapid coding algorithm based on decision tree
CN106534846A (en) Method for dividing and quickly encoding screen contents and natural contents
US20240105193A1 (en) Feature Data Encoding and Decoding Method and Apparatus
CN109996083B (en) Intra-frame prediction method and device
CN109819260A (en) Video steganography method and device based on the fusion of multi-embedding domain
US7065254B2 (en) Multilayered image file
CN110677624A (en) Monitoring video-oriented foreground and background parallel compression method based on deep learning
CN113079373A (en) Video coding method based on HEVC-SCC
CN111698511A (en) Image filtering method, device, equipment and readable storage medium
CN113365080B (en) Encoding and decoding method, device and storage medium for string coding technology
CN111614961A (en) Encoding method for searching by calculating hash values and establishing hash table in different modes
Katayama et al. Reference frame generation algorithm using dynamical learning PredNet for VVC
Zhao et al. Efficient screen content coding based on convolutional neural network guided by a large-scale database
CN111526371B (en) Video intra-frame coding rapid algorithm based on Gabor features and gray level co-occurrence matrix
CN112073721B (en) Efficient non-I frame image coding and decoding method
CN113347437A (en) Encoding method, encoder, decoder and storage medium based on string prediction
CN109218726B (en) Laser-induced breakdown spectroscopy image lossy lossless joint compression method
CN111866506A (en) Image coding method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221207

Address after: 100193 303C, 304C, Floor 3, Building 2, East District, Yard 10, Northwest Wangdong Road, Haidian District, Beijing

Patentee after: Beijing Boya Ruishi Technology Co.,Ltd.

Address before: 312000 508 Yuecheng West Road, Yuecheng District, Shaoxing, Zhejiang

Patentee before: SHAOXING University
