CN111614960A - Quick coding and decoding method for frame level division according to image content characteristics - Google Patents


Info

Publication number
CN111614960A
CN111614960A (application CN202010446922.0A)
Authority
CN
China
Prior art keywords
prediction mode
string prediction
image
coding
coding unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010446922.0A
Other languages
Chinese (zh)
Other versions
CN111614960B (en)
Inventor
赵利平
周青阳
林涛
周开伦
胡珂立
沈士根
屠昂燕
唐剑平
Current Assignee
Beijing Boya Ruishi Technology Co ltd
Original Assignee
University of Shaoxing
Priority date
Filing date
Publication date
Application filed by University of Shaoxing
Priority to CN202010446922.0A
Publication of CN111614960A
Application granted
Publication of CN111614960B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136 Incoming video signal characteristics or properties
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to the technical field of data compression, and in particular to a fast encoding and decoding method that performs frame-level division according to image content features. In the encoding method, images are classified at the frame level according to their content features; for images in the qualifying category, the string prediction mode is skipped entirely, which reduces total encoding time while keeping coding efficiency essentially unchanged. Moreover, when an image of that category is not encoded with the string prediction mode, the encoder does not need to write the coding-unit-level flag indicating whether the current coding unit adopts the string prediction mode into the code stream, and correspondingly the decoder does not need to parse that flag, which further shortens encoding and decoding time and improves codec efficiency.

Description

Quick coding and decoding method for frame level division according to image content characteristics
Technical field:
The invention relates to the technical field of data compression, and in particular to a fast encoding and decoding method that performs frame-level division according to image content features.
Background art:
During worldwide epidemic prevention, remote online interactive activities such as large-scale remote work, online teaching, and remote meetings have played an irreplaceable role and profoundly influenced the forms of social activity. Screen content coding is a key technology for screen sharing and display in these remote online interactive activities. Screen content is ubiquitous and will become one of the main sources of incremental multimedia streams. As a decisive technology for resolving the data-transmission bottleneck between cloud/edge and end/edge devices and the latency perceived in human-machine interaction, screen content coding will profoundly change the modes of human-human, human-machine, and even machine-machine interaction. Research on screen content coding technology has therefore become urgent. Meanwhile, as mobile devices and intelligent terminal/edge devices continue to penetrate every corner of social life, interaction between devices is becoming ever more important. Besides the two traditional factors of image distortion and coding efficiency, the "codec time" and "battery life" directly related to the computational complexity of image encoding and decoding are becoming increasingly important. At present, the latest international Versatile Video Coding (VVC) standard improves coding efficiency by about 25% (AI configuration) over the High Efficiency Video Coding (HEVC) standard, but its encoding complexity increases by about 26 times. Such high complexity is unaffordable for screen content coding applications on mobile devices and intelligent terminal/edge devices. There is therefore a need for a screen content coding technique with low complexity and high coding efficiency.
Screen content refers to images that may contain natural-image content as a subset and are therefore more general than natural images. Screen content exhibits characteristics different from natural images, mainly including: repetitive patterns of various sizes and shapes; a relatively small number of colors per coding unit in non-continuous-tone regions; and other features (e.g., sharp, strongly varying content and the absence of noise). In the conventional hybrid coding framework, coding blocks with different content characteristics are coded with an increasing variety of prediction modes: each coding unit in an image is pre-coded with each of several prediction modes in turn, the rate-distortion cost of each mode is computed, and the mode with the smallest rate-distortion cost is selected as the best mode for coding that unit.
In the latest VVC standard, China's AVS3 (the third-generation Audio Video coding Standard), and their previous versions or extensions, the prediction modes mainly include the intra prediction mode, the inter prediction mode, Intra Block Copy (IBC), the Palette mode, the string matching (also referred to as string prediction) mode, and so on. For example, the VVC standard has further intra prediction modes beyond the conventional ones. The IBC algorithm performs block matching with a limited number of fixed block sizes (e.g., 4x8, 8x4, 8x8, 16x4) within a search range formed by part of the reconstructed region of the current frame (the current CTU and the CTU to its left). The Palette mode uses a palette to store the representative colors of the current CU or nearby CUs. The string prediction mode searches pixel by pixel for an optimal reference string according to a certain matching order and rule, finally dividing the CU into one or more reference strings and/or one or more unmatched pixels. Different prediction modes play irreplaceable roles for different image content: in general, the intra and inter prediction modes are very effective for camera-captured continuous-tone regions such as natural images, while the IBC algorithm, Palette mode, and string prediction mode are very effective for screen images containing computer-generated non-continuous-tone regions, including text, graphics, icons, charts, and the like. In fact, the IBC algorithm and the Palette mode are special cases of the string prediction mode.
An image is also called a frame; a frame is divided into largest coding units, and each largest coding unit is divided into one or more coding units as determined by a coding tree. Different prediction modes contribute differently to image content with different features. The string prediction mode is very effective for screen images with non-continuous-tone regions, but contributes little for images containing few such regions. Existing coding algorithms that include the string prediction mode have two drawbacks: on one hand, string-prediction pre-coding is attempted for all image content, which does not improve coding efficiency but increases coding time; on the other hand, whether or not a coding unit in the image selects the string prediction mode, a coding-unit-level flag indicating whether the current coding unit adopts the string prediction mode must be written into the code stream, and correspondingly the decoder must also parse that flag. The existing encoding and decoding methods therefore suffer from long codec time, and their efficiency remains to be improved. The present invention is directed to addressing these problems.
Summary of the invention:
The invention discloses an encoding method and a decoding method that classify image content with different features at the image (frame) level using a preset division rule, adaptively encode and decode the image content with string-prediction/non-string-prediction modes according to the resulting categories, and thereby improve codec efficiency.
In order to achieve the above object, the encoding method disclosed by the present invention adopts the following technical scheme:
the fast coding method for dividing the frame level according to the image content characteristics comprises the following steps:
step 1: at the image level, classifying the image content characteristics by adopting a preset division rule, and dividing the image content into a first category and a second category;
step 2: setting string prediction mode frame level identification information of the current image according to the image content classification result, and writing the string prediction mode frame level identification information into a code stream; setting a string prediction mode frame level flag of an image to 'true' if the image is divided into a class one, and setting the string prediction mode frame level flag of the image to 'false' if the image is divided into a class two;
Step 3: judge the setting of the string-prediction-mode frame-level flag of the current image:
if the string-prediction-mode frame-level flag is set to "false", all coding units in the image are pre-coded directly with non-string prediction modes, without string-prediction pre-coding; the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are then written into the code stream;
if the string-prediction-mode frame-level flag is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code with both the string prediction mode and the non-string prediction modes; then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode. If it is the string prediction mode, set the coding unit's string-prediction-mode flag to "true", write it into the code stream, and write the coding information of the string prediction mode into the code stream; if it is a non-string prediction mode, set the string-prediction-mode flag to "false", write it into the code stream, and write the coding information of the non-string prediction mode into the code stream.
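The per-coding-unit decision in step 3 can be sketched as a small function. This is an illustrative sketch, not the patent's reference implementation; `choose_cu_mode` and its arguments are invented names, and the tie-break in favour of the non-string mode follows the remark in the embodiment section that equal rate-distortion costs select the non-string mode:

```python
def choose_cu_mode(pic_use_sp_flag, sp_rd_cost, non_sp_rd_cost):
    """Mode decision for one coding unit, following step 3.

    Returns (cu_sp_flag, use_string_prediction). cu_sp_flag is None when
    the frame-level flag is False, meaning no coding-unit-level flag is
    written into the code stream at all.
    """
    if not pic_use_sp_flag:
        # String-prediction pre-coding skipped entirely; nothing to signal per CU.
        return None, False
    use_sp = sp_rd_cost < non_sp_rd_cost  # equal costs pick the non-string mode
    return use_sp, use_sp
```

Note that `None` models the key saving: when the frame-level flag is "false", the coding-unit-level flag is absent from the code stream rather than written as "false".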
Further, the non-string prediction mode adopts one or more of the following modes in any combination: the intra prediction mode; the inter prediction mode; the IBC mode; the Palette mode.
Further, the preset partition rule in step 1 adopts one of the following rules or a combination thereof: (1) if the total number Num_hashvalue=i of pixels having the same hash value i satisfies a preset condition A, the image is classified into category one; otherwise it is classified into category two; (2) if the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values satisfy a preset condition B, the image is classified into category one; otherwise it is classified into category two.
Further, preset condition A is used in one or more of the following ways in any combination: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold. Preset condition B is used in one or more of the following ways in any combination: (1) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some fixed threshold; (2) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some non-fixed threshold.
Further, when the hash value is calculated, one or more of the following modes are adopted for use in any combination: (1) adopting different hash value calculation modes for pixels with different sampling formats; (2) the same hash value calculation mode is adopted for pixels with different sampling formats; (3) adopting different hash value calculation modes for pixels with different bit numbers; (4) and adopting the same hash value calculation mode for pixels with different bit numbers.
Further, the hash value of the pixel is calculated by one or more of the following methods: (1) taking a hash value formed by connecting a plurality of bits with the same number of each component of the pixel in series; (2) taking a hash value formed by connecting a plurality of bits with different numbers of each component of the pixel in series; (3) taking a hash value formed by connecting a plurality of bits at the same position of each component of a pixel in series; (4) and taking a hash value formed by connecting a plurality of bits at different positions of each component of the pixel in series.
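The bit-concatenation variants above can be illustrated with two short sketches, assuming 8-bit components. The first takes the same number of bits from the same position of each component (the low 4 bits here); the second takes different numbers of bits per component (6 + 3 + 3). The exact bit positions in the second sketch are illustrative assumptions, not values fixed by the text:

```python
def hash_low4(y, u, v):
    # Same number of bits, same position: the low 4 bits of each component,
    # concatenated into a 12-bit value.
    return ((y & 0x0F) << 8) | ((u & 0x0F) << 4) | (v & 0x0F)

def hash_633(y, u, v):
    # Different numbers of bits per component: middle 6 bits of the first
    # component, middle 3 bits of the second and third (assumed positions).
    return (((y >> 1) & 0x3F) << 6) | (((u >> 2) & 0x07) << 3) | ((v >> 2) & 0x07)
```

Both variants produce values in the 0-4095 range, so the frame-level statistics described later apply unchanged.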
The invention also discloses a decoding method for data encoded by the above encoding method, comprising the following steps:
step 1: analyzing the compressed data code stream to obtain the frame-level identification information of the string prediction mode of the current image;
step 2: judging the setting result of the frame level identification of the string prediction mode of the current image:
if the string-prediction-mode frame-level flag is set to "false", parse the compressed data code stream to obtain the non-string-prediction-mode identification information of the current decoding unit, and decode the current decoding unit with the non-string prediction mode;
if the string-prediction-mode frame-level flag is set to "true", parse the compressed data code stream to obtain the coding-unit-level string-prediction flag of the current decoding unit and judge whether it is "true" or "false": if the coding-unit-level flag is "true", decode the current decoding unit with the string prediction mode; if it is "false", parse the compressed data code stream to obtain the non-string-prediction-mode identifier of the current decoding unit, and decode the current decoding unit with the non-string prediction mode.
With the coding and decoding methods disclosed by the invention, images are classified at the frame level according to their features; images in the qualifying category skip the string prediction mode entirely, reducing total coding time while keeping coding efficiency essentially unchanged. Moreover, when an image of that category is not coded with the string prediction mode, the encoder need not write the coding-unit-level string-prediction flag into the code stream, and the decoder correspondingly need not parse it, shortening codec time and improving codec efficiency.
The invention is further illustrated by the accompanying drawings and detailed description.
Description of the drawings:
FIG. 1 is a flow chart of an encoding method employed in an embodiment of the present invention;
fig. 2 is a flowchart of a decoding method adopted in the embodiment of the present invention.
Detailed description of embodiments:
the embodiment first discloses a fast encoding method for performing frame-level division according to image content features, as shown in fig. 1, the content includes:
step 1: at the image level, classifying the image content characteristics by adopting a preset division rule, and dividing the image content into a first category and a second category;
Step 2: according to the image content classification result, set the string-prediction-mode frame-level identification information of the current image, represented by "pic_use_sp_flag", and write it into the code stream: if an image is divided into category one, its "pic_use_sp_flag" is set to "true"; if an image is divided into category two, its "pic_use_sp_flag" is set to "false";
Step 3: judge the setting of the string-prediction-mode frame-level flag "pic_use_sp_flag" of the current image:
if "pic_use_sp_flag" is set to "false", all coding units in the image are pre-coded directly with non-string prediction modes and not with the string prediction mode; the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are then written into the code stream;
if "pic_use_sp_flag" is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code with the string prediction mode and the non-string prediction modes; then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode: if it is the string prediction mode, set the coding unit's string-prediction flag cu_sp_flag to "true", write it into the code stream, and write the coding information of the string prediction mode into the code stream; if it is a non-string prediction mode, set cu_sp_flag to "false", write it into the code stream, and write the coding unit's non-string-prediction-mode identifier and the coding information of that mode into the code stream.
The non-string prediction mode can adopt one or more of the following modes in any combination: the intra prediction mode; the inter prediction mode; the IBC mode; the Palette mode.
The preset partitioning rule in step 1 may adopt one of the following rules or a combination thereof. Rule one: if the total number Num_hashvalue=i of pixels having the same hash value i satisfies preset condition A, the image is classified into category one; otherwise it is classified into category two. Preset condition A is used in one of the following manners or any combination thereof: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold. Rule two: if the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values satisfy preset condition B, the image is classified into category one; otherwise it is classified into category two. Preset condition B is used in one or more of the following ways: (1) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than a certain fixed threshold; (2) the absolute difference between the smaller and the larger of the adjacent-hash-value pixel totals is greater than some non-fixed threshold.
The hash value calculation method in the preset partition rule may adopt one of the following methods or any combination of the following methods: (1) adopting different hash value calculation modes for pixels with different sampling formats; (2) the same hash value calculation mode is adopted for pixels with different sampling formats; (3) adopting different hash value calculation modes for pixels with different bit numbers; (4) and adopting the same hash value calculation mode for pixels with different bit numbers.
The hash value of the pixel can be calculated by one or more of the following methods: (1) taking a hash value formed by connecting a plurality of bits with the same number of each component of the pixel in series; (2) taking a hash value formed by connecting a plurality of bits with different numbers of each component of the pixel in series; (3) taking a hash value formed by connecting a plurality of bits at the same position of each component of a pixel in series; (4) and taking a hash value formed by connecting a plurality of bits at different positions of each component of the pixel in series.
For example, consider computing a hash value with a total of 12 bits for a pixel with three components. If the hash is formed by concatenating the same number of bits from the same position of each component, one may take the middle 4 bits of each component (12 bits in total), or the lower 4 bits of each component (12 bits in total). If the hash is formed by concatenating different numbers of bits from each component, one may take the middle 6 bits of the first component, the middle 3 bits of the second component, and the middle 3 bits of the third component.
A specific embodiment is given below regarding the preset partition rule in the encoding method, and the embodiment takes a three-component pixel YUV image as an example, and is described as follows:
1) For each pixel in the image, compute a 12-bit hash value (range 0-4095) by concatenating the middle 4 bits of each of the Y, U, V components, according to the formula: hashValue = ((Y & 0x3c) << 6) | ((U & 0x3c) << 2) | ((V & 0x3c) >> 2). (In the formula, "&" denotes the binary bitwise AND operation; "0x" denotes hexadecimal notation; "<<" and ">>" denote binary left-shift and right-shift operations; "|" denotes the binary bitwise OR operation.)
2) Count the total number of pixels having each hash value i (0 ≤ i ≤ 4095), denoted Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095.
3) Count the number of hash values whose pixel totals satisfy the following condition, denoted RColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095;
if Num_hashvalue=i is greater than or equal to a certain threshold A (e.g., A is a fixed value of 1000, or a non-fixed value such as (image height × image width) >> 11), RColorNO is increased by 1.
4) For the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values, count the number of pairs satisfying the following condition, denoted DColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4094;
let long_chain be the larger of Num_hashvalue=i and Num_hashvalue=i+1;
let short_chain be the smaller of Num_hashvalue=i and Num_hashvalue=i+1;
if ((short_chain + 1) << threshold B) is less than long_chain, DColorNO is increased by 1 (here threshold B is exemplified as 7).
5) Images are divided into two categories according to the values of RColorNO and DColorNO:
if (DColorNO / (RColorNO + 1) > threshold C), the image is classified into category one; otherwise the image is classified into category two (here threshold C is exemplified as 0.2).
To fully illustrate the specific application of the encoding method of the present invention, the following full examples are given:
the embodiment of the coding method comprises the following steps: still taking the three-component image as an example, the encoding method is as follows:
step 1: at an image (frame level) level, the image contents with different characteristics are divided by adopting the following division rules, and the image contents are divided into two preset categories:
Substep 1: for each pixel in the image, compute a 12-bit hash value (range 0-4095) by concatenating the middle 4 bits of each of the Y, U, V components:
hashValue = ((Y & 0x3c) << 6) | ((U & 0x3c) << 2) | ((V & 0x3c) >> 2)
Substep 2: count the total number of pixels having each hash value i (0 ≤ i ≤ 4095), denoted Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095.
Substep 3: count the number of hash values whose pixel totals satisfy the following condition, denoted RColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4095;
if Num_hashvalue=i is greater than or equal to threshold A, RColorNO is increased by 1.
Substep 4: for the pixel totals Num_hashvalue=i and Num_hashvalue=i+1 of adjacent hash values, count the number of pairs satisfying the following condition, denoted DColorNO (initialized to 0):
traverse Num_hashvalue=0, Num_hashvalue=1, ..., Num_hashvalue=4094;
let long_chain be the larger of Num_hashvalue=i and Num_hashvalue=i+1;
let short_chain be the smaller of Num_hashvalue=i and Num_hashvalue=i+1;
if ((short_chain + 1) << threshold B) is less than long_chain, DColorNO is increased by 1.
Substep 5: divide images into two categories according to the values of RColorNO and DColorNO:
if (DColorNO / (RColorNO + 1) > threshold C), the image is classified into category one; otherwise the image is classified into category two.
Step 2: according to the image content division result, setting a flag bit pic _ use _ sp _ flag to represent the frame level identification information of the current image string prediction mode, and writing the flag into a code stream:
if the image is classified as class one, then
pic _ use _ sp _ flag is set to true.
Otherwise
pic _ use _ sp _ flag is set to false.
Step 3: according to the string-prediction-mode frame-level identification information: when the frame-level flag is "false", string-prediction pre-coding is skipped and no string-prediction coding-unit-level flag needs to be written into the code stream; all coding units in the image are pre-coded directly with non-string prediction modes, and the non-string-prediction-mode identifier of the current coding unit and the coding information of that mode are written into the code stream.
Otherwise, repeat the following operations for each coding unit in the whole image until all coding units are coded:
if pic_use_sp_flag is "true", then
pre-code with the string prediction mode;
pre-code with the non-string prediction modes;
then judge whether the minimum-rate-distortion mode of the current coding unit is the string prediction mode:
if it is the string prediction mode, write the current coding unit's string-prediction-mode flag cu_sp_flag as "true" into the code stream, and write the coding information of the string prediction mode into the code stream;
if it is a non-string prediction mode, write cu_sp_flag as "false" into the code stream, and write the current coding unit's non-string-prediction-mode identifier and the coding information of that mode into the code stream.
In step 3 of the above embodiment, when pic_use_sp_flag is judged to be "true", the pre-coding order may also be: first pre-code in the non-string prediction modes, then pre-code in the string prediction mode. Alternatively, the order may be: first pre-code with some of the non-string prediction modes, then pre-code with the string prediction mode, and finally pre-code with the remaining non-string prediction modes. The choice of pre-coding order has some influence on coding efficiency. For example, when one of the non-string prediction modes yields the same rate-distortion cost as the string prediction mode, the current coding unit selects the non-string prediction mode; on the other hand, when the non-string prediction mode pre-coding has an early-termination algorithm and its condition is met, the subsequent pre-coding steps can be skipped, which also helps coding efficiency. However, the coding method provided by the present invention is applicable regardless of how the pre-coding order is arranged.
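The order dependence and early termination discussed above can be illustrated with the following sketch. The `modes` list, its (name, precode_fn, early_stop_threshold) shape, and the thresholds are all assumptions for illustration; `precode_fn` returns an RD cost, and a threshold of None disables early termination for that mode.

```python
def precode_in_order(cu, modes):
    best_name, best_cost = None, float('inf')
    for name, precode_fn, early_stop_threshold in modes:
        cost = precode_fn(cu)
        # Strict '<' means a tie keeps the earlier-tested mode,
        # so the pre-coding order affects which mode is chosen.
        if cost < best_cost:
            best_name, best_cost = name, cost
        # Early termination: if this mode is already good enough,
        # skip pre-coding the remaining modes.
        if early_stop_threshold is not None and cost <= early_stop_threshold:
            break
    return best_name, best_cost
```

Running the same modes in a different order can therefore change both the selected mode (on RD-cost ties) and the amount of pre-coding work performed (when early termination fires).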
Corresponding to the coding method, the decoding method comprises the following steps:
step 1: parse the compressed data code stream to obtain the string prediction mode frame-level identifier pic_use_sp_flag of the current image;
step 2: judge the setting result of the string prediction mode frame-level identifier pic_use_sp_flag of the current image:
if pic_use_sp_flag is "false", parse the compressed data code stream to obtain the non-string prediction mode identification information of the current decoding unit, and decode the current decoding unit in the non-string prediction mode;
if pic_use_sp_flag is "true", parse the compressed data code stream to obtain the coding-unit-level string prediction mode identifier cu_sp_flag of the current decoding unit, and judge whether it is "true" or "false": if cu_sp_flag is "true", decode the current decoding unit in the string prediction mode; if cu_sp_flag is "false", parse the compressed data code stream to obtain the non-string prediction mode identifier of the current decoding unit, and decode the current decoding unit in the non-string prediction mode. The cu_sp_flag may be present in the data code stream independently (that is, as a standalone flag bit), or it may be represented jointly with other information by a single flag bit.
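The decoder-side parsing logic of steps 1 and 2 can be sketched as follows, assuming (as an illustration) that `stream` is an iterable of already-entropy-decoded syntax elements in decoding order.

```python
def parse_mode_decisions(stream, num_cus):
    syntax = iter(stream)
    pic_use_sp_flag = next(syntax)      # frame-level flag (step 1)
    modes = []
    for _ in range(num_cus):
        if not pic_use_sp_flag:
            # cu_sp_flag is absent from the code stream in this case:
            # every coding unit uses a non-string prediction mode.
            modes.append('non_string')
            continue
        cu_sp_flag = next(syntax)       # CU-level flag (step 2)
        modes.append('string' if cu_sp_flag else 'non_string')
    return modes
```

The key asymmetry is visible here: when the frame-level flag is "false", the decoder never attempts to read a CU-level flag, mirroring the encoder's decision not to write one.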
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them; modifications or equivalent substitutions made to the technical solutions of the present invention by those of ordinary skill in the art, without departing from the spirit and scope of the technical solutions of the present invention, shall all fall within the scope of the claims of the present invention.

Claims (7)

1. A fast encoding method for frame-level division by image content characteristics, characterized in that it comprises the following steps:
step 1: at the image level, classify the image by its content features using a preset division rule, dividing the image into class one and class two;
step 2: set the string prediction mode frame-level identification information of the current image according to the image content classification result, and write it into the code stream: if the image is classified as class one, set the string prediction mode frame-level flag of the image to "true"; if the image is classified as class two, set the flag to "false";
step 3: judge the setting result of the string prediction mode frame-level flag of the current image:
if the string prediction mode frame-level flag is set to "false", all coding units in the image are pre-coded directly in the non-string prediction modes without the string prediction mode, and the non-string prediction mode identifier of the current coding unit and the coding information of the non-string prediction mode are written into the code stream;
if the string prediction mode frame-level flag is set to "true", the following operations are repeated for each coding unit in the whole image until all coding units are coded: pre-code in the string prediction mode and the non-string prediction modes; then judge whether the mode with the minimum rate-distortion cost for the current coding unit is the string prediction mode: if so, set the string prediction mode identifier of the current coding unit to "true", write it into the code stream, and write the coding information of the string prediction mode of the current coding unit into the code stream; if the mode with the minimum rate-distortion cost is a non-string prediction mode, set the string prediction mode identifier of the current coding unit to "false", and write that identifier and the coding information of the non-string prediction mode into the code stream.
2. The fast encoding method for frame-level division by image content characteristics according to claim 1, wherein: the non-string prediction mode uses one or more of the following modes in any combination: an intra prediction mode; an inter prediction mode; an IBC mode; a pattern mode.
3. The fast encoding method for frame-level division by image content characteristics according to claim 1, wherein: the preset division rule in step 1 adopts one of the following rules or uses them in combination: (1) if the total number Num(hashvalue=i) of pixels having the same hash value i meets a preset condition A, classify the image as class one, otherwise as class two; (2) if the total numbers Num(hashvalue=i) and Num(hashvalue=i+1) of pixels of adjacent hash values meet a preset condition B, classify the image as class one, otherwise as class two.
4. The fast encoding method for frame-level division by image content characteristics according to claim 3, wherein: the preset condition A uses one or more of the following in any combination: greater than a fixed threshold; equal to a fixed threshold; greater than a non-fixed threshold; equal to a non-fixed threshold; the preset condition B uses one or more of the following in any combination: (1) the absolute value of the difference between the smaller and the larger of the pixel totals of the adjacent hash values is greater than a certain fixed threshold; (2) the absolute value of the difference between the smaller and the larger of the pixel totals of the adjacent hash values is greater than a certain non-fixed threshold.
5. The fast encoding method for frame-level division by image content characteristics according to claim 3, wherein: when the hash value is calculated, one or more of the following approaches are used in any combination: (1) different hash value calculations for pixels with different sampling formats; (2) the same hash value calculation for pixels with different sampling formats; (3) different hash value calculations for pixels with different bit depths; (4) the same hash value calculation for pixels with different bit depths.
6. The fast encoding method for frame-level division by image content characteristics according to claim 5, wherein: the hash value of a pixel is calculated in one or more of the following ways in any combination: (1) taking as the hash value the concatenation of the same number of bits from each component of the pixel; (2) taking as the hash value the concatenation of different numbers of bits from each component of the pixel; (3) taking as the hash value the concatenation of bits at the same positions of each component of the pixel; (4) taking as the hash value the concatenation of bits at different positions of each component of the pixel.
7. A method for decoding data encoded according to the encoding method of any one of claims 1 to 6, characterized in that it comprises the following steps:
step 1: analyzing the compressed data code stream to obtain the frame-level identification information of the string prediction mode of the current image;
step 2: judging the setting result of the frame level identification of the string prediction mode of the current image:
if the string prediction mode frame-level flag is set to "false", parse the compressed data code stream to obtain the non-string prediction mode identification information of the current decoding unit, and decode the current decoding unit in the non-string prediction mode;
if the string prediction mode frame-level flag is set to "true", parse the compressed data code stream to obtain the coding-unit-level string prediction mode identification information of the current decoding unit and judge whether it is "true" or "false": if the coding-unit-level identification information is "true", decode the current decoding unit in the string prediction mode; if it is "false", parse the compressed data code stream to obtain the non-string prediction mode identifier of the current decoding unit, and decode the current decoding unit in the non-string prediction mode.
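The division rule of claims 3–6 might be sketched as follows. This is an illustrative reading, not the patented implementation: each pixel is hashed by concatenating the top 4 bits of each 8-bit component (one of the bit-selection options under claim 6), and condition A is taken as "greater than a fixed threshold"; the function names, the threshold, and the bit choices are all assumptions.

```python
from collections import Counter

def classify_by_hash(pixels, threshold):
    def pixel_hash(p):
        # Concatenate the top 4 bits of each 8-bit component
        # (one of the bit-selection options listed in claim 6).
        r, g, b = p
        return ((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4)

    counts = Counter(pixel_hash(p) for p in pixels)
    # Condition A (illustrative): some hash value covers more than
    # `threshold` pixels, i.e. the content is repetitive enough that
    # string prediction is likely to pay off -> class one.
    return 1 if max(counts.values()) > threshold else 2
```

Screen-like content with many identical pixels concentrates the hash counts and lands in class one, while natural images with diverse pixel values spread the counts thinly and land in class two.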
CN202010446922.0A 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics Active CN111614960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010446922.0A CN111614960B (en) 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics

Publications (2)

Publication Number Publication Date
CN111614960A true CN111614960A (en) 2020-09-01
CN111614960B CN111614960B (en) 2022-09-02

Family

ID=72202058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010446922.0A Active CN111614960B (en) 2020-05-25 2020-05-25 Quick coding and decoding method for frame level division according to image content characteristics

Country Status (1)

Country Link
CN (1) CN111614960B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277509A (en) * 2017-08-03 2017-10-20 重庆邮电大学 A kind of fast intra-frame predicting method based on screen content
CN108174201A (en) * 2018-03-07 2018-06-15 绍兴文理学院 Terminate the video-frequency compression method of encoding block coding mode search in advance by video content
US20190174138A1 (en) * 2016-06-29 2019-06-06 B<>Com Method for intra-coding of a digital image and corresponding decoding method
CN110087087A (en) * 2019-04-09 2019-08-02 同济大学 VVC interframe encode unit prediction mode shifts to an earlier date decision and block divides and shifts to an earlier date terminating method
CN110868587A (en) * 2018-08-27 2020-03-06 华为技术有限公司 Video image prediction method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵利平,林涛,郭靖,周开伦: "基于通用串预测算法的AVS2屏幕混合内容帧间编码优化", 《计算机学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112073721A (en) * 2020-10-10 2020-12-11 绍兴文理学院 High-efficiency non-I frame image coding and decoding method
CN112073721B (en) * 2020-10-10 2024-02-27 绍兴文理学院 Efficient non-I frame image coding and decoding method
CN113542750A (en) * 2021-05-27 2021-10-22 绍兴市北大信息技术科创中心 Data coding method for searching by two or more sets of hash tables
CN115119046A (en) * 2022-06-02 2022-09-27 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system with reference to pixel set
CN115119046B (en) * 2022-06-02 2024-04-16 绍兴市北大信息技术科创中心 Image coding and decoding method, device and system for reference pixel set
CN116248862A (en) * 2022-12-14 2023-06-09 同济大学 Data encoding method and device for skipping and re-precoding same-position size block
CN116248862B (en) * 2022-12-14 2023-12-08 同济大学 Data encoding method and device for skipping and re-precoding same-position size block

Also Published As

Publication number Publication date
CN111614960B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN111614960B (en) Quick coding and decoding method for frame level division according to image content characteristics
CN102939750B (en) Across planar prediction
CN108495135B (en) Quick coding method for screen content video coding
CN102939618A (en) Inheritance in sample array multitree subdivision
CN102939754A (en) Sample region merging
CN110401833B (en) Image transmission method and device
US20230291909A1 (en) Coding video frame key points to enable reconstruction of video frame
CN109040764B (en) HEVC screen content intra-frame rapid coding algorithm based on decision tree
CN106534846A (en) Method for dividing and quickly encoding screen contents and natural contents
US20240105193A1 (en) Feature Data Encoding and Decoding Method and Apparatus
CN109996083B (en) Intra-frame prediction method and device
CN109819260A (en) Video steganography method and device based on the fusion of multi-embedding domain
US7065254B2 (en) Multilayered image file
CN110677624A (en) Monitoring video-oriented foreground and background parallel compression method based on deep learning
CN113079373A (en) Video coding method based on HEVC-SCC
CN111698511A (en) Image filtering method, device, equipment and readable storage medium
CN113365080B (en) Encoding and decoding method, device and storage medium for string coding technology
CN111614961A (en) Encoding method for searching by calculating hash values and establishing hash table in different modes
Katayama et al. Reference frame generation algorithm using dynamical learning PredNet for VVC
Zhao et al. Efficient screen content coding based on convolutional neural network guided by a large-scale database
CN111526371B (en) Video intra-frame coding rapid algorithm based on Gabor features and gray level co-occurrence matrix
CN112073721B (en) Efficient non-I frame image coding and decoding method
CN113347437A (en) Encoding method, encoder, decoder and storage medium based on string prediction
CN109218726B (en) Laser-induced breakdown spectroscopy image lossy lossless joint compression method
CN111866506A (en) Image coding method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221207

Address after: 100193 303C, 304C, Floor 3, Building 2, East District, Yard 10, Northwest Wangdong Road, Haidian District, Beijing

Patentee after: Beijing Boya Ruishi Technology Co.,Ltd.

Address before: 312000 508 Yuecheng West Road, Yuecheng District, Shaoxing, Zhejiang

Patentee before: SHAOXING University
