WO2015120823A1

WO2015120823A1 - Image compression method and device using reference pixel storage space in multiple forms

Info

Publication number: WO2015120823A1
Application number: PCT/CN2015/073204
Authority: WO
Inventors: 林涛
Original assignee: 同济大学
Priority date: 2014-02-16
Filing date: 2015-02-16
Publication date: 2015-08-20
Also published as: CN104853211A

Abstract

Provided are an image compression method and device. When predictive and matching coding (or decoding) is performed on a coding unit (CU), the historical data used as reference pixel sample values is represented in at least two forms and stored in at least two reference pixel sample value storage spaces, respectively. One reference pixel sample value storage space is the main reference pixel sample value storage space and holds the largest amount of historical data; the historical data in each other reference pixel sample value storage space is a subset of the historical data in the main reference pixel sample value storage space, but in a different form of representation. All modes of predictive and matching coding (or decoding) are correspondingly classified into at least two types, and each type uses a different reference pixel sample value storage space to conduct its predictive and matching coding (or decoding). The different reference pixel sample value storage spaces actually store the same historical data, only in different forms of representation. Therefore, the historical data in the different reference pixel sample value storage spaces needs to remain synchronized.

Description

Image compression method and apparatus using multiple forms of reference pixel storage space

Technical field

The present invention relates to a digital video compression encoding and decoding system, and more particularly to a method and apparatus for encoding and decoding composite images and video containing computer screen images.

Background technique

The natural form of a digital video signal is a sequence of images. A frame of image is usually a rectangular area composed of a number of pixels, and a digital video signal is a sequence of video images composed of tens to thousands of frames, sometimes simply referred to as a video sequence or a sequence. Encoding a digital video signal means encoding its images frame by frame.

In almost all international standards for video image coding, such as MPEG-1/2/4, H.264/AVC, and the latest international video compression standard HEVC (High Efficiency Video Coding), when one frame of image is encoded (and correspondingly decoded), the frame is divided into a number of sub-images of MxM pixels, each called a "Coding Unit (CU)", and the sub-images are coded one by one with the CU as the basic coding unit. Commonly used values of M are 8, 16, 32, and 64. Therefore, encoding a video image sequence means sequentially encoding each coding unit of each frame. Similarly, in decoding, each coding unit of each frame is sequentially decoded in the same order, and finally the entire video image sequence is reconstructed.

To adapt to differences in image content and properties across the parts of a frame, and to code each part in the most efficient, targeted manner, the CUs in one frame can have different sizes: some are 8x8, some are 64x64, and so on. So that CUs of different sizes can be spliced together seamlessly, a frame is always first divided into "Largest Coding Units (LCUs)" of identical size, each with NxN pixels, and each LCU is then further divided into multiple tree-structured CUs that are not necessarily the same size. Therefore, the LCU is also referred to as a "Coding Tree Unit (CTU)". For example, a frame is first divided into LCUs of 64x64 pixels (N=64). One of these LCUs is composed of three 32x32-pixel CUs and four 16x16-pixel CUs, so that seven tree-structured CUs form one CTU. Another LCU consists of two 32x32-pixel CUs, three 16x16-pixel CUs, and twenty 8x8-pixel CUs; these 25 tree-structured CUs constitute another CTU. Encoding a frame means encoding the CUs of each CTU one after another. At any moment, the CU being encoded is referred to as the current encoding CU. Decoding a frame likewise means decoding the CUs of each CTU one after another in the same order. At any moment, the CU being decoded is referred to as the current decoding CU. The current encoding CU and the current decoding CU are collectively referred to as the current CU.
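
The two CTU examples above can be checked with a small quadtree sketch. It is only an illustration: the nested-list representation, the `LEAF` marker, and the function name `cu_sizes` are assumptions made here and do not come from any standard or from the invention.

```python
from collections import Counter

LEAF = None  # hypothetical marker: this node is a CU and is not split further


def cu_sizes(node, size):
    """List the CU edge lengths under a quadtree node covering size x size pixels."""
    if node is LEAF:
        return [size]
    if len(node) != 4:
        raise ValueError("a split node must have exactly four children")
    sizes = []
    for child in node:
        sizes.extend(cu_sizes(child, size // 2))
    return sizes


SPLIT = [LEAF] * 4  # a node split once into four leaf CUs

# First CTU of the text: three 32x32 CUs and four 16x16 CUs.
ctu_a = [LEAF, LEAF, LEAF, SPLIT]
# Second CTU of the text: two 32x32, three 16x16, and twenty 8x8 CUs.
ctu_b = [LEAF, LEAF, [LEAF, LEAF, LEAF, SPLIT], [SPLIT] * 4]

count_a = Counter(cu_sizes(ctu_a, 64))
count_b = Counter(cu_sizes(ctu_b, 64))
```

Because every split divides a square into four equal squares, any tree of this kind covers the 64x64 LCU exactly, which is why CUs of different sizes splice together without gaps.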

In the prior art represented by MPEG-1/2/4, H.264/AVC, HEVC, and the like, a CU is usually further divided into smaller sub-regions in order to improve coding efficiency. The sub-regions include, but are not limited to, a prediction unit (PU), a transform unit (TU), an asymmetric partition (AMP) region, a macroblock, a block, a microblock, a strip (a region whose width or height is one pixel or one pixel component), a variable-sized rectangular region, and a variable-sized pixel string (segment), pixel component string (segment), or pixel index string (segment). Encoding (and correspondingly decoding) a CU means encoding (and correspondingly decoding) its sub-regions one by one. In encoding, a sub-region is called a coding sub-region; in decoding, it is called a decoding sub-region. The coding sub-region and the decoding sub-region are collectively referred to as a codec sub-region. In the prior art, the sub-regions (particularly prediction units, transform units, asymmetrically divided regions, macroblocks, blocks, microblocks, and strips) are often referred to as "blocks". Therefore, the coding sub-region and the decoding sub-region are in many cases referred to as a coding block and a decoding block, respectively, and collectively as a codec block.

A color pixel consists of three components. The two most commonly used pixel color formats are the GBR color format, consisting of a green component, a blue component, and a red component, and the family of color formats generically called YUV, such as the YCbCr color format, consisting of one luma component and two chroma components. Therefore, when a sub-region is encoded, it can be divided into three component planes (G plane, B plane, R plane, or Y plane, U plane, V plane) that are coded separately; alternatively, the three components of each pixel can be bundled into one 3-tuple, and the sub-region composed of these 3-tuples is encoded as a whole. The former arrangement of pixels and their components is called the planar format of the image (and its sub-regions), and the latter arrangement is called the packed format of the image (and its sub-regions), also rendered in this text as the stacking, stacked, or stack format.

Taking the GBR color format p[x][y]={g[x][y],b[x][y],r[x][y]} of a pixel as an example, one planar-format arrangement is to first arrange all WxH G components of a frame image (or a CU) that is W pixels wide and H pixels high, then all WxH B components, and finally all WxH R components:

g[1][1],g[2][1],...,g[W-1][1],g[W][1],

g[1][2],g[2][2],...,g[W-1][2],g[W][2],

.............................................,

g[1][H],g[2][H],...,g[W-1][H],g[W][H],

b[1][1],b[2][1],...,b[W-1][1],b[W][1],

b[1][2],b[2][2],...,b[W-1][2],b[W][2],

.............................................,

b[1][H],b[2][H],...,b[W-1][H],b[W][H],

r[1][1],r[2][1],...,r[W-1][1],r[W][1],

r[1][2],r[2][2],...,r[W-1][2],r[W][2],

.............................................,

r[1][H],r[2][H],...,r[W-1][H],r[W][H].

One stacking-format arrangement is to first place the G component of the first pixel, followed by its B component and R component, then the G component, B component, and R component of the second pixel, and so on, finishing with the G component, B component, and R component of the last (WxH-th) pixel:

g[1][1],b[1][1],r[1][1],g[2][1],b[2][1],r[2][1],... ..., g[W][1],b[W][1],r[W][1],

g[1][2],b[1][2],r[1][2],g[2][2],b[2][2],r[2][2],... ..., g[W][2],b[W][2],r[W][2],

.......................................................................................,

g[1][H],b[1][H],r[1][H],g[2][H],b[2][H],r[2][H],... ...,g[W][H],b[W][H],r[W][H].

The arrangement of such stacking formats can also be simplified as:

p[1][1],p[2][1],...,p[W-1][1],p[W][1],

p[1][2],p[2][2],...,p[W-1][2],p[W][2],

................................................,

p[1][H],p[2][H],...,p[W-1][H],p[W][H].

In addition to the planar-format arrangement and the stacking-format arrangement above, other planar-format and stacking-format arrangements are possible, depending on the order of the three components.
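
The relationship between the two arrangements can be sketched in a few lines. This is a minimal illustration, assuming pixels are given as (g, b, r) tuples in raster order; the function names `to_planar` and `to_stacked` are hypothetical.

```python
def to_planar(pixels):
    """Stacking format -> planar format: all G samples, then all B, then all R."""
    return [pixel[c] for c in range(3) for pixel in pixels]


def to_stacked(planar):
    """Planar format -> stacking format: regroup the three planes into 3-tuples."""
    n = len(planar) // 3
    return [(planar[i], planar[n + i], planar[2 * n + i]) for i in range(n)]


# Two pixels p[1][1] and p[2][1], each written as (g, b, r).
stacked = [(1, 2, 3), (4, 5, 6)]
planar = to_planar(stacked)
```

Both functions only reorder samples, so converting back and forth loses no information.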

Another common representation of pixels and their components is the palette index form. In the palette index representation, the value of a pixel or of one of its components is represented by an index into a palette. The palette space stores the values, or approximate values, of the three components of the pixels that need to be represented. An address in the palette is called the index of the pixel (called a palette color) stored at that address. An index can represent one component of a pixel, or it can represent all three components of a pixel. There can be one palette or several. In the case of multiple palettes, a complete index is composed of the palette number and the index within that palette. The index representation of a pixel and its components uses an index to represent the pixel. In the prior art, the index representation of a pixel is also referred to as the indexed color or pseudo color representation of the pixel, or is often referred to directly as an indexed pixel, a pseudo pixel, a pixel index, or simply an index. Indexes are sometimes also called indices. Representing a pixel by its index is also referred to as indexing or indexation. The palette index representation of a pixel and its components is a color-based aggregation and sorting format of the pixel and its components.
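
As a minimal sketch of the palette index representation, the following builds a single palette in which one index stands for all three components of a pixel. It assumes exact (not approximate) palette colors ordered by first appearance; the function name `build_palette` is hypothetical.

```python
def build_palette(pixels):
    """Return (palette, indices): palette[i] is the color stored at address i."""
    palette = []
    index_of = {}
    indices = []
    for pixel in pixels:
        if pixel not in index_of:
            index_of[pixel] = len(palette)  # next free palette address
            palette.append(pixel)
        indices.append(index_of[pixel])
    return palette, indices


black, white = (0, 0, 0), (255, 255, 255)
palette, indices = build_palette([black, white, white, black])
# Reconstruction replaces every index by the palette color at that address.
reconstructed = [palette[i] for i in indices]
```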

Other commonly used prior art pixel representations include the CMYK color format and the grayscale color format.

The index representation of a pixel is also a color format of a pixel, called an index color format or a pseudo color format.

The YUV color format can be further subdivided into several sub-formats according to whether the chroma components are downsampled: the YUV4:4:4 pixel color format, in which one pixel consists of 1 Y component, 1 U component, and 1 V component; the YUV4:2:2 pixel color format, in which two horizontally adjacent pixels consist of 2 Y components, 1 U component, and 1 V component; and the YUV4:2:0 pixel color format, in which four pixels arranged in a 2x2 spatial neighborhood (adjacent left-right and top-bottom) consist of 4 Y components, 1 U component, and 1 V component. A component is generally represented by a number of 8 to 16 bits. The YUV4:2:2 pixel color format and the YUV4:2:0 pixel color format are both obtained by downsampling the chroma components of the YUV4:4:4 pixel color format. A pixel component is also referred to as a pixel sample, or simply as a sample. A sample can be an 8-bit number, i.e. one sample occupies one byte. A sample can also be a 10-bit, 12-bit, 14-bit, or 16-bit number.
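
The effect of chroma downsampling on the amount of data can be worked out directly from these definitions. The sketch below assumes even frame dimensions; the function name `samples_per_frame` is hypothetical.

```python
# Number of pixels that share one U sample and one V sample in each format.
PIXELS_PER_CHROMA_SAMPLE = {"4:4:4": 1, "4:2:2": 2, "4:2:0": 4}


def samples_per_frame(width, height, chroma_format):
    """Total number of Y, U, and V samples in one frame (even dimensions assumed)."""
    luma = width * height
    chroma = luma // PIXELS_PER_CHROMA_SAMPLE[chroma_format]
    return luma + 2 * chroma


counts = {fmt: samples_per_frame(640, 480, fmt) for fmt in PIXELS_PER_CHROMA_SAMPLE}
```

With 8-bit samples, the sample count equals the frame size in bytes, so a 640x480 YUV4:2:0 frame needs half the storage of the same frame in YUV4:4:4.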

When any sub-region is encoded or decoded, reconstructed pixels are generated. These are further divided into partially reconstructed pixels of different degrees, generated during the encoding or decoding process, and fully reconstructed pixels, generated after the encoding or decoding process is completed. If the fully reconstructed pixel samples have values equal to those of the original input pixel samples prior to encoding, the encoding and decoding process is referred to as lossless encoding and decoding. If the fully reconstructed pixel samples have values that differ from those of the original input pixel samples prior to encoding, the encoding and decoding process is referred to as lossy encoding and decoding. When the sub-regions are sequentially encoded or decoded, the generated reconstructed pixel samples are usually saved as historical data and used as reference pixel samples for the encoding or decoding of subsequent sub-regions. The storage space in which the reconstructed pixel historical data is saved is referred to as the reference pixel sample storage space. The reference pixel sample storage space is limited and can hold only a portion of the historical data. The historical data in the reference pixel sample storage space may also include reconstructed pixel samples of the already reconstructed sub-regions of the current CU.

With the development and popularization of a new generation of cloud computing and information processing modes and platforms typified by remote desktops, interconnection among multiple computers, between computer hosts and other digital devices such as smart TVs, smartphones, and tablets, and among various types of digital devices has become a reality and is becoming a mainstream trend. This makes real-time screen transmission from the server side (cloud) to the client side an urgent need. Because of the large amount of screen video data that needs to be transmitted, efficient and high-quality data compression must be performed on computer screen images.

Making full use of the characteristics of computer screen images to achieve ultra-efficient compression of computer screen images is also a major goal of the latest international video compression standard HEVC.

A notable feature of computer screen images is that there are often many similar or even identical pixel patterns within the same frame. For example, Chinese or foreign-language text that often appears in computer screen images is composed of a few basic strokes, and many similar or identical strokes can be found in the same frame. Menus, icons, and the like, which are common in computer screen images, also have many similar or identical patterns. The intra prediction methods used in existing image and video compression technology refer only to adjacent pixel samples and cannot exploit the similarity or sameness within a frame to improve compression efficiency. The intra motion compensation method in the prior art, also called the intra block copy mode, performs intra block matching coding with blocks of a few fixed sizes (8x8, 16x16, 32x32, 64x64 pixels) and cannot achieve finer matches of varied sizes and shapes. Other prior-art methods, namely the microblock matching method, the fine division matching method, the string matching method, and the palette matching method (also called the palette index matching method or the index matching method), can effectively find fine matches of varied sizes and shapes, but for some images they may need more parameters to represent those fine matches, and they also suffer from problems such as high complexity, heavy computation, and large memory read/write bandwidth.

Therefore, multiple predictive coding methods and matching coding methods (including the block matching method, the microblock matching method, the fine division matching method, the string matching method, the palette matching method, etc.) must be combined to achieve efficient coding of screen images. It should be noted that "matching" is an encoding operation, and the corresponding reconstruction and decoding operation is "copying". Therefore, the block matching method, the microblock matching method, the fine division matching method, the string matching method, the palette matching method, and the like are also referred to as the block copy mode, the microblock copy mode, the fine division copy mode, the string copy mode, and the palette copy mode (also known as the palette index copy mode or simply the index copy mode), respectively.

All prediction methods and matching methods have one thing in common: the historical data of the already encoded (or decoded) reconstructed pixel samples must be used as reference pixel samples. In general, the larger the reference pixel sample storage space, the better the efficiency of prediction and matching. In the prior art, the reference pixel sample storage space typically holds historical data ranging from as little as a few or a few tens of pixels (such as palette pixels) to as much as several frames or even dozens of frames. In the prior art, the historical data of the reconstructed pixels is represented in a single form and placed in a single reference pixel sample storage space module. Take as an example one frame of a 640x480-pixel YUV4:4:4 image in which each pixel sample occupies one byte. The most common form places the historical data in a storage space module with linear (1D) addresses. The linear placement uses the planar format, with each plane divided into equally sized blocks of WxH pixel samples (i.e. samples of one component of the pixels): the blocks are first placed in the storage space module one after another in a certain linear order, and the pixel samples inside each block are arranged into 1D data according to a certain linear scan format and then placed in the linear address storage space, as shown in the following table (WxH=8x8):
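
One possible address computation for such a layout is sketched below. The text leaves both the block order and the scan inside a block open ("a certain linear order"), so the sketch assumes row order for both, zero-based coordinates, and frame dimensions that are multiples of the block size; the function name `linear_address` is hypothetical.

```python
def linear_address(plane, x, y, width=640, height=480, block_w=8, block_h=8):
    """Byte address of sample (x, y) of a plane (0=Y, 1=U, 2=V), one byte per sample.

    Assumed layout: plane after plane; within a plane, blocks in row order;
    within a block, samples in row (line scan) order.
    """
    blocks_per_row = width // block_w
    plane_size = width * height
    block_index = (y // block_h) * blocks_per_row + (x // block_w)
    offset_in_block = (y % block_h) * block_w + (x % block_w)
    return plane * plane_size + block_index * (block_w * block_h) + offset_in_block


first_of_second_block = linear_address(0, 8, 0)
last_sample = linear_address(2, 639, 479)
```

Under these assumptions the 64 samples of one 8x8 block occupy 64 consecutive addresses, which favors block-based access, whereas two vertically adjacent samples in different blocks can lie thousands of addresses apart.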

Each prediction method or matching method has a specific form of historical data that suits it best. When screen images are encoded with a combination of multiple prediction methods and matching methods, continuing to use the single form of historical data and reference pixel sample storage space of the prior art greatly reduces coding efficiency. Therefore, multiple forms of historical data and reference pixel sample storage space must be used to improve prediction or matching performance and to reduce the bandwidth required to read and write historical data.

Summary of the invention

The main technical feature of the present invention is that, when the current coding unit is encoded or decoded using a combination of multiple prediction and matching modes, multiple forms of historical data and reference pixel sample storage space are used accordingly.

The form of a reference pixel sample storage space is generally determined by four elements:

1) Arrangement format of pixels and their components, such as the stacking format and the planar format;

2) Pixel color format, such as the GBR color format, RGB color format, YCbCr color format, YUV color format, CMYK color format, YIQ color format, HSV color format, HSL color format, index color format, etc.; different color formats can usually be converted to one another;

3) Pixel aggregation and sorting format. In one example, pixels are aggregated and sorted according to their position coordinates: the historical data of pixels, or of their components, that form a 2-dimensional region is aggregated either in a 2-dimensional array space or in a 1-dimensional array space. In the 1-dimensional case, the linear placement order of the historical data in a storage space with linear (1-dimensional) addresses must also be determined; a typical linear placement is to first divide the 2-dimensional region into equally sized blocks, each having WxH pixels (in the case of the stacking format) or WxH samples of one component (in the case of the planar format), then place the blocks one after another in the linear address storage space in a determined linear order (e.g. row order, column order, or quadtree order of depth D), with the pixels or component samples inside each block arranged into 1D data according to a determined linear scan (e.g. line scan, column scan, zigzag scan, Z-scan, bow-shaped scan, or quadtree scan of depth D) and placed in the linear address storage space. In another example, pixels are aggregated and sorted according to their colors: the pixels in a palette, called palette colors, are sorted according to one of their characteristics to form a reference pixel sample storage space with linear addresses; two examples of such a characteristic are 1) the order in which the palette colors appear in a CU, and 2) the frequency with which the palette colors occur in the CU;

4) Downsampling format of chroma components, such as YUV4:4:4 format or YUV4:2:2 format or YUV4:2:0 format.

These four elements determine the form of a reference pixel sample storage space. A difference in any one of the four elements yields a different form of reference pixel sample storage space.
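
The four elements can be modeled as a small immutable record, which makes "a difference in any one element yields a different form" a plain equality test. This is only an illustrative sketch; the class name `StorageForm`, its field names, and the string values are assumptions made here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StorageForm:
    arrangement: str      # element 1: e.g. "stacked" or "planar"
    color_format: str     # element 2: e.g. "GBR", "YUV", "index"
    aggregation: str      # element 3: e.g. "position/row-order", "color/frequency"
    chroma_sampling: str  # element 4: e.g. "4:4:4", "4:2:2", "4:2:0"


main_form = StorageForm("stacked", "YUV", "position/row-order", "4:4:4")
same_form = StorageForm("stacked", "YUV", "position/row-order", "4:4:4")
# Changing a single element already gives a different form.
planar_form = replace(main_form, arrangement="planar")
```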

In the encoding method and apparatus of the present invention, the most basic characteristic feature is to determine A (2 ≤ A ≤ 5) predictive coding modes and matching coding modes, to determine B (2 ≤ B ≤ A ≤ 5) reference pixel sample storage spaces holding different forms of historical data, and to classify the A predictive coding modes and matching coding modes into B classes that correspond respectively to the B reference pixel sample storage spaces. The B reference pixel sample storage spaces of different forms may have different sizes so as to store different amounts of historical data; the one storing the most historical data is called the main reference pixel sample storage space, and the historical data in each other reference pixel sample storage space is a subset of the historical data in the main reference pixel sample storage space, but has a different form of representation and may be at a different degree of partial reconstruction (for example, color clustering may already have been applied, or the matching residual may or may not have been added). When the current sub-region is encoded, each predictive coding mode or matching coding mode performs predictive coding or matching coding using the reference pixel sample storage space corresponding to its class, and one of the A predictive coding modes and matching coding modes is selected according to a predetermined evaluation criterion to encode the current sub-region. Regardless of which predictive coding mode or matching coding mode is selected to encode the current sub-region, the newly generated historical data is converted into the form of the main reference pixel sample storage space and placed in the main reference pixel sample storage space. It may also be converted into the form of another reference pixel sample storage space (possibly at another degree of partial reconstruction) and placed in that other reference pixel sample storage space. The historical data in the different reference pixel sample storage spaces needs to be kept synchronized.
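
The classification of A modes into B classes and the final mode decision can be sketched as follows. The mode names, the space names, and the use of a single numeric cost as the "predetermined evaluation criterion" are all assumptions for illustration; the text does not fix any of them.

```python
def classify_modes(mode_to_space):
    """Group the A coding modes by reference storage space and check 2 <= B <= A <= 5."""
    a = len(mode_to_space)
    classes = {}
    for mode, space in mode_to_space.items():
        classes.setdefault(space, []).append(mode)
    b = len(classes)
    if not 2 <= b <= a <= 5:
        raise ValueError(f"need 2 <= B <= A <= 5, got A={a}, B={b}")
    return classes


def select_mode(costs):
    """Pick the mode with the lowest cost (stand-in for the evaluation criterion)."""
    return min(costs, key=costs.get)


# Hypothetical configuration with A = 4 modes and B = 2 storage spaces.
mode_to_space = {
    "intra_prediction": "main",
    "block_matching": "main",
    "string_matching": "main",
    "palette_matching": "palette",
}
classes = classify_modes(mode_to_space)
# Hypothetical costs measured for the current sub-region.
best = select_mode({"intra_prediction": 41.0, "block_matching": 35.5,
                    "string_matching": 29.0, "palette_matching": 30.2})
```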

In the decoding method and apparatus of the present invention, the most basic characteristic feature is to determine A (2 ≤ A ≤ 5) predictive decoding modes and copy decoding modes, to determine B (2 ≤ B ≤ A ≤ 5) reference pixel sample storage spaces holding different forms of historical data, and to classify the A predictive decoding modes and copy decoding modes into B classes that correspond respectively to the B reference pixel sample storage spaces. The B reference pixel sample storage spaces of different forms may have different sizes so as to store different amounts of historical data; the one storing the most historical data is called the main reference pixel sample storage space, and the historical data in each other reference pixel sample storage space is a subset of the historical data in the main reference pixel sample storage space, but has a different form of representation and may be at a different degree of partial reconstruction (for example, color clustering may already have been applied, or the matching residual may or may not have been added). When the compressed video code stream data of the current decoding sub-region is decoded, each predictive decoding mode or copy decoding mode performs predictive decoding or copy decoding using the reference pixel sample storage space corresponding to its class. One of the A predictive decoding modes and copy decoding modes is selected to decode the current sub-region according to information read from the video code stream data, or according to that information together with the characteristics of the current decoding sub-region and its neighboring sub-regions. Regardless of which predictive decoding mode or copy decoding mode is selected to decode the current sub-region, the newly generated historical data is converted into the form of the main reference pixel sample storage space and placed in the main reference pixel sample storage space. It may also be converted into the form of another reference pixel sample storage space (possibly at another degree of partial reconstruction) and placed in that other reference pixel sample storage space. The historical data in the different reference pixel sample storage spaces needs to be kept synchronized.
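
A toy decoder illustrates how each mode reads from its own storage space while every result is written back to all spaces. It is a sketch under strong assumptions: only two spaces (a stacked main space and a palette), three hypothetical modes named `string_copy`, `palette_copy`, and `literal`, and non-overlapping string copies.

```python
def decode_subregion(mode, params, main_space, palette_space):
    """Reconstruct one sub-region, then update both reference spaces."""
    if mode == "string_copy":        # class using the main space
        offset, length = params
        if not 0 < length <= offset <= len(main_space):
            raise ValueError("copy must lie inside the stored history")
        start = len(main_space) - offset
        recon = main_space[start:start + length]
    elif mode == "palette_copy":     # class using the palette space
        recon = [palette_space[i] for i in params]
    elif mode == "literal":          # unmatched samples sent explicitly
        recon = list(params)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Synchronization: new history goes into the main space and, in
    # palette form, into the palette space as well.
    main_space.extend(recon)
    for pixel in recon:
        if pixel not in palette_space:
            palette_space.append(pixel)
    return recon


a, b, c = (0, 0, 0), (255, 255, 255), (9, 9, 9)
main_space, palette_space = [a, b], [a, b]
r1 = decode_subregion("string_copy", (2, 2), main_space, palette_space)
r2 = decode_subregion("palette_copy", [1, 1, 0], main_space, palette_space)
r3 = decode_subregion("literal", [c], main_space, palette_space)
r4 = decode_subregion("palette_copy", [2], main_space, palette_space)
```

After the literal pixel is decoded, the palette copy mode can reference it immediately, which is only possible because the palette space was updated together with the main space.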

According to an aspect of the invention, there is provided an image encoding method or apparatus comprising at least one of the steps or modules for performing the following functions and operations:

1) analyzing and evaluating the characteristics of the coding sub-region and/or its adjacent regions (such as pixel positions and/or color characteristics), and, based on the results of the analysis and evaluation, selecting and determining the coding mode of the coding sub-region according to predetermined evaluation criteria;

2) encoding the coding sub-region using the coding mode and its corresponding reference pixel sample storage space, and writing the coding result to the video code stream; the video code stream includes at least part or all of the information that the decoding method and apparatus need in order to determine the corresponding decoding mode and reference pixel sample storage space.

According to another aspect of the present invention, there is also provided an image decoding method or apparatus comprising at least one of the steps or modules for performing the following functions and operations:

1) parsing the video code stream and/or analyzing and evaluating the characteristics of the decoding sub-region and/or its adjacent regions (such as pixel positions and/or color characteristics), and selecting and determining the decoding mode of the decoding sub-region based on the results of the parsing, analysis, and evaluation;

2) decoding the decoding sub-region using the decoding mode and its corresponding reference pixel sample storage space to generate reconstructed pixels.

The main technical features of the present invention have been described above in several respects. Other advantages and effects of the present invention will be readily apparent to those skilled in the art from this disclosure. The present invention may be embodied or applied in various other specific embodiments, and various modifications and changes may be made without departing from the spirit and scope of the invention.

An embodiment of the encoding device of the present invention includes at least one of the following modules:

A prediction and matching coding module, configured as a plurality of coding processing units using different coding modes, wherein each coding processing unit is configured to process data using one of intra prediction coding, inter prediction coding, or a plurality of different matching coding modes;

An encoding storage module, configured as a plurality of storage units that store reconstructed pixel sample data in different forms;

An encoding data control module, configured to control the prediction and matching coding module to read data from one or more preset corresponding storage units in the encoding storage module; to control, according to the data read, the prediction and matching coding module to hand the data to be encoded to the corresponding coding processing unit; and to control the prediction and matching coding module to write the output data of the coding processing unit to one or more preset storage units of the encoding storage module.

An embodiment of the decoding device of the present invention includes at least one of the following modules:

A prediction and copy decoding module, configured as a plurality of decoding processing units using different decoding modes, wherein each decoding processing unit is configured to process data using one of intra prediction decoding, inter prediction decoding, or a plurality of different copy decoding modes;

A decoding storage module, configured as a plurality of storage units that store reconstructed pixel sample data in different forms;

A decoding data control module, configured to control the prediction and copy decoding module to read data from one or more preset corresponding storage units in the decoding storage module; to control, according to the data read, the prediction and copy decoding module to hand the data to be decoded to the corresponding decoding processing unit; and to control the prediction and copy decoding module to write the output data of the decoding processing unit to one or more preset storage units of the decoding storage module.

An embodiment of the encoding apparatus of the present invention, the schematic diagram of which is shown in Figure 1, consists of the following modules:

1) Predictive coding mode module, matching coding mode 1 module, matching coding mode 2 module, ..., matching coding mode A-1 module, where A is a positive integer satisfying 2 ≤ A ≤ 5: using the predictive (intra prediction, inter prediction, or both) coding mode, matching coding mode 1, matching coding mode 2, ..., and matching coding mode A-1, respectively, these modules perform a prediction or matching coding operation on a current coding sub-region of an input coding unit of the input video image. The output of the predictive coding mode module is the prediction mode, the inter-prediction motion vector, and the prediction residual, i.e. the difference between the original input pixel samples of the input video image and the predicted pixel samples. The output of each matching coding mode module is the matching mode, the matching position, the unmatched samples, and the matching residual. The matching position is a variable indicating where in the reference pixel sample storage space the reference pixel samples that match the currently encoded pixel samples of the current coding sub-region are located; the reference pixel samples that match the currently encoded pixel samples are referred to as matching pixel samples, and the positions of the matching pixel samples do not necessarily form one connected region in the reference pixel sample storage space but may also form separate regions. The unmatched samples are the input pixel samples for which no match is found in the reference pixel sample storage space according to a predetermined matching criterion; if the matching criterion of a matching coding mode is very loose and allows arbitrarily large matching errors, so that a match can always be found, that matching coding mode has no unmatched samples as output. The matching residual is the difference between the original input pixel samples and the matching pixel samples; if the matching criterion of a matching coding mode is absolutely exact lossless matching, the matching residual is zero, i.e. that matching coding mode has no matching residual as output; if the matching criterion of a matching coding mode is approximate lossy matching, the matching residual may be non-zero. Another case of lossy matching is to first pre-process the original input pixel samples by sample quantization, color quantization, or color-based pixel clustering and then perform the matching coding operation; in this case, because sample quantization, color quantization, and color-based pixel clustering are lossy, the matching residual (i.e. the difference between the original input pixel samples and the matching pixel samples) may be non-zero even if the matching coding operation itself is lossless;

2) Reference pixel sample storage space 1 module, reference pixel sample storage space 2 module, ..., reference pixel sample storage space B module, where B is a positive integer satisfying 2 ≤ B ≤ A ≤ 5: each reference pixel sample storage space stores, in its own form different from the others, the reconstructed pixel sample historical data generated in the encoding process, for use as the reference pixel samples required by the subsequent encoding and reconstruction operations; the B reference pixel sample storage spaces of different forms may have different sizes so as to store different amounts of historical data; the one storing the most historical data is called the main reference pixel sample storage space, and the historical data in each other reference pixel sample storage space is a subset of the historical data of the main reference pixel sample storage space, but has a different form of representation and may be at a different degree of partial reconstruction (for example, color clustering may already have been applied, or the matching residual may or may not have been added); corresponding to the B reference pixel sample storage spaces, the A coding modes of the predictive coding mode module, the matching coding mode 1 module, the matching coding mode 2 module, ..., and the matching coding mode A-1 module are likewise divided into B classes, each corresponding to one of the B reference pixel sample storage spaces; when the current coding sub-region is encoded, each predictive coding mode or matching coding mode performs predictive coding or matching coding using the reference pixel sample storage space corresponding to its class, and one of the A predictive coding modes and matching coding modes is selected according to predetermined evaluation criteria to encode the current coding sub-region;

3) The remaining common techniques of coding and reconstruction module: performs on its input parameters and variables the various remaining common techniques, such as transform, quantization, inverse transform, inverse quantization, compensation corresponding to prediction residuals and matching residuals (i.e., the inverse of the residual operation), prediction and residual computation, DPCM, first-order and higher-order differencing, mapping, run-length, index, deblocking filtering, and sample adaptive offset (Sample Adaptive Offset), together with the encoding and reconstruction operations and the entropy coding operation; the input of this module is the output of the modules of 1) and the original input pixel samples; the output of this module is the reconstructed pixels and the video code stream; the reconstructed pixels are converted into the form of the primary reference pixel sample storage space and placed in the primary reference pixel sample storage space of module 2), and may also be converted into the form of another reference pixel sample storage space (possibly at another stage of partial reconstruction) and placed in that other reference pixel sample storage space of module 2), for use as the reference pixel samples required by the various subsequent encoding and reconstruction operations; the video code stream is the final output of the encoding apparatus and contains all the syntax elements required by the decoding apparatus for decoding and reconstruction, in particular syntax elements such as the prediction mode, motion vector, matching mode, matching position, and unmatched samples;

4) Form conversion module of the pixel sample storage space: converts pixel samples from the form of one reference pixel sample storage space into the form of another; the reconstructed pixels generated and output by module 3) are converted by this module into the form of the primary reference pixel sample storage space and placed in the primary reference pixel sample storage space of module 2), and may also be converted by this module into the form of another reference pixel sample storage space (possibly at another stage of partial reconstruction) and placed in that other reference pixel sample storage space of module 2), for use as the reference pixel samples required by the various subsequent encoding and reconstruction operations; this module keeps the historical data in the different reference pixel sample storage spaces synchronized with each other.
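As a non-limiting illustration of the synchronization performed by module 4), the following minimal sketch keeps two reference stores in step. The class name `ReferenceStores`, the three component labels Y, U, V, and the rule that the secondary store keeps only the most recent pixels are assumptions made for this sketch; they are not specified above.

```python
from collections import deque


class ReferenceStores:
    """Two reference stores holding the same history in different forms (illustrative only)."""

    def __init__(self, secondary_capacity):
        # Primary store: one list per colour component, keeps the whole history.
        self.primary = {"Y": [], "U": [], "V": []}
        # Secondary store: interleaved (Y, U, V) tuples, keeps only the newest pixels,
        # so its content is always a subset of the primary history.
        self.secondary = deque(maxlen=secondary_capacity)

    def put(self, pixel):
        """Write one reconstructed pixel into both stores so that they stay synchronized."""
        y, u, v = pixel
        self.primary["Y"].append(y)
        self.primary["U"].append(u)
        self.primary["V"].append(v)
        self.secondary.append((y, u, v))

    def secondary_is_subset(self):
        """Return True if the secondary history equals the tail of the primary history."""
        n = len(self.secondary)
        start = len(self.primary["Y"]) - n
        tail = list(zip(*(self.primary[c][start:] for c in "YUV")))
        return tail == list(self.secondary)
```

Every reconstructed pixel passes through `put`, so neither store can be updated without the other; this is one simple way to obtain the synchronization property, not the only one.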

An embodiment of the decoding apparatus of the present invention, shown schematically in FIG. 2, consists of the following modules:

1) Code stream data parsing and partial decoding module: performs entropy decoding on the input video code stream, which contains the compressed data of the prediction mode, motion vector, matching mode, matching position, and unmatched samples of the currently decoded CU together with the compressed data of all other syntax elements, and parses the meaning of the various data obtained by entropy decoding; the parameters obtained by parsing and partial decoding (such as transform decoding, prediction and compensation, i.e., the inverse of the residual operation, DPCM decoding, first-order and higher-order differential decoding, mapping decoding, run-length decoding, and index decoding), such as the prediction mode, motion vector, matching mode, matching position, and unmatched samples, are sent to the predictive decoding mode module and the matching decoding mode modules; the entropy-decoded output data (i.e., the results of entropy decoding) of all other syntax elements, such as the prediction residual and the matching residual, is sent to the remaining common techniques of decoding and reconstruction module; in particular, based on the information parsed from the video code stream data, or on that information together with the current decoding sub-region and its neighboring sub-regions, this module selects the corresponding predictive decoding mode or matching decoding mode, sends the parameters and variables such as the prediction mode, motion vector, matching mode, matching position, and unmatched samples to the corresponding predictive decoding mode module or matching decoding mode module, and starts that module to decode the current decoding sub-region;

2) Predictive decoding mode module, matching decoding mode 1 module, matching decoding mode 2 module, ..., matching decoding mode A-1 module: A is a positive integer satisfying 2 ≤ A ≤ 5; the A decoding mode modules respectively use the predictive (intra prediction, inter prediction, or both) decoding mode, matching decoding mode 1, matching decoding mode 2, ..., and matching decoding mode A-1, A decoding modes in total, to perform the prediction or matching decoding operation on the current decoding sub-region; the input of the predictive decoding mode module is the prediction mode and, for inter prediction, the motion vector; the input of a matching decoding mode module is the matching mode, the matching position, and possibly unmatched samples; the matching position indicates from where in the reference pixel sample storage space the matching reference pixel samples are copied and pasted to the positions of the pixel samples currently being decoded in the current decoding sub-region (called the matched pixel samples); the matching reference pixel samples are called the matching pixel samples (evidently, the matched pixel samples are copies of the matching pixel samples, and the two are numerically equal); the matching pixel samples do not necessarily lie in one connected region of the reference pixel sample storage space and may also lie in several mutually separated regions of it; the unmatched samples are pixel samples parsed and decoded directly from the video code stream data and pasted to the positions of the pixel samples currently being decoded in the current decoding sub-region; unmatched samples are generally not present in the reference pixel sample storage space; the output of the A decoding mode modules is the predicted pixel samples, or the matched pixel samples (numerically equal to the matching pixel samples) plus the unmatched samples (present in some matching decoding modes); the matched pixel samples and any unmatched samples together constitute the complete matching decoding output of the currently decoded CU;

3) Reference pixel sample storage space 1 module, reference pixel sample storage space 2 module, ..., reference pixel sample storage space B module: B is a positive integer satisfying 2 ≤ B ≤ A ≤ 5; each reference pixel sample storage space stores, in its own form different from the others, the historical data of reconstructed pixel samples generated during decoding, for use as the reference pixel samples required by the various subsequent decoding and reconstruction operations; reference pixel sample storage spaces of different forms may have different sizes and therefore store different amounts of historical data; the one storing the most historical data is called the primary reference pixel sample storage space, and the historical data in each of the other reference pixel sample storage spaces is a subset of the historical data in the primary reference pixel sample storage space, but has a different form of representation and may be at a different stage of partial reconstruction (for example, a matching residual has or has not been added); corresponding to the B reference pixel sample storage spaces, the A decoding modes of the predictive decoding mode module, the matching decoding mode 1 module, the matching decoding mode 2 module, ..., and the matching decoding mode A-1 module are likewise divided into B classes, one class for each of the B reference pixel sample storage spaces; when the current decoding sub-region is decoded, each predictive decoding mode or matching decoding mode performs predictive decoding or matching decoding using the reference pixel sample storage space corresponding to its class, and each reference pixel sample storage space provides reference pixel samples to all the decoding modes (possibly more than one) of its corresponding class;

4) The remaining common techniques of decoding and reconstruction module: performs on the current decoding sub-region the various remaining common techniques, such as inverse transform, inverse quantization, compensation corresponding to prediction residuals and matching residuals (i.e., the inverse of the residual operation), prediction and compensation (i.e., the inverse of the residual operation), DPCM, first-order and higher-order differencing, mapping, run-length, index, deblocking filtering, and sample adaptive offset (Sample Adaptive Offset), together with the decoding and reconstruction operations; the input of this module is the entropy-decoded output data of all other syntax elements, such as the prediction residual and the matching residual, output by module 1), and the output of module 2), i.e., the predicted pixel samples or the matched pixel samples plus any unmatched samples; the output of this module is the reconstructed pixels (including fully reconstructed pixels and pixels at various stages of partial reconstruction); the reconstructed pixels are converted into the form of the primary reference pixel sample storage space and placed in the primary reference pixel sample storage space of module 3), and may also be converted into the form of another reference pixel sample storage space (possibly at another stage of partial reconstruction) and placed in that other reference pixel sample storage space of module 3), for use as the reference pixel samples required by the various subsequent decoding and reconstruction operations; the fully reconstructed pixels are also the final output of the decoding apparatus;

5) Form conversion module of the pixel sample storage space: converts pixel samples from the form of one of the B reference pixel sample storage spaces into the form of another; the reconstructed pixels generated and output by module 4) must be converted by this module into the form of the primary reference pixel sample storage space and placed in the primary reference pixel sample storage space of module 3), and may also be converted by this module into the form of another reference pixel sample storage space (possibly at another stage of partial reconstruction) and placed in that other reference pixel sample storage space of module 3), for use as the reference pixel samples required by the various subsequent decoding and reconstruction operations; this module keeps the historical data in the different reference pixel sample storage spaces synchronized with each other.

The illustrations provided above merely illustrate the basic concept of the present invention schematically; the drawings show only the components directly related to the present invention and are not drawn according to the number, shape, and size of the components in an actual implementation, in which the form, quantity, and proportion of each component may vary arbitrarily and the component layout may be more complex.

The following are more implementation details and variations of the invention.

Implementation or variant example 1: Embodiment with four coding modes and two different forms of reference pixel sample storage space

The positive integer A is equal to 4, and the four coding mode modules are the predictive coding mode module, the block matching coding mode module, the string matching coding mode module, and the palette matching coding mode module; the positive integer B is equal to 2, and the two reference pixel sample storage spaces are a reference pixel sample storage space in the form of a planar format and a 2-dimensional array space (called the planar 2-dimensional reference space) and a reference pixel sample storage space in the form of a stacked format and a 1-dimensional array space (called the stacked 1-dimensional reference space); the primary reference pixel sample storage space is the planar 2-dimensional reference space, and the historical data in the stacked 1-dimensional reference space is a subset of the historical data in the planar 2-dimensional reference space, but has a different form of representation and may be at a different stage of partial reconstruction (for example, a matching residual has or has not been added); the four coding modes are correspondingly divided into two classes: the predictive coding mode and the block matching coding mode are both class 1 coding modes and correspond to the planar 2-dimensional reference space, while the string matching coding mode and the palette matching coding mode are both class 2 coding modes and correspond to the stacked 1-dimensional reference space; when the current coding sub-region is encoded, the predictive coding mode uses the planar 2-dimensional reference space for predictive coding, the block matching coding mode also uses the planar 2-dimensional reference space for block matching coding, the string matching coding mode uses the stacked 1-dimensional reference space for string matching coding, and the palette matching coding mode also uses the stacked 1-dimensional reference space for palette matching coding; the planar 2-dimensional reference space provides reference pixel samples for predictive coding and block matching coding, and the stacked 1-dimensional reference space provides reference pixel samples for string matching coding and palette matching coding;

The block matching coding mode performs matching coding in units of blocks of a certain size (such as 64x64 samples, 32x32 samples, 16x16 samples, 8x8 samples, 8x4 samples, 4x8 samples, 4x4 samples, etc.), called matched blocks, whose position within a frame of the image can be represented by a 2-dimensional coordinate; the matching pixel samples form a matching block in the planar 2-dimensional reference space, whose position within a frame of the image can also be represented by a 2-dimensional coordinate; thus, in the block matching coding mode, the matching position can be represented by the difference between the 2-dimensional coordinate of the matching block and the 2-dimensional coordinate of the matched block, which is called the displacement vector;
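As a non-limiting illustration, the displacement vector of block matching can be sketched as follows for a single colour plane stored as a list of rows. The function names, the (x, y) coordinate order, and the use of the top-left corner as the block position are assumptions of this sketch.

```python
def displacement_vector(matching_xy, matched_xy):
    """Displacement vector = coordinate of the matching block minus that of the matched block."""
    return (matching_xy[0] - matched_xy[0], matching_xy[1] - matched_xy[1])


def copy_matching_block(plane, matched_xy, dv, width, height):
    """Fetch the matching block addressed by `dv` from a 2-D plane indexed as plane[y][x]."""
    ref_x = matched_xy[0] + dv[0]  # top-left x of the matching block
    ref_y = matched_xy[1] + dv[1]  # top-left y of the matching block
    if ref_x < 0 or ref_y < 0 or ref_y + height > len(plane) or ref_x + width > len(plane[0]):
        raise ValueError("matching block lies outside the reference space")
    return [row[ref_x:ref_x + width] for row in plane[ref_y:ref_y + height]]
```

For example, with a 4x4 plane and a 2x2 matched block at (2, 2), the displacement vector (-2, -2) addresses the 2x2 block at the top-left corner of the plane.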

The string matching coding mode arranges the pixel samples of the current coding sub-region into a 1-dimensional pixel sample string in the stacked format according to a predetermined linear scan order, and performs matching coding in units of variable-length pixel sample strings (called matched strings, whose position can be represented either by a 2-dimensional coordinate or by a linear address); the matching pixel samples form a matching string in the stacked 1-dimensional reference space, whose position can likewise be represented either by a 2-dimensional coordinate or by a linear address; thus, in the string matching coding mode, the matching position can be represented either by the difference between the 2-dimensional coordinate of the matching string and that of the matched string, or by the difference between the linear address of the matching string and that of the matched string, both generally called the displacement vector; because the length of the matching string (equal to the length of the matched string) is variable, another variable, called the matching length, is needed together with the displacement vector, i.e., the pair (displacement vector, matching length), to represent the matching position completely; the result of string matching coding of the currently encoded CU is M (M ≥ 1) matching strings and N (N ≥ 0) unmatched pixel samples, output as M pairs (displacement vector, matching length) and N unmatched pixel samples;
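As a non-limiting illustration, string matching with linear addresses can be sketched as a greedy longest-match search. The exhaustive search, the `min_len` threshold, the tuple layout of the output, and the restriction to exact (lossless) matches are assumptions of this sketch; unlike the description above, it may also emit zero matching strings when nothing matches.

```python
def string_match_encode(history, current, min_len=2):
    """Greedy string matching over a 1-D reference buffer.

    Emits ("match", displacement, length) or ("unmatched", sample), where
    displacement = linear address of the matching string minus linear address
    of the matched string (always negative here).
    """
    buf = list(history)  # 1-D reference space addressed linearly
    out = []
    i = 0
    while i < len(current):
        best_len, best_disp = 0, 0
        for start in range(len(buf)):
            length = 0
            while (i + length < len(current)
                   and start + length < len(buf)
                   and buf[start + length] == current[i + length]):
                length += 1
            if length > best_len:
                best_len, best_disp = length, start - len(buf)
        if best_len >= min_len:
            out.append(("match", best_disp, best_len))
            buf.extend(current[i:i + best_len])  # coded samples become reference samples
            i += best_len
        else:
            out.append(("unmatched", current[i]))
            buf.append(current[i])
            i += 1
    return out
```

With `history = [1, 2, 3, 4]` and `current = [2, 3, 4, 9, 2, 3]`, the sketch emits a match of length 3 at displacement -3, the unmatched sample 9, and a match of length 2 at displacement -7.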

The palette matching coding mode uses only some of the pixels in the stacked 1-dimensional reference space (generally including a number of pixel samples within the current CU) as reference pixels; a group of K pixels (usually 4 ≤ K ≤ 64) is therefore selected from the stacked 1-dimensional reference space, and updated, according to a predetermined method; these K pixels form a palette, and each pixel in the palette is represented by an index; the palette matching coding mode uses the pixels of the palette as reference pixels, and the matching position of a matching pixel sample is its index in the palette; the indices of all the matching pixel samples of the currently encoded CU form an index array, called the index map;
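As a non-limiting illustration, a palette and its index map can be sketched as follows. The predetermined palette selection method is not specified above; this sketch assumes the K most frequent pixels of the current CU are chosen, and it assumes that a pixel absent from the palette is marked with the index -1 and output as an unmatched sample.

```python
from collections import Counter


def palette_encode(cu_pixels, k):
    """Build a K-entry palette and an index map for the pixels of one CU."""
    counts = Counter(cu_pixels)
    palette = [pixel for pixel, _ in counts.most_common(k)]  # assumed selection rule
    index_of = {pixel: i for i, pixel in enumerate(palette)}
    index_map, unmatched = [], []
    for pixel in cu_pixels:
        if pixel in index_of:
            index_map.append(index_of[pixel])  # matching position = palette index
        else:
            index_map.append(-1)               # assumed marker for an unmatched sample
            unmatched.append(pixel)
    return palette, index_map, unmatched
```

A CU of computer-generated content with few distinct colours then reduces to a short palette plus one small index per pixel.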

One function of the remaining common techniques of coding and reconstruction module is to apply various common operations, such as transform, prediction and residual computation, DPCM, first-order and higher-order differencing, mapping, run-length, and index coding, to parameters and variables such as the matching mode, displacement vector, matching length, index map, and unmatched samples;

The form conversion module of the pixel sample storage space converts pixels between the following two forms of representation: 1) the form of a planar format and a 2-dimensional array space, and 2) the form of a stacked format and a 1-dimensional array space;
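As a non-limiting illustration, the conversion between the two forms can be sketched as follows. The sketch assumes that the planar format stores each colour component in its own 2-dimensional array, that the stacked format interleaves the components of each pixel into one 1-dimensional sequence, that all components have the same resolution, and that the linear scan order is raster order; the function names are hypothetical.

```python
def planar_to_stacked(planes):
    """Convert a list of equally sized 2-D component planes into a 1-D list of pixel tuples."""
    height, width = len(planes[0]), len(planes[0][0])
    return [tuple(p[y][x] for p in planes) for y in range(height) for x in range(width)]


def stacked_to_planar(stacked, width, num_components=3):
    """Convert a 1-D list of pixel tuples (raster order) back into 2-D component planes."""
    height = len(stacked) // width
    return [
        [[stacked[y * width + x][c] for x in range(width)] for y in range(height)]
        for c in range(num_components)
    ]
```

The two functions are inverses of each other, which is what allows both reference spaces to hold the same historical data in different forms.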

The palette matching coding mode module and the palette are optional and may be omitted, in which case, in the encoding apparatus, the positive integer A is equal to 3 and the three coding mode modules are the predictive coding mode module, the block matching coding mode module, and the string matching coding mode module;

The string matching coding mode module is optional and may be omitted, in which case, in the encoding apparatus, the positive integer A is equal to 3 and the three coding mode modules are the predictive coding mode module, the block matching coding mode module, and the palette matching coding mode module.

Implementation or variant example 2: Embodiment with four decoding modes and two different forms of reference pixel sample storage space

The positive integer A is equal to 4, and the four decoding mode modules are the predictive decoding mode module, the block matching decoding mode module, the string matching decoding mode module, and the palette matching decoding mode module; the positive integer B is equal to 2, and the two reference pixel sample storage spaces are a reference pixel sample storage space in the form of a planar format and a 2-dimensional array space (called the planar 2-dimensional reference space) and a reference pixel sample storage space in the form of a stacked format and a 1-dimensional array space (called the stacked 1-dimensional reference space); the primary reference pixel sample storage space is the planar 2-dimensional reference space, and the historical data in the stacked 1-dimensional reference space is a subset of the historical data in the planar 2-dimensional reference space, but has a different form of representation and may be at a different stage of partial reconstruction (for example, a matching residual has or has not been added); the four decoding modes are correspondingly divided into two classes: the predictive decoding mode and the block matching decoding mode are both class 1 decoding modes and correspond to the planar 2-dimensional reference space, while the string matching decoding mode and the palette matching decoding mode are both class 2 decoding modes and correspond to the stacked 1-dimensional reference space; when the currently decoded CU is decoded, the predictive decoding mode uses the planar 2-dimensional reference space for predictive decoding, the block matching decoding mode also uses the planar 2-dimensional reference space for block matching decoding, the string matching decoding mode uses the stacked 1-dimensional reference space for string matching decoding, and the palette matching decoding mode also uses the stacked 1-dimensional reference space for palette matching decoding; the planar 2-dimensional reference space provides reference pixel samples for predictive decoding and block matching decoding, and the stacked 1-dimensional reference space provides reference pixel samples for string matching decoding and palette matching decoding;

The block matching decoding mode performs matching decoding in units of blocks of a certain size (such as 64x64 samples, 32x32 samples, 16x16 samples, 8x8 samples, 8x4 samples, 4x8 samples, 4x4 samples, etc.), called matched blocks, whose position within a frame of the image can be represented by a 2-dimensional coordinate; the matching pixel samples form a matching block in the planar 2-dimensional reference space, whose position within a frame of the image can also be represented by a 2-dimensional coordinate; thus, in the block matching decoding mode, the matching position can be represented by the difference between the 2-dimensional coordinate of the matching block and the 2-dimensional coordinate of the matched block, which is called the displacement vector;

The string matching decoding mode arranges the pixel samples of the currently decoded CU into a 1-dimensional pixel sample string in the stacked format according to a predetermined linear scan order, and performs matching decoding in units of variable-length pixel sample strings (called matched strings, whose position can be represented either by a 2-dimensional coordinate or by a linear address); the matching pixel samples form a matching string in the stacked 1-dimensional reference space, whose position can likewise be represented either by a 2-dimensional coordinate or by a linear address; thus, in the string matching decoding mode, the matching position can be represented either by the difference between the 2-dimensional coordinate of the matching string and that of the matched string, or by the difference between the linear address of the matching string and that of the matched string, both generally called the displacement vector; because the length of the matching string (equal to the length of the matched string) is variable, another variable, called the matching length, is needed together with the displacement vector, i.e., the pair (displacement vector, matching length), to represent the matching position completely; the input for string matching decoding of the currently decoded CU is the matching mode, M (M ≥ 1) pairs (displacement vector, matching length), and N (N ≥ 0) unmatched pixel samples obtained by parsing and decoding the video code stream data;
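As a non-limiting illustration, string matching decoding with linear addresses can be sketched as follows. The token layout ("match", displacement, length) and ("unmatched", sample), and the convention that the displacement is the linear address of the matching string minus that of the matched string, are assumptions of this sketch.

```python
def string_match_decode(history, tokens):
    """Rebuild the samples of one CU from (displacement vector, matching length) pairs
    and unmatched samples, using a 1-D reference buffer addressed linearly."""
    buf = list(history)
    first = len(buf)  # linear address of the first sample of the current CU
    for token in tokens:
        if token[0] == "match":
            _, displacement, length = token
            src = len(buf) + displacement
            if displacement >= 0 or src < 0:
                raise ValueError("displacement points outside the reference space")
            for k in range(length):
                # Sample-by-sample copy, so a match may overlap the samples being written.
                buf.append(buf[src + k])
        else:
            buf.append(token[1])  # unmatched sample taken directly from the code stream
    return buf[first:]
```

Copying one sample at a time means each decoded sample immediately becomes available as a reference sample for the rest of the same string.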

The palette matching decoding mode uses only some of the pixels in the stacked 1-dimensional reference space (generally including a number of pixel samples within the current CU) as reference pixels; a group of K pixels (usually 4 ≤ K ≤ 64) is therefore selected from the stacked 1-dimensional reference space, and updated, according to a predetermined method; these K pixels form a palette, and each pixel in the palette is represented by an index; the palette matching decoding mode uses the pixels of the palette as reference pixels, and the matching position of a matching pixel sample is its index in the palette; the indices of all the matching pixel samples of the currently decoded CU form an index array, called the index map; the index map is obtained by parsing and decoding the video code stream data and is one of the inputs of palette matching decoding;
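As a non-limiting illustration, palette matching decoding can be sketched as a table lookup driven by the index map. The use of the index -1 to mark a position filled from the unmatched samples is an assumption of this sketch, as is the function name.

```python
def palette_decode(palette, index_map, unmatched=()):
    """Rebuild the pixels of one CU from a palette and an index map."""
    remaining = iter(unmatched)
    pixels = []
    for index in index_map:
        if index == -1:
            pixels.append(next(remaining))  # assumed marker: take the next unmatched sample
        else:
            pixels.append(palette[index])   # matching position is the palette index
    return pixels
```

The decoder never searches the reference space in this mode; each index directly selects one of the K palette pixels.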

One function of the code stream data parsing and partial decoding module is to perform, on the partially decoded data of the syntax elements that represent parameters such as the matching mode, displacement vector, matching length, index map, and unmatched samples obtained by entropy decoding from the video code stream, various common operations such as transform decoding, prediction and compensation (i.e., the inverse of the residual operation), DPCM decoding, first-order and higher-order differential decoding, mapping decoding, run-length decoding, and index decoding, so as to obtain the original parameters and variables such as the matching mode, displacement vector, matching length, index map, and unmatched samples, which serve as the input of the block matching decoding mode module, the string matching decoding mode module, and the palette matching decoding mode module;

The palette matching decoding mode module and the palette are optional and may be omitted, in which case, in the decoding apparatus, the positive integer A is equal to 3 and the three decoding mode modules are the predictive decoding mode module, the block matching decoding mode module, and the string matching decoding mode module;

The string matching decoding mode module is optional and may be omitted, in which case, in the decoding apparatus, the positive integer A is equal to 3 and the three decoding mode modules are the predictive decoding mode module, the block matching decoding mode module, and the palette matching decoding mode module.

Implementation or variant example 3: Embodiment of a video code stream containing a prediction and matching mode identification code and other coding results, and of the classification of prediction and matching modes

The coding unit (CU) part of the video code stream consists of syntax elements carrying the following information:

CU header, prediction and matching mode identification code, prediction mode or matching mode, motion vector 1 or matching position 1, unmatched pixel sample 1, motion vector 2 or matching position 2, unmatched pixel sample 2, ..., more motion vectors or matching positions, more unmatched pixel samples, prediction residuals or matching residuals, other coding results;

Except for the CU header syntax element, the order in which the other syntax elements are placed in the code stream is not unique, and any predetermined order may be used; any syntax element may also be split into several parts, which may be placed together at one location in the code stream or at different locations in the code stream; any number of syntax elements may also be merged into one syntax element; except for the CU header syntax element and the prediction and matching mode identification code syntax element, any of the other syntax elements may be absent from the compressed code stream data of a given CU;

The prediction and matching mode identification code may take the following code values and have the following semantics:

Code value	Semantics
0	The current encoding or decoding CU (referred to as the current CU) uses the predictive coding or decoding mode
1	The current encoding or decoding CU (referred to as the current CU) uses matching coding or decoding mode 1
2	The current encoding or decoding CU (referred to as the current CU) uses matching coding or decoding mode 2
...	...
A-1	The current encoding or decoding CU (referred to as the current CU) uses matching coding or decoding mode A-1

The classification of the A coding modes or A decoding modes is defined by a predetermined A-element array ClassOfCoding[A]; each value j = ClassOfCoding[i] (where i satisfies 0 ≤ i < A) ranges from 1 to B and identifies one of the B reference pixel sample storage spaces; thus, from the code value of the prediction and matching mode identification code of the current CU and the A-element array ClassOfCoding[A], the reference pixel sample storage space with which the current CU is encoded or decoded can be determined by the following conditions:

If (the code value of the prediction and matching mode identification code == 0), the current CU is encoded or decoded using the reference pixel sample storage space j, where j = ClassOfCoding[0];

If (the code value of the prediction and matching mode identification code == 1), the current CU is encoded or decoded using the reference pixel sample storage space j, where j = ClassOfCoding[1];

If (the code value of the prediction and matching mode identification code == 2), the current CU is encoded or decoded using the reference pixel sample storage space j, where j = ClassOfCoding[2];

..........................................

If (the code value of the prediction and matching mode identification code == i), the current CU is encoded or decoded using the reference pixel sample storage space j, where j = ClassOfCoding[i];

..........................................

If (the code value of the prediction and matching mode identification code == A-1), the current CU is encoded or decoded using the reference pixel sample storage space j, where j = ClassOfCoding[A-1];

The above symbol "==" means "equal to".
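As a non-limiting illustration, the conditions above amount to a single array lookup. In the following sketch the function names are hypothetical, and the consistency check simply restates the constraints 2 ≤ B ≤ A ≤ 5 and 1 ≤ ClassOfCoding[i] ≤ B given in this description.

```python
def check_class_of_coding(class_of_coding, b):
    """Validate an A-element classification array against B reference storage spaces."""
    a = len(class_of_coding)
    if not 2 <= b <= a <= 5:
        raise ValueError("expected 2 <= B <= A <= 5")
    if any(not 1 <= j <= b for j in class_of_coding):
        raise ValueError("each ClassOfCoding[i] must lie in 1..B")


def reference_space_for(mode_code, class_of_coding):
    """Return j = ClassOfCoding[i] for the mode identification code value i."""
    if not 0 <= mode_code < len(class_of_coding):
        raise ValueError("mode identification code out of range")
    return class_of_coding[mode_code]
```

Because the encoder and the decoder share the same predetermined array, the identification code alone is enough for both sides to agree on the storage space.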

Implementation or variant example 4: Embodiment 1 of a video code stream containing a prediction and matching mode identification code and other coding results, and of two classes of coding or decoding modes

CU header, prediction and matching mode identification code, prediction mode or matching mode, motion vector 1 or displacement vector 1 or (displacement vector 1, matching length 1) or index map 1, unmatched pixel sample 1, motion vector 2 or displacement vector 2 or (displacement vector 2, matching length 2) or index map 2, unmatched pixel sample 2, ..., more motion vectors or displacement vectors or (displacement vector, matching length) pairs or index maps, more unmatched pixel samples, prediction residuals or matching residuals, other coding results;

Except for the CU header syntax element, the order in which the other syntax elements are placed in the code stream is not unique, and any predetermined order may be used; any syntax element may also be split into several parts, which may be placed together at one location in the code stream or at different locations in the code stream; any number of syntax elements may also be merged into one syntax element; except for the CU header syntax element and the prediction and matching mode identification code syntax element, any of the other syntax elements may be absent from the video code stream data of a given CU;

Code value	Semantics
0	The current encoding or decoding CU (referred to as the current CU) uses the predictive coding or decoding mode
1	The current encoding or decoding CU (referred to as the current CU) uses the block matching coding or decoding mode
2	The current encoding or decoding CU (referred to as the current CU) uses the string matching coding or decoding mode
3	The current encoding or decoding CU (referred to as the current CU) uses the palette matching coding or decoding mode

The classification of the four coding modes or four decoding modes is defined by a predetermined 4-element array ClassOfCoding[4], whose values ClassOfCoding[i] (where i satisfies 0 ≤ i < 4) are:

ClassOfCoding[0] = ClassOfCoding[1] = 1,

ClassOfCoding[2] = ClassOfCoding[3] = 2,

The values 1 and 2 correspond to reference pixel sample storage space 1 (i.e., the planar 2-dimensional reference space) and reference pixel sample storage space 2 (i.e., the stacked 1-dimensional reference space), respectively; thus, from the code value of the prediction and matching mode identification code of the current CU and the 4-element array ClassOfCoding[4], the reference pixel sample storage space with which the current CU is encoded or decoded can be determined by the following condition:

If (the code value of the prediction and matching mode identification code == i), the current CU is encoded or decoded using the reference pixel sample storage space j, where 0 ≤ i < 4 and j = ClassOfCoding[i];

This judgment condition can also be expanded and written as:

If (the code value of the prediction and matching mode identification code == 0), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 1), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 2), the current CU is encoded or decoded using the stacked 1-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 3), the current CU is encoded or decoded using the stacked 1-dimensional reference space.
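The judgment condition above amounts to a table lookup. The following is a minimal sketch in Python, given only for illustration; the array values are those stated in the text, while the names CLASS_OF_CODING, PLANAR_2D, STACKED_1D and reference_space_for are hypothetical and do not come from the specification.

```python
# Illustrative sketch only; identifiers are hypothetical, values are from the text.
PLANAR_2D = 1    # reference pixel sample storage space 1 (planar 2-dimensional)
STACKED_1D = 2   # reference pixel sample storage space 2 (stacked 1-dimensional)

# ClassOfCoding[0] = ClassOfCoding[1] = 1, ClassOfCoding[2] = ClassOfCoding[3] = 2
CLASS_OF_CODING = [PLANAR_2D, PLANAR_2D, STACKED_1D, STACKED_1D]


def reference_space_for(mode_code: int) -> int:
    """Return j = ClassOfCoding[i] for the mode identification code value i."""
    if not 0 <= mode_code < len(CLASS_OF_CODING):
        raise ValueError(f"invalid mode identification code value: {mode_code}")
    return CLASS_OF_CODING[mode_code]
```

For instance, reference_space_for(2) selects storage space 2 for a string matching CU, which agrees with the expanded conditions listed above.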

Implementation or Variation Example 5 Embodiment 2 of a video code stream containing a prediction and matching mode identification code and other coding results, and of the 2 classes of coding or decoding modes

CU header, prediction and matching mode identification code, prediction mode or matching mode, motion vector 1 or displacement vector 1 or (displacement vector 1, matching length 1), unmatched pixel sample 1, motion vector 2 or displacement vector 2 or (displacement vector 2, matching length 2), unmatched pixel sample 2, .........., more motion vectors or displacement vectors or (displacement vector, matching length) pairs, more unmatched pixel samples, prediction residuals or matching residuals, other coding results;

Code value	Semantics
0	The CU currently being encoded or decoded (the current CU) uses the predictive encoding or decoding mode
1	The CU currently being encoded or decoded (the current CU) uses the block matching encoding or decoding mode
2	The CU currently being encoded or decoded (the current CU) uses the string matching encoding or decoding mode

The division of the three encoding modes (or the three decoding modes) into classes is defined by a predetermined 3-element array ClassOfCoding[3], where the value of each ClassOfCoding[i] (0 ≤ i < 3) is:

ClassOfCoding[0] = ClassOfCoding[1] = 1,

ClassOfCoding[2] = 2.

The values 1 and 2 correspond to reference pixel sample storage space 1 (the planar 2-dimensional reference space) and reference pixel sample storage space 2 (the stacked 1-dimensional reference space), respectively. From the code value of the prediction and matching mode identification code of the current CU and the 3-element array ClassOfCoding[3], the reference pixel sample storage space that the current CU uses for encoding or decoding can therefore be determined by the following judgment condition:

If (the code value of the prediction and matching mode identification code == i), the current CU is encoded or decoded using reference pixel sample storage space j, where 0 ≤ i < 3 and j = ClassOfCoding[i];

This judgment condition can also be expanded and written as:

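If (the code value of the prediction and matching mode identification code == 0), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 1), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 2), the current CU is encoded or decoded using the stacked 1-dimensional reference space.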

Implementation or Variation Example 6 Embodiment 3 of a video code stream containing a prediction and matching mode identification code and other coding results, and of the 2 classes of coding or decoding modes

CU header, prediction and matching mode identification code, prediction mode or matching mode, motion vector 1 or displacement vector 1 or index map 1, unmatched pixel sample 1, motion vector 2 or displacement vector 2 or index map 2, unmatched pixel sample 2, .........., more motion vectors or displacement vectors or index maps, more unmatched pixel samples, prediction residuals or matching residuals, other coding results;

Code value	Semantics
0	The CU currently being encoded or decoded (the current CU) uses the predictive encoding or decoding mode
1	The CU currently being encoded or decoded (the current CU) uses the block matching encoding or decoding mode
2	The CU currently being encoded or decoded (the current CU) uses the palette matching encoding or decoding mode

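The division of the three encoding modes (or the three decoding modes) into classes is defined by a predetermined 3-element array ClassOfCoding[3], where the value of each ClassOfCoding[i] (0 ≤ i < 3) is:

ClassOfCoding[0] = ClassOfCoding[1] = 1,

ClassOfCoding[2] = 2.

The values 1 and 2 correspond to reference pixel sample storage space 1 (the planar 2-dimensional reference space) and reference pixel sample storage space 2 (the stacked 1-dimensional reference space), respectively, so the reference pixel sample storage space j used by the current CU follows from the code value i of its prediction and matching mode identification code as j = ClassOfCoding[i], where 0 ≤ i < 3.

This judgment condition can also be expanded and written as:

If (the code value of the prediction and matching mode identification code == 0), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 1), the current CU is encoded or decoded using the planar 2-dimensional reference space;

If (the code value of the prediction and matching mode identification code == 2), the current CU is encoded or decoded using the stacked 1-dimensional reference space.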

Implementation or Variation Example 7 Reference Pixel Sample Storage Space in the Form of a Planar Format and a 2D Array Space

The reference pixel sample storage space in the planar format and 2-dimensional array form is composed of the pixel samples of two frames of images, placed in the following order:

The Y component of the first row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The Y component of the second row of pixels of the L-1 frame image is placed in the scanning order from left to right.

..........................................

The Y component of the last row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The U component of the first row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The U component of the second row of pixels of the L-1 frame image is placed in the scanning order from left to right.

..........................................

The U component of the last row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The V component of the first row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The V component of the second row of pixels of the L-1 frame image is placed in the scanning order from left to right.

..........................................

The V component of the last row of pixels of the L-1 frame image is placed in the scanning order from left to right.

The Y component of the first row of pixels of the Lth frame image is placed in the scanning order from left to right.

The Y component of the second row of pixels of the Lth frame image is placed in the scanning order from left to right.

..........................................

The Y component of the last row of pixels of the Lth frame image is placed in the scanning order from left to right.

The U component of the first row of pixels of the Lth frame image is placed in the scanning order from left to right.

The U component of the second row of pixels of the Lth frame image is placed in the scanning order from left to right.

..........................................

The U component of the last row of pixels of the Lth frame image is placed in the scanning order from left to right.

The V component of the first row of pixels of the Lth frame image is placed in the scanning order from left to right.

The V component of the second row of pixels of the Lth frame image is placed in the scanning order from left to right.

..........................................

The V component of the last row of pixels of the Lth frame image is placed in the scanning order from left to right.

Here the Lth frame image is the frame currently being encoded or decoded, and the L-1 frame image is the frame that precedes the Lth frame image in encoding or decoding order.
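The placement order above can be expressed as an address calculation. The following Python sketch is illustrative only. It assumes that Y, U and V all have the full frame resolution (one sample of each component per pixel, as the row-by-row description suggests) and that the 2-dimensional array has one array row per component row; the function name planar_position and its parameters are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and the layout assumes
# full-resolution Y, U and V planes (one sample of each per pixel).
def planar_position(frame_slot: int, component: int, y: int, x: int,
                    height: int, width: int) -> tuple[int, int]:
    """Return (array_row, array_column) of one sample in the planar 2-D space.

    frame_slot: 0 for the L-1 frame image, 1 for the Lth frame image
    component:  0 = Y, 1 = U, 2 = V
    y, x:       pixel row and column inside the frame
    """
    if frame_slot not in (0, 1) or component not in (0, 1, 2):
        raise ValueError("frame_slot must be 0 or 1, component must be 0, 1 or 2")
    if not (0 <= y < height and 0 <= x < width):
        raise ValueError("pixel lies outside the frame")
    # Planes follow one another: Y, U, V of frame L-1, then Y, U, V of frame L.
    plane = frame_slot * 3 + component
    return plane * height + y, x
```

With a frame height of 4 rows, for example, the first U row of the L-1 frame lands in array row 4, directly after the four Y rows.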

Implementation or Variation Example 8 Example of a reference pixel sample storage space in the stacked format and 1-dimensional array form

The reference pixel sample storage space in the stacked format and 1-dimensional array form is composed of the pixel samples of 1 frame of image. The frame is first divided into blocks of 64×64 pixels (in the stacked format, 64×64×3 pixel samples), and each block is given a sequence number according to the raster scan order of the blocks: the block in the upper left corner has sequence number 1, the block to the right of block 1 has sequence number 2, the block to the right of block 2 has sequence number 3, and so on. If the block with sequence number X is the rightmost block of its row, so that there is no block to its right, the leftmost block in the row directly below it has sequence number X+1, the block to the right of block X+1 has sequence number X+2, the block to the right of block X+2 has sequence number X+3, and so on, until every block of the frame has a sequence number. All pixels are then placed in the following order to form a 1-dimensional array space:

The pixels of the first column of the block with sequence number 1, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

The pixels of the second column of the block with sequence number 1, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

..........................................

The pixels of the 64th column of the block with sequence number 1, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

The pixels of the first column of the block with sequence number 2, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

The pixels of the second column of the block with sequence number 2, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

..........................................

The pixels of the 64th column of the block with sequence number 2, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

..........................................

The pixels of the first column of the block with the largest sequence number, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

The pixels of the second column of the block with the largest sequence number, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

..........................................

The pixels of the 64th column of the block with the largest sequence number, arranged from top to bottom within the column, with the samples of each pixel arranged in the order Y, U, V.

In the above embodiment, the blocks are arranged in row order and the pixels within a block are arranged in a column scan format. Further embodiments can be obtained by changing the row order to another order (such as column order or quadtree order of depth D), or by changing the column scan format to another scan format (such as row scan, zigzag scan, Z scan, or quadtree scan of depth D).

In the above embodiment, the size of the block is 64 x 64 pixels. More embodiments can be obtained by changing the size of the block, such as 32x32 or 16x16 or 8x8 or 8x4 or 4x8 or 4x4 pixels.
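The stacked 1-dimensional arrangement with 64×64 blocks can likewise be written as an index calculation. The Python sketch below is illustrative only and assumes that the frame width is a multiple of 64; the name stacked_index and its parameters are hypothetical, and block sequence numbers are 0-based in the code whereas the text counts from 1.

```python
# Illustrative sketch only; names are hypothetical. Assumes the frame width
# is a multiple of the block size.
BLOCK = 64  # block size of this embodiment, in pixels


def stacked_index(x: int, y: int, component: int, width: int) -> int:
    """Return the 1-D array index of one sample in the stacked format.

    x, y:      pixel column and row inside the frame
    component: 0 = Y, 1 = U, 2 = V (samples of a pixel are stored together)
    width:     frame width in pixels
    """
    if width % BLOCK != 0:
        raise ValueError("this sketch assumes a width that is a multiple of 64")
    if component not in (0, 1, 2):
        raise ValueError("component must be 0, 1 or 2")
    blocks_per_row = width // BLOCK
    # Raster scan order of blocks (0-based here, 1-based in the text).
    block_seq = (y // BLOCK) * blocks_per_row + (x // BLOCK)
    col_in_block = x % BLOCK
    row_in_block = y % BLOCK
    # Inside a block: column after column, top to bottom within each column.
    pixel_offset = block_seq * BLOCK * BLOCK + col_in_block * BLOCK + row_in_block
    return pixel_offset * 3 + component
```

Moving one pixel down inside a column advances the index by 3 samples, while moving one pixel to the right advances it by 64×3 samples, which is what makes vertically adjacent pixels contiguous in this form.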

Implementation or Variation Example 9 Embodiment in which the pixels in a block are first divided into sub-blocks arranged in quadtree order of depth D

The reference pixel sample storage space in the stacked format and 1-dimensional array form is composed of the pixel samples of 1 frame of image. The frame is first divided into blocks of 64×64 pixels (in the stacked format, 64×64×3 pixel samples), abbreviated as 64x64 blocks, and each block is given a sequence number according to the raster scan order of the blocks: the block in the upper left corner has sequence number 1, the block to the right of block 1 has sequence number 2, the block to the right of block 2 has sequence number 3, and so on. If the block with sequence number X is the rightmost block of its row, so that there is no block to its right, the leftmost block in the row directly below it has sequence number X+1, the block to the right of block X+1 has sequence number X+2, the block to the right of block X+2 has sequence number X+3, and so on, until every block of the frame has a sequence number. Each 64x64 block is then divided into sub-blocks of different depths by the quadtree method, and each sub-block is given a sequence number. The division is as follows:

The division of depth 1 divides a 64x64 block into four 32x32 sub-blocks of depth 1 (upper left, upper right, lower left, lower right). The sequence numbers of the upper left, upper right, lower left, and lower right sub-blocks of depth 1 (called depth 1 numbers) are 1, 2, 3, 4, respectively, as shown in Figure 5;

The division of depth 2 divides each of the four sub-blocks of depth 1 into four 16x16 sub-blocks of depth 2 (upper left, upper right, lower left, lower right). The sequence numbers (called depth 2 numbers) of the upper left, upper right, lower left, and lower right sub-blocks of depth 2 are 1, 2, 3, 4 within the sub-block with depth 1 number 1; 5, 6, 7, 8 within the sub-block with depth 1 number 2; 9, 10, 11, 12 within the sub-block with depth 1 number 3; and 13, 14, 15, 16 within the sub-block with depth 1 number 4, so that there are 16 sub-blocks of depth 2, as shown in Figure 6;

The division of depth 3 divides each of the 16 sub-blocks of depth 2 into four 8x8 sub-blocks of depth 3 (upper left, upper right, lower left, lower right). Within the sub-block with depth 2 number k (1 ≤ k ≤ 16), the sequence numbers (called depth 3 numbers) of the upper left, upper right, lower left, and lower right sub-blocks of depth 3 are 4x(k-1)+1, 4x(k-1)+2, 4x(k-1)+3, 4x(k-1)+4, respectively, so that there are 64 sub-blocks of depth 3, as shown in Figure 7;

The division of depth 4 divides each of the 64 sub-blocks of depth 3 into four 4x4 sub-blocks of depth 4 (upper left, upper right, lower left, lower right). Within the sub-block with depth 3 number k (1 ≤ k ≤ 64), the sequence numbers (called depth 4 numbers) of the upper left, upper right, lower left, and lower right sub-blocks of depth 4 are 4x(k-1)+1, 4x(k-1)+2, 4x(k-1)+3, 4x(k-1)+4, respectively, so that there are 256 sub-blocks of depth 4, as shown in Figure 8;

Finally, select a depth D (D=1 or 2 or 3 or 4) and place and arrange all the pixels in the following order to form a 1-dimensional array space:

The first column of pixels of the sub-block with depth D number 1, with the pixel samples within the column arranged in a predetermined order.

The second column of pixels of the sub-block with depth D number 1, with the pixel samples within the column arranged in a predetermined order.

..........................................

The last column of pixels of the sub-block with depth D number 1, with the pixel samples within the column arranged in a predetermined order.

The first column of pixels of the sub-block with depth D number 2, with the pixel samples within the column arranged in a predetermined order.

The second column of pixels of the sub-block with depth D number 2, with the pixel samples within the column arranged in a predetermined order.

..........................................

The last column of pixels of the sub-block with depth D number 2, with the pixel samples within the column arranged in a predetermined order.

..........................................

The first column of pixels of the sub-block with the largest depth D number, with the pixel samples within the column arranged in a predetermined order.

The second column of pixels of the sub-block with the largest depth D number, with the pixel samples within the column arranged in a predetermined order.

..........................................

The last column of pixels of the sub-block with the largest depth D number, with the pixel samples within the column arranged in a predetermined order.
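The recursive numbering rule 4x(k-1)+1 ... 4x(k-1)+4 can be computed directly from the coordinates of a pixel inside one 64x64 block. The Python sketch below is illustrative only; the names depth_number and offset_in_block are hypothetical, and the second function additionally assumes that the "predetermined order" within a column is top to bottom, which the text leaves open.

```python
# Illustrative sketch only; names are hypothetical. Coordinates are relative
# to one 64x64 block, with x to the right and y downward.
BLOCK = 64


def depth_number(x: int, y: int, depth: int) -> int:
    """Return the 1-based depth-D number of the sub-block containing (x, y)."""
    if not 1 <= depth <= 4:
        raise ValueError("depth must be 1, 2, 3 or 4")
    if not (0 <= x < BLOCK and 0 <= y < BLOCK):
        raise ValueError("pixel lies outside the 64x64 block")
    number = 0  # 0-based number of the enclosing sub-block at the previous depth
    for level in range(1, depth + 1):
        shift = 6 - level            # sub-block side at this level is 2**shift
        right = (x >> shift) & 1     # 0 = left half, 1 = right half
        lower = (y >> shift) & 1     # 0 = upper half, 1 = lower half
        # Upper left, upper right, lower left, lower right map to 0, 1, 2, 3.
        number = 4 * number + (2 * lower + right)
    return number + 1


def offset_in_block(x: int, y: int, depth: int) -> int:
    """Pixel offset inside the block; ASSUMES top-to-bottom order in a column."""
    side = BLOCK >> depth
    k = depth_number(x, y, depth)
    return (k - 1) * side * side + (x % side) * side + (y % side)
```

For example, the pixel at (32, 0) lies in the upper right 32x32 sub-block, so its depth 1 number is 2, and at depth 2 it lies in sub-block 5, in agreement with the numbering described above.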

Implementation or Variation Example 10

In the encoding or decoding method or apparatus, the operation of analyzing and evaluating the characteristics of the codec sub-region and/or its neighboring region in order to select and determine the codec mode of the codec sub-region and its corresponding reference pixel sample storage space includes at least one of the following, or a combination thereof:

1) selecting and determining the codec mode and its corresponding reference pixel sample storage space according to the position of the codec sub-region;

2) selecting and determining the codec mode and its corresponding reference pixel sample storage space according to the position of the reference pixels of the codec sub-region;

3) selecting and determining the codec mode and its corresponding reference pixel sample storage space according to a feature quantity of the colors of the pixels of the codec sub-region and/or its neighboring region;

4) selecting and determining the codec mode and its corresponding reference pixel sample storage space according to a feature quantity of the colors of the codec sub-region and/or its reference pixels.

Implementation or Variation Example 11 Embodiment 1 of at least 2 decoding modes and at least 2 reference pixel sample storage spaces

The decoding method or apparatus uses at least the following two decoding modes and their two corresponding reference pixel sample storage spaces:

1) a pixel string copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 1, composed of part of the reconstructed pixel samples that lie outside the currently decoded CU but adjacent to it;

2) an index copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 2, composed of part or all of the reconstructed indices of the currently decoded CU;

When decoding a sub-area (a pixel string or an index string): if the reference pixel of the currently decoded pixel is located outside the currently decoded CU, the pixel string copy mode is used, and the value of the reference pixel is copied from reference pixel sample storage space 1 as the predicted value or reconstructed value of the currently decoded pixel; if the reference pixel of the currently decoded pixel is located inside the currently decoded CU, the index copy mode is used, the value of the reference index is copied from reference pixel sample storage space 2 as the index of the currently decoded pixel, and the palette color corresponding to that index is obtained from the palette as the predicted value or reconstructed value of the currently decoded pixel.
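The per-pixel choice between the two decoding modes can be sketched as follows. This Python fragment is illustrative only: the function name, the use of dictionaries keyed by pixel position for the two storage spaces, and the list used for the palette are all assumptions made for the example, not structures defined by the specification.

```python
# Illustrative sketch only; names and data structures are hypothetical.
Color = tuple[int, int, int]
Position = tuple[int, int]


def reconstruct_pixel(ref_pos: Position,
                      ref_inside_cu: bool,
                      space1_pixels: dict[Position, Color],
                      space2_indices: dict[Position, int],
                      palette: list[Color]) -> Color:
    """Return the predicted or reconstructed value of the current pixel."""
    if not ref_inside_cu:
        # Pixel string copy mode: the reference lies outside the current CU,
        # so its pixel value is copied from storage space 1.
        return space1_pixels[ref_pos]
    # Index copy mode: the reference lies inside the current CU, so its index
    # is copied from storage space 2 and resolved through the palette.
    index = space2_indices[ref_pos]
    return palette[index]
```

A caller would also record the copied index for the current pixel in storage space 2, so that later pixels of the same CU can refer to it; that bookkeeping is omitted here for brevity.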

Implementation or Variation Example 12 Embodiment 2 of at least 2 decoding modes and at least 2 reference pixel sample storage spaces

Reference pixel sample storage space 1 of Implementation or Variation Example 11 uses a non-indexed pixel color format; reference pixel sample storage space 2 of Implementation or Variation Example 11 uses an indexed pixel color format.

Implementation or Variation Example 13 Embodiment 3 of at least 2 decoding modes and at least 2 reference pixel sample storage spaces

The decoding method or apparatus uses at least the following two decoding modes and their two corresponding reference pixel sample storage spaces:

1) a pixel string copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 1, composed of part of the reconstructed pixel samples that lie outside the currently decoded CU but adjacent to it;

2) an index copy mode, whose corresponding reference pixel sample storage space is a palette associated with the currently decoded CU, that is, part or all of the pixels of the currently decoded CU can be represented by indices into the palette;

When decoding a sub-area (a pixel string or an index string): if the reference pixel of the currently decoded pixel is located outside the currently decoded CU, the pixel string copy mode is used, and the value of the reference pixel is copied from reference pixel sample storage space 1 as the predicted value or reconstructed value of the currently decoded pixel; if the reference pixel of the currently decoded pixel is located inside the currently decoded CU, the index copy mode is used, the reference index is first decoded, and the palette color corresponding to that index is then obtained from the palette as the predicted value or reconstructed value of the currently decoded pixel.

Implementation or Variation Example 14 Embodiment 4 of at least 2 decoding modes and at least 2 reference pixel sample storage spaces

Reference pixel sample storage space 1 of Implementation or Variation Example 13 uses a non-indexed pixel color format; the reference pixel sample storage space formed by the palette in Implementation or Variation Example 13 uses an indexed pixel color format.

DRAWINGS

Figure 1 is a block diagram showing the composition of the encoding apparatus of the present invention.

Figure 2 is a block diagram showing the composition of the decoding apparatus of the present invention.

Figure 3 shows an embodiment of four encoding modes and two different forms of reference pixel sample storage space.

Figure 4 shows an embodiment of four decoding modes and two different forms of reference pixel sample storage space.

Figure 5 shows the sub-blocks of depth 1 and the depth 1 number of each sub-block.

Figure 6 shows the sub-blocks of depth 2 and the depth 2 number of each sub-block.

Figure 7 shows the sub-blocks of depth 3 and the depth 3 number of each sub-block.

Figure 8 shows the sub-blocks of depth 4 and the depth 4 number of each sub-block.

Claims

An image encoding method or apparatus, comprising at least one of the steps or modules that perform the following functions and operations:

1) analyzing and evaluating the characteristics of the coding sub-area and/or its neighboring area, and selecting and determining the coding mode of the coding sub-area from the result of the analysis and evaluation according to a predetermined evaluation criterion;

2) encoding the coding sub-area using the reference pixel sample storage space corresponding to the coding mode, and writing the coding result to the video code stream; the video code stream contains at least part or all of the information that the decoding method or apparatus requires to select the corresponding decoding mode and reference pixel sample storage space.
An image decoding method or apparatus, comprising at least one of the steps or modules for performing the following functions and operations:

1) parsing the video code stream and/or analyzing and evaluating the characteristics of the decoding sub-area and/or its neighboring area, and selecting and determining the decoding mode of the decoding sub-area according to the results of the parsing and/or the analysis and evaluation;

2) decoding the decoding sub-area using the decoding mode and its corresponding reference pixel sample storage space to generate reconstructed pixels.
The encoding method or apparatus according to claim 1 or the decoding method or apparatus according to claim 2, characterized in that:

The coding sub-region or the decoding sub-region is one coding region or one decoding region of an image, and includes at least one of: a largest coding unit LCU, a coding tree unit CTU, a coding unit CU, a sub-region of a CU, a prediction unit PU, a transform unit TU, a macroblock, a microblock, a rectangle, a line, a pixel segment, a pixel string, an index segment, an index string.
The encoding device according to claim 1, characterized in that it comprises at least one of the following modules:

A prediction and matching coding module, configured as a plurality of coding processing units that use different coding modes, where each coding processing unit is configured to process data using one of intra prediction coding, inter prediction coding, or a plurality of different matching coding modes;

A coding storage module, configured as a plurality of storage units that store reconstructed pixel sample data in different forms;

A coded data control module, configured to: control the prediction and matching coding module to read data from the corresponding preset one or more storage units in the coding storage module; control the prediction and matching coding module, according to the data read, to hand over the data to be encoded to the corresponding coding processing unit; and control the prediction and matching coding module to write the output data of the coding processing unit to the preset one or more storage units in the coding storage module.
The encoding device according to claim 1, characterized in that it is composed of at least part or all of the following modules:

Module 1) predictive coding mode module, matching coding mode 1 module, matching coding mode 2 module, ..., matching coding mode A-1 module: A is a positive integer satisfying 2 ≤ A ≤ 5; the A coding mode modules respectively use the prediction (intra prediction or inter prediction or both) coding mode, matching coding mode 1, matching coding mode 2, ..., matching coding mode A-1 to perform a prediction or matching encoding operation on a current coding sub-region in a coding unit of the input video image;

Module 2) reference pixel sample storage space 1 module, reference pixel sample storage space 2 module, ..., reference pixel sample storage space B module: B is a positive integer satisfying 2 ≤ B ≤ A ≤ 5; the B reference pixel sample storage spaces each store, in a unique and mutually different form, the reconstructed pixel sample history data generated during encoding, for use as the reference pixel samples required by subsequent encoding and reconstruction operations; the B reference pixel sample storage spaces of different forms may have different sizes and so store different amounts of historical data, where the one storing the most historical data is called the primary reference pixel sample storage space, and the historical data in each other reference pixel sample storage space is a subset of the historical data in the primary reference pixel sample storage space, but has a different form of representation and may be at a different degree of partial reconstruction; corresponding to the B reference pixel sample storage spaces, the A coding modes of the predictive coding mode module, matching coding mode 1 module, matching coding mode 2 module, ..., matching coding mode A-1 module are likewise grouped into B classes, in one-to-one correspondence with the B reference pixel sample storage spaces; when the current coding sub-region is encoded, each predictive coding mode or matching coding mode performs predictive coding or matching coding using the reference pixel sample storage space corresponding to its class;

Module 3) module for the remaining common coding and reconstruction techniques: performs coding and reconstruction operations and entropy coding operations on the various input parameters and variables;

Module 4) form conversion module for the pixel sample storage spaces: converts one of the forms of the B reference pixel sample storage spaces into another of those forms.
A decoding apparatus according to claim 2, comprising at least one of the following modules:

A prediction and copy decoding module, configured as a plurality of decoding processing units that use different decoding modes, where each decoding processing unit is configured to process data using one of intra prediction decoding, inter prediction decoding, or a plurality of different copy decoding modes;

A decoding storage module, configured as a plurality of storage units that store reconstructed pixel sample data in different forms;

A decoding data control module, configured to: control the prediction and copy decoding module to read data from the corresponding preset one or more storage units in the decoding storage module; control the prediction and copy decoding module, according to the data read, to hand over the data to be decoded to the corresponding decoding processing unit; and control the prediction and copy decoding module to write the output data of the decoding processing unit to the preset one or more storage units in the decoding storage module.
A decoding apparatus according to claim 2, comprising at least one of the following modules:

Module 1) code stream data parsing and partial decoding module: performs entropy decoding on the video code stream of the currently decoded CU, which contains the compressed data of the prediction mode, the motion vector, the matching mode, the matching position, and the unmatched samples, together with the compressed data of all other syntax elements, and parses the meaning of the various data obtained by entropy decoding;

Module 2) prediction decoding mode module, matching decoding mode 1 module, matching decoding mode 2 module, ..., matching decoding mode A-1 module: A is a positive integer satisfying 2 ≤ A ≤ 5; the A decoding mode modules respectively use the prediction (intra prediction or inter prediction or both) decoding mode, matching decoding mode 1, matching decoding mode 2, ..., matching decoding mode A-1 to perform prediction or matching decoding operations on the corresponding A types of decoding sub-regions;

Module 3) reference pixel sample storage space 1 module, reference pixel sample storage space 2 module, ..., reference pixel sample storage space B module: B is a positive integer satisfying 2 ≤ B ≤ A ≤ 5; the B reference pixel sample storage spaces each store, in a unique and mutually different form, the reconstructed pixel sample history data generated during decoding; the B reference pixel sample storage spaces of different forms may have different sizes and so store different amounts of historical data, where the one storing the most historical data is called the primary reference pixel sample storage space, and the historical data in each other reference pixel sample storage space is a subset of the historical data in the primary reference pixel sample storage space, but has a different form of representation and may be at a different degree of partial reconstruction; corresponding to the B reference pixel sample storage spaces, the A decoding modes of the prediction decoding mode module, matching decoding mode 1 module, matching decoding mode 2 module, ..., matching decoding mode A-1 module are likewise grouped into B classes, in one-to-one correspondence with the B reference pixel sample storage spaces; when the current decoding sub-region is decoded, each prediction decoding mode or matching decoding mode performs prediction decoding or matching decoding using the reference pixel sample storage space corresponding to its class, and each reference pixel sample storage space provides reference pixel samples for all (possibly several) decoding modes of its corresponding class;

Module 4) module for the remaining common decoding and reconstruction techniques: performs the decoding and reconstruction operations of the remaining common techniques on the current decoding sub-region;

Module 5) form conversion module for the pixel sample storage spaces: converts one of the forms of the B reference pixel sample storage spaces into another of those forms.
The encoding method or apparatus according to claim 1 or the decoding method or apparatus according to claim 2, characterized in that:

The analyzing and evaluating the characteristics of the codec sub-region and/or the neighboring region in the codec method or device to select and determine a codec mode of the codec sub-region and a corresponding reference pixel sample storage space thereof The operation includes at least one of the following or a combination thereof:

1) selecting and determining the codec mode and its corresponding reference pixel sample storage space according to the position of the codec sub-region

2) selecting and determining a codec mode and a corresponding reference pixel sample storage space according to the position of the reference pixel of the codec sub-region;

3) selecting and determining a codec mode and a corresponding reference pixel sample storage space according to a feature quantity of a color of a pixel of the codec sub-region and/or the neighboring region;

4) selecting and determining a codec mode and a corresponding reference pixel sample storage space according to a feature quantity of a color of the codec sub-region and/or its reference pixels.
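A minimal sketch of a selection rule of the kind listed in item 3) above, assuming that the "feature quantity of a color" is the number of distinct colours in the sub-region. That choice, the threshold `max_colors`, and the mode labels are hypothetical; the claims leave the feature quantity unspecified.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]

# (mode name, reference pixel sample storage space number); labels are illustrative.
PIXEL_STRING_COPY = ("pixel_string_copy", 1)
INDEX_COPY = ("index_copy", 2)


def select_mode_by_color_count(region: List[Pixel], max_colors: int = 4) -> Tuple[str, int]:
    """Pick a mode and its storage space from the distinct-colour count.

    Assumption: regions with few distinct colours suit an index-based mode,
    while all other regions use sample-based string copying.
    """
    distinct = len(set(region))
    return INDEX_COPY if distinct <= max_colors else PIXEL_STRING_COPY
```

A decoder would not normally repeat this analysis if the encoder signals the chosen mode in the bitstream; the sketch only shows how a mode and its storage space can be selected together.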
A decoding method or apparatus according to claim 2, wherein:

The decoding method or apparatus adopts at least the following two decoding modes and two corresponding reference pixel sample storage spaces:

1) a pixel string copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 1, composed of a portion of the reconstructed pixel samples located outside the current decoding CU but adjacent to that CU;

2) an index copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 2, composed of some or all of the reconstructed indices of the current decoding CU;

When decoding a sub-region, if the reference pixel of the currently decoded pixel is located outside the current decoding CU, the value of the reference pixel is copied, in the pixel string copy mode, from reference pixel sample storage space 1 as the predicted value or reconstructed value of the currently decoded pixel; if the reference pixel of the currently decoded pixel is located inside the current decoding CU, the value of the reference index is copied from reference pixel sample storage space 2 as the index of the currently decoded pixel, and the palette color corresponding to that index is obtained from the palette as the predicted value or reconstructed value of the currently decoded pixel.
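The per-pixel rule of this claim can be sketched as follows. The sketch assumes a square CU, position-keyed dictionaries standing in for the two storage spaces, and the hypothetical names `inside_cu` and `decode_pixel`; none of these are prescribed by the claim.

```python
from typing import Dict, List, Optional, Tuple

Pos = Tuple[int, int]          # (x, y) position in the picture
Pixel = Tuple[int, int, int]   # assumed three-component sample


def inside_cu(pos: Pos, cu_origin: Pos, cu_size: int) -> bool:
    """True if pos lies within the square CU (assumed shape)."""
    x, y = pos
    x0, y0 = cu_origin
    return x0 <= x < x0 + cu_size and y0 <= y < y0 + cu_size


def decode_pixel(ref_pos: Pos, cu_origin: Pos, cu_size: int,
                 space1: Dict[Pos, Pixel], space2: Dict[Pos, int],
                 palette: List[Pixel]) -> Tuple[Optional[int], Pixel]:
    """Return (index, value) for the currently decoded pixel.

    Reference inside the CU: copy the index from storage space 2 and look
    up its palette colour. Reference outside the CU: copy the sample from
    storage space 1 (no index is produced, so None is returned for it).
    """
    if inside_cu(ref_pos, cu_origin, cu_size):
        index = space2[ref_pos]
        return index, palette[index]
    return None, space1[ref_pos]
```

Here the position of the reference pixel alone decides which storage space is read, so no separate per-pixel mode flag is needed in this sketch.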
A decoding method or apparatus according to claim 2, wherein:

The decoding method or apparatus adopts at least the following two decoding modes and two corresponding reference pixel sample storage spaces:

1) a pixel string copy mode, whose corresponding reference pixel sample storage space is reference pixel sample storage space 1, composed of a portion of the reconstructed pixel samples located outside the current decoding CU but adjacent to that CU;

2) an index copy mode, whose corresponding reference pixel sample storage space is a palette associated with the current decoding CU, that is, some or all of the pixels of the current decoding CU may be represented by indices into the palette;

When decoding a sub-region, if the reference pixel of the currently decoded pixel is located outside the current decoding CU, the pixel string copy mode is used and the value of the reference pixel is copied from reference pixel sample storage space 1 as the predicted value or reconstructed value of the currently decoded pixel; if the reference pixel of the currently decoded pixel is located inside the current decoding CU, the index copy mode is used: the reference index is first decoded, and the palette color corresponding to that index is then obtained from the palette as the predicted value or reconstructed value of the currently decoded pixel.
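In this variant the palette itself serves as the second storage space and the index is decoded rather than copied. A rough sketch, in which the iterator `index_stream` stands in for whatever entropy decoding yields the reference indices (an assumption, as are the square CU and the name `decode_string`):

```python
from typing import Dict, Iterator, List, Tuple

Pos = Tuple[int, int]
Pixel = Tuple[int, int, int]


def decode_string(ref_positions: List[Pos], cu_origin: Pos, cu_size: int,
                  space1: Dict[Pos, Pixel], palette: List[Pixel],
                  index_stream: Iterator[int]) -> List[Pixel]:
    """Decode a run of pixels given the position of each reference pixel."""
    x0, y0 = cu_origin
    out: List[Pixel] = []
    for x, y in ref_positions:
        if x0 <= x < x0 + cu_size and y0 <= y < y0 + cu_size:
            # Reference inside the CU: decode the next index, then look it up.
            out.append(palette[next(index_stream)])
        else:
            # Reference outside the CU: copy the sample from storage space 1.
            out.append(space1[(x, y)])
    return out
```

Only references inside the CU consume an index from the stream, so the stream must supply exactly one index for each such pixel.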