CN114390289A - Reference pixel candidate list construction method, device, equipment and storage medium - Google Patents

Reference pixel candidate list construction method, device, equipment and storage medium

Info

Publication number
CN114390289A
Authority
CN
China
Prior art keywords
pixel
list
information
reference pixel
candidate list
Prior art date
Legal status
Pending
Application number
CN202011114074.XA
Other languages
Chinese (zh)
Inventor
Wang Yingbin (王英彬)
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority application: CN202011114074.XA
Related PCT application: PCT/CN2021/123328 (published as WO2022078339A1)
Publication: CN114390289A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/33: using hierarchical techniques, e.g. scalability, in the spatial domain
    • H04N 19/176: using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/182: using adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N 19/42: characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/51: using predictive coding involving temporal prediction; motion estimation or motion compensation
    • H04N 19/523: motion estimation or motion compensation with sub-pixel accuracy

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application provides a reference pixel candidate list construction method, apparatus, device and storage medium, relating to the technical field of video coding and decoding. The method comprises the following steps: in response to video coding and decoding being performed in the equivalent string mode, determining the spatial domain adjacent pixels of the current coding and decoding block, wherein a spatial domain adjacent pixel is a reconstructed pixel whose distance from the current coding and decoding block is within a specified distance range; constructing a spatial domain adjacent pixel list of the current coding and decoding block based on the pixel information of target reference pixels among the spatial domain adjacent pixels of the current coding and decoding block; and constructing a preselected reference pixel candidate list of the current coding and decoding block based on the spatial domain adjacent pixel list. The scheme can expand the reference pixel selection range in the equivalent string mode, thereby improving the coding and decoding efficiency of the extended equivalent string mode.

Description

Reference pixel candidate list construction method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of video coding and decoding, in particular to a method, a device, equipment and a storage medium for constructing a reference pixel candidate list.
Background
In current video compression technologies, such as VVC (Versatile Video Coding) and AVS3 (Audio Video coding Standard 3), a string prediction coding mode has been introduced, of which the equivalent string mode is one variant.
In the related art, string prediction encoding and decoding rely on a reference pixel prediction list. When equivalent string prediction coding or decoding is performed on the current coding and decoding block, an initial reference pixel prediction list of the current block is first constructed from the reference pixel prediction lists of already reconstructed coding and decoding blocks, and equivalent string prediction is then performed on the current block according to this initial list.
Disclosure of Invention
The embodiments of the application provide a method, device, equipment and storage medium for constructing a reference pixel candidate list, which can expand the reference pixel selection range in the equivalent string mode and thereby improve the coding and decoding efficiency of the extended equivalent string mode. The technical solution is as follows:
according to an aspect of an embodiment of the present application, there is provided a reference pixel candidate list construction method, including:
in response to video coding and decoding being performed in the equivalent string mode, determining the spatial domain adjacent pixels of the current coding and decoding block, wherein a spatial domain adjacent pixel is a reconstructed pixel whose distance from the current coding and decoding block is within a specified distance range;
constructing a spatial domain adjacent pixel list of the current coding and decoding block based on the pixel information of the target reference pixel in the spatial domain adjacent pixels of the current coding and decoding block;
and constructing a preselected reference pixel candidate list of the current coding and decoding block based on the spatial adjacent pixel list.
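The three steps above can be sketched as follows. This is a hypothetical illustration, not the normative procedure: it assumes a distance range of 1 (the row above and the column to the left of the current block), represents pixel information as a (position, value) pair, and uses function names invented for this sketch.

```python
def spatial_neighbors(x0, y0, w, h, reconstructed):
    """Step 1: collect reconstructed pixels bordering the current block
    (assumed distance range of 1: the row above and the column to the left)."""
    above = [(x, y0 - 1) for x in range(x0, x0 + w)]
    left = [(x0 - 1, y) for y in range(y0, y0 + h)]
    return [p for p in above + left if p in reconstructed]

def build_spatial_list(neighbors, reconstructed):
    """Step 2: pixel information (position, value) of each target reference pixel."""
    return [(p, reconstructed[p]) for p in neighbors]

def build_preselected_list(spatial_list):
    """Step 3 (simplest variant): the spatial list itself becomes the
    preselected reference pixel candidate list."""
    return list(spatial_list)

# Toy 2x2 block at (1, 1) with a few reconstructed border pixels.
recon = {(1, 0): 100, (2, 0): 101, (0, 1): 102, (0, 2): 103}
nbrs = spatial_neighbors(1, 1, 2, 2, recon)
cand = build_preselected_list(build_spatial_list(nbrs, recon))
```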
According to an aspect of an embodiment of the present application, there is provided a reference pixel candidate list construction apparatus, including:
the pixel determination module is configured to determine, in response to video coding and decoding being performed in the equivalent string mode, the spatial domain adjacent pixels of the current coding and decoding block, wherein the spatial domain adjacent pixels are reconstructed pixels whose distance from the current coding and decoding block is within a specified distance range;
the adjacent pixel list construction module is configured to construct a spatial domain adjacent pixel list of the current coding and decoding block based on the pixel information of target reference pixels among the spatial domain adjacent pixels of the current coding and decoding block;
and the reference pixel list construction module is used for constructing a preselected reference pixel candidate list of the current coding and decoding block based on the spatial domain adjacent pixel list.
In one possible implementation, the target reference pixels are all of the spatial domain adjacent pixels of the current coding and decoding block;
or, the target reference pixels are pixels at specified positions among the spatial domain adjacent pixels of the current coding and decoding block;
or, the target reference pixels are pixels at positions, determined based on the size of the current coding and decoding block, among the spatial domain adjacent pixels of the current coding and decoding block.
In one possible implementation, the pixel information of the target reference pixel includes at least one of position information of the target reference pixel and pixel value information of the target reference pixel;
the position information comprises the coordinates of the corresponding pixel in the image in which the current image block is located;
or, the position information comprises coordinates of the corresponding pixel in a row of largest coding units, LCUs;
alternatively, the position information includes coordinates of the corresponding pixel on the luminance image.
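As a sketch of the three coordinate representations named above, the following assumes an LCU size of 128 and 4:2:0 chroma subsampling; neither value is fixed by the text, and the helper names are invented for illustration.

```python
LCU_SIZE = 128  # assumed largest coding unit size

def to_lcu_row_coords(x, y):
    """Coordinates of a pixel relative to the top of its row of LCUs."""
    return x, y % LCU_SIZE

def chroma_to_luma_coords(cx, cy):
    """Map chroma-plane coordinates onto the luminance image (4:2:0 assumed,
    so both axes are scaled by 2)."""
    return 2 * cx, 2 * cy
```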
In a possible implementation manner, the neighboring pixel list building module is configured to fill the pixel information of the target reference pixel into the spatial neighboring pixel list according to a specified filling order.
In one possible implementation, the neighbor list construction module is configured to,
filling the pixel information of each target reference pixel located above the current coding and decoding block into the spatial domain adjacent pixel list, and then filling the pixel information of each target reference pixel located to the left of the current coding and decoding block into the list;
or, filling the pixel information of each target reference pixel located to the left of the current coding and decoding block into the spatial domain adjacent pixel list, and then filling the pixel information of each target reference pixel located above the current coding and decoding block into the list;
or, alternately filling the pixel information of the target reference pixels above the current coding and decoding block and the pixel information of the target reference pixels to the left of the current coding and decoding block into the spatial domain adjacent pixel list.
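The three filling orders can be sketched as follows; `fill_spatial_list` and the order names are invented for illustration, with `above` and `left` holding the pixel information of the target reference pixels above and to the left of the block.

```python
import itertools

def fill_spatial_list(above, left, order="above_first"):
    """Fill the spatial domain adjacent pixel list in one of three orders:
    above then left, left then above, or alternating between the two."""
    if order == "above_first":
        return above + left
    if order == "left_first":
        return left + above
    if order == "alternate":
        out = []
        # Interleave; when one side runs out, keep taking from the other.
        for a, l in itertools.zip_longest(above, left):
            if a is not None:
                out.append(a)
            if l is not None:
                out.append(l)
        return out
    raise ValueError(f"unknown fill order: {order}")
```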
In one possible implementation, the neighbor list construction module is configured to,
for a first reference pixel, obtaining the absolute value of the difference between the pixel value of the first reference pixel and the pixel value of each existing reference pixel in the spatial domain adjacent pixel list; the first reference pixel is any one of the target reference pixels;
in response to the absolute value of the difference being greater than a first absolute value threshold, filling pixel information of the first reference pixel into the spatial neighborhood pixel list.
In a possible implementation manner, the difference between the pixel values of the first reference pixel and each existing reference pixel in the spatial domain adjacent pixel list comprises the difference in both the luminance component and the chrominance components;
alternatively,
the difference between the pixel values of the first reference pixel and each existing reference pixel in the spatial domain adjacent pixel list comprises the difference in the luminance component only.
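The duplicate check described above might be sketched as follows, assuming pixel values are (Y, Cb, Cr) tuples and that a candidate is skipped when it is within the threshold of any existing entry on every compared component; the exact comparison rule is an assumption of this sketch.

```python
def maybe_add(pixel, spatial_list, threshold, use_chroma=False):
    """Append `pixel` (a (Y, Cb, Cr) tuple) to the list only if it differs
    from every existing entry by more than `threshold` on the compared
    components (luma only, or luma plus chroma)."""
    comps = range(3) if use_chroma else range(1)
    for existing in spatial_list:
        if all(abs(pixel[c] - existing[c]) <= threshold for c in comps):
            return False  # too close to an existing entry; skip it
    spatial_list.append(pixel)
    return True
```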
In one possible implementation, the neighboring pixel list construction module is further configured to,
when the first reference pixel is unavailable, taking a pixel value of a nearest available reference pixel of the first reference pixel as a pixel value of the first reference pixel;
or, when the first reference pixel is not available, setting the pixel value of the first reference pixel to a default value;
alternatively, when the first reference pixel is unavailable, skipping the first reference pixel.
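A sketch of the three fallback strategies for unavailable reference pixels; the default value of 128 (mid-gray for 8-bit samples) and the choice of the nearest preceding available pixel are assumptions for illustration.

```python
DEFAULT_VALUE = 128  # assumed default, e.g. mid-gray for 8-bit samples

def resolve_unavailable(pixels, policy="nearest"):
    """pixels: values in scanning order, with None marking unavailable ones."""
    out, last = [], None
    for v in pixels:
        if v is not None:
            last = v
            out.append(v)
        elif policy == "skip":
            continue  # drop the unavailable pixel entirely
        elif policy == "nearest" and last is not None:
            out.append(last)  # reuse the nearest preceding available value
        else:  # "default", or "nearest" with nothing available yet
            out.append(DEFAULT_VALUE)
    return out
```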
In one possible implementation, the reference pixel list building module includes a first list acquisition unit, a second list acquisition unit, and a third list acquisition unit.
the first list acquisition unit is used for acquiring the spatial domain adjacent pixel list as a preselected reference pixel candidate list of the current coding and decoding block;
the second list acquisition unit is used for merging the spatial domain adjacent pixel list and a historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current coding and decoding block; the historical reference pixel candidate list is constructed based on a reference pixel candidate list of a reconstructed codec block;
the third list obtaining unit is configured to arrange the historical reference pixel candidate list based on the spatial neighboring pixel list, and obtain a preselected reference pixel candidate list of the current coding and decoding block.
In a possible implementation manner, the second list obtaining unit is configured to,
sequentially filling each piece of pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list, and then sequentially filling each piece of pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list, until each piece of pixel information in the historical reference pixel candidate list has been filled in or the number of pieces of pixel information in the preselected reference pixel candidate list reaches a number threshold;
alternatively,
sequentially filling each piece of pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list, and then sequentially filling each piece of pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list, until each piece of pixel information in the spatial domain adjacent pixel list has been filled in or the number of pieces of pixel information in the preselected reference pixel candidate list reaches a number threshold.
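Both merge orders reduce to the same routine with the argument order swapped. A sketch, assuming an invented size limit of 8 entries and that the limit is only checked while the second list is being filled:

```python
MAX_CANDIDATES = 8  # assumed size limit of the preselected list

def merge_lists(first, second, limit=MAX_CANDIDATES):
    """Fill every entry of `first`, then entries of `second` until `second`
    is exhausted or the merged list reaches `limit` entries."""
    out = list(first)
    for info in second:
        if len(out) >= limit:
            break
        out.append(info)
    return out
```

The first option corresponds to `merge_lists(spatial_list, history_list)` and the second to `merge_lists(history_list, spatial_list)`.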
In a possible implementation manner, the second list obtaining unit is configured to,
for first pixel information, obtaining the absolute value of the difference between the pixel value corresponding to the first pixel information and the pixel value corresponding to each piece of pixel information already in the preselected reference pixel candidate list; the first pixel information is any piece of pixel information in the spatial domain adjacent pixel list or the historical reference pixel candidate list;
in response to the absolute value of the difference being greater than a second absolute value threshold, populating the first pixel information into the preselected reference pixel candidate list.
In a possible implementation manner, the third list obtaining unit is configured to,
for second pixel information, obtaining an absolute value of a difference value between a pixel value corresponding to the second pixel information and a pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the second pixel information is any one piece of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the second pixel information and the pixel value corresponding to fourth pixel information being less than a third absolute value threshold, filling the second pixel information into the preselected reference pixel candidate list; the fourth pixel information is any piece of pixel information in the spatial domain adjacent pixel list.
In a possible implementation manner, the third list obtaining unit is configured to,
for fifth pixel information, obtaining an absolute value of a difference value between a pixel value corresponding to the fifth pixel information and a pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the fifth pixel information is any one of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to sixth pixel information being less than or equal to a fourth absolute value threshold, filling the fifth pixel information into a first candidate list; the sixth pixel information is any piece of pixel information in the spatial domain adjacent pixel list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to the sixth pixel information being greater than the fourth absolute value threshold, filling the fifth pixel information into a second candidate list;
sequentially filling each pixel information in the first candidate list and each pixel information in the second candidate list into the preselected reference pixel candidate list; the position of each pixel information in the first candidate list in the preselected reference pixel candidate list is located before the position of each pixel information in the second candidate list in the preselected reference pixel candidate list.
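The partition-and-concatenate arrangement above can be sketched with scalar pixel values (a simplification; real entries carry full pixel information, and the helper name is invented):

```python
def reorder_history(history, spatial, threshold):
    """Split the historical list into entries whose value is within
    `threshold` of some spatial neighbor (first candidate list) and the
    rest (second candidate list), then concatenate, close entries first."""
    close, far = [], []
    for value in history:
        if any(abs(value - s) <= threshold for s in spatial):
            close.append(value)
        else:
            far.append(value)
    return close + far
```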
According to an aspect of embodiments of the present application, there is provided a computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the above-mentioned reference pixel candidate list construction method.
According to an aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the above-mentioned reference pixel candidate list construction method.
In yet another aspect, embodiments of the present application provide a computer program product or a computer program, where the computer program product or the computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the reference pixel candidate list construction method.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
Before the current coding and decoding block is encoded/decoded, a preselected reference pixel candidate list of the current block is constructed from information about reconstructed pixels adjacent to the current block. In the subsequent encoding/decoding of the current block, adjacent spatial domain pixels can then be used as references for equivalent string prediction, which expands the reference pixel selection range in the equivalent string mode and further improves its encoding/decoding efficiency.
Drawings
FIG. 1 is a basic flow diagram of a video encoding process as exemplarily shown herein;
FIG. 2 is a diagram illustrating inter prediction modes according to an embodiment of the present application;
FIG. 3 is a diagram illustrating candidate motion vectors according to an embodiment of the present application;
FIG. 4 is a diagram of an intra block copy mode, as provided by one embodiment of the present application;
FIG. 5 is a diagram illustrating an intra-string copy mode according to an embodiment of the present application;
FIG. 6 is a simplified block diagram of a communication system provided by one embodiment of the present application;
FIG. 7 is a schematic diagram of the placement of a video encoder and a video decoder in a streaming environment as exemplary illustrated herein;
FIG. 8 is a flowchart of a method for constructing a candidate list of reference pixels according to an embodiment of the present application;
FIG. 9 is a schematic diagram of spatially adjacent pixel locations according to the embodiment of FIG. 8;
FIG. 10 is a numbered schematic of spatially adjacent pixels according to the embodiment of FIG. 8;
FIG. 11 is a flowchart of an equivalent string prediction process provided by one embodiment of the present application;
FIG. 12 is a block diagram of a reference pixel candidate list construction apparatus according to an embodiment of the present application;
fig. 13 is a block diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before describing the embodiments of the present application, a brief description of the video encoding technique will be provided with reference to fig. 1. Fig. 1 illustrates a basic flow diagram of a video encoding process.
A video signal refers to a sequence of images comprising a plurality of frames. A frame is a representation of the spatial information of the video signal. Taking the YUV format as an example, one frame includes one luminance sample matrix (Y) and two chrominance sample matrices (Cb and Cr). In terms of how the video signal is acquired, there are two categories: signals captured by a camera and signals generated by a computer. Because their statistical characteristics differ, the corresponding compression encoding modes may also differ.
In some mainstream video coding technologies, such as the H.265/HEVC (High Efficiency Video Coding) standard, the H.266/VVC (Versatile Video Coding) standard, and AVS (Audio Video coding Standard, e.g. AVS3), a hybrid coding framework is adopted to perform the following series of operations and processes on the input original video signal:
1. Block Partition Structure: the input image is divided into several non-overlapping processing units, each of which undergoes a similar compression operation. Such a processing unit is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). A CTU can be further partitioned into one or more basic coding units, each called a CU (Coding Unit). Each CU is the most basic element of the encoding pipeline. Described below are various possible encoding schemes for each CU.
2. Predictive Coding: this includes intra-frame prediction, inter-frame prediction, and other modes; the residual video signal is obtained after the original video signal is predicted from a selected reconstructed video signal. The encoding side must decide, for the current CU, the most suitable one among the many possible predictive coding modes and inform the decoding side. Intra-frame prediction means that the predicted signal comes from an already encoded and reconstructed region in the same image. Inter-frame prediction means that the predicted signal comes from an already encoded image (called a reference image) different from the current image.
3. Transform Coding and Quantization: the residual video signal undergoes a transform operation such as the DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), converting the signal into the transform domain, where it is represented by transform coefficients. The transform-domain signal is then subjected to a lossy quantization operation that discards some information, so that the quantized signal is better suited to compressed representation. In some video coding standards, more than one transform may be available, so the encoding side also needs to select one of the transforms for the current CU and inform the decoding side. The fineness of quantization is usually determined by the quantization parameter (QP). When the QP value is larger, coefficients over a larger value range are quantized to the same output, which generally brings larger distortion and a lower bit rate; conversely, when the QP value is smaller, coefficients over a smaller value range are quantized to the same output, which generally brings smaller distortion and a correspondingly higher bit rate.
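The effect of the quantization step described above can be illustrated with a uniform scalar quantizer; this is a generic sketch, not the quantizer of any particular standard:

```python
def quantize(coeff, step):
    """Uniform scalar quantization: a larger step maps a wider range of
    coefficients to the same index (more distortion, fewer bits)."""
    return round(coeff / step)

def dequantize(index, step):
    """Reconstruct an approximate coefficient from its quantization index."""
    return index * step
```

For example, quantizing the coefficient 23 with step 4 reconstructs 24 (error 1), while step 16 reconstructs 16 (error 7), illustrating the distortion versus bit-rate trade-off that the QP controls.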
4. Entropy Coding or Statistical Coding: the quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed bitstream is finally output. Other information generated by encoding, such as the selected mode and motion vectors, also needs to be entropy coded to reduce the bit rate. Statistical coding is a lossless coding mode that can effectively reduce the bit rate required to express the same signal. Common statistical coding methods include Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
5. Loop Filtering: the coded image undergoes inverse quantization, inverse transform, and prediction compensation (the inverse of operations 2 and 3 above) to obtain a reconstructed decoded image. Compared with the original image, the reconstructed image differs in some information due to the quantization effect, causing distortion. Applying filtering operations to the reconstructed image, such as deblocking, SAO (Sample Adaptive Offset), ALF (Adaptive Loop Filter), or other filters, can effectively reduce the degree of distortion produced by quantization. Since these filtered reconstructed images serve as references for subsequently coded images to predict future signals, the above filtering operation is also called loop filtering, i.e., a filtering operation within the coding loop.
According to the above coding process, at the decoding end, for each CU the decoder first performs entropy decoding on the compressed bitstream to obtain the various mode information and the quantized transform coefficients. The coefficients undergo inverse quantization and inverse transform to obtain the residual signal. On the other hand, the prediction signal corresponding to the CU is obtained from the known coding mode information, and adding the residual signal to the prediction signal yields the reconstructed signal. Finally, the reconstructed values of the decoded image undergo loop filtering to produce the final output signal.
Some mainstream video coding standards, such as HEVC, VVC, AVS3, etc., adopt a hybrid coding framework based on blocks. The original video data are divided into a series of coding blocks, and the compression of the video data is realized by combining video coding methods such as prediction, transformation, entropy coding and the like. Motion compensation is a type of prediction method commonly used in video coding, and the motion compensation derives a prediction value of a current coding block from a coded area based on the redundancy characteristic of video content in a time domain or a space domain. Such prediction methods include: inter prediction, intra block copy prediction, intra string copy prediction, etc., which may be used alone or in combination in a particular coding implementation. For coding blocks using these prediction methods, it is generally necessary to encode, either explicitly or implicitly in the code stream, one or more two-dimensional displacement vectors indicating the displacement of the current block (or of a co-located block of the current block) with respect to its reference block or blocks.
It should be noted that the displacement vector may have different names in different prediction modes and different implementations, and is described herein in the following manner: 1) the displacement Vector in the inter prediction mode is called Motion Vector (MV for short); 2) a displacement Vector in an IBC (Intra Block Copy) prediction mode is called a Block Vector (BV); 3) the displacement Vector in the ISC (Intra String Copy) prediction mode is called a String Vector (SV). Intra-frame string replication is also referred to as "string prediction" or "string matching," etc.
MV refers to a displacement vector for inter prediction mode, pointing from the current picture to a reference picture, whose value is the coordinate offset between the current block and the reference block, where the current block and the reference block are in two different pictures. In the inter-frame prediction mode, motion vector prediction can be introduced, a prediction motion vector corresponding to the current block is obtained by predicting the motion vector of the current block, and the difference value between the prediction motion vector corresponding to the current block and the actual motion vector is coded and transmitted. In the embodiment of the present application, predicting a motion vector refers to obtaining a prediction value of a motion vector of a current block by a motion vector prediction technique.
BV refers to a displacement vector for IBC prediction mode, whose value is the coordinate offset between the current block and the reference block, both in the current picture. In the IBC prediction mode, block vector prediction may be introduced, a prediction block vector corresponding to the current block is obtained by predicting the block vector of the current block, and a difference between the prediction block vector corresponding to the current block and the actual block vector is encoded and transmitted. In the embodiment of the present application, the prediction block vector refers to a prediction value of a block vector of the current block obtained by a block vector prediction technique.
SV is a displacement vector for ISC prediction mode, and its value is the coordinate offset between the current string and the reference string, both in the current picture. In the ISC prediction mode, string vector prediction can be introduced, a predicted string vector corresponding to the current string is obtained by predicting the string vector of the current string, and the difference value between the predicted string vector corresponding to the current string and the actual string vector is coded and transmitted. In the embodiment of the present application, the predicted string vector is a predicted value of a string vector of a current string obtained by a string vector prediction technique.
Several different prediction modes are described below:
one, inter prediction mode
As shown in fig. 2, inter prediction exploits the temporal correlation of video: the pixels of neighboring coded pictures are used to predict the pixels of the current picture, which effectively removes temporal redundancy and saves the bits needed to code the residual data. Here, P is the current frame, Pr is its reference frame, B is the current block to be coded, and Br is the reference block of B. B' has the same coordinate position in its image as B; the coordinates of Br are (xr, yr), and the coordinates of B' are (x, y). The displacement between the current block to be coded and its reference block, called the Motion Vector (MV), is:
MV=(xr-x,yr-y)。
the bits required to encode MVs can be further reduced by using MV prediction techniques, considering that temporal or spatial neighboring blocks have strong correlation. In h.265/HEVC, inter Prediction includes two MV Prediction techniques, Merge and AMVP (Advanced Motion Vector Prediction).
The Merge mode establishes an MV candidate list for the current PU (Prediction Unit) containing 5 candidate MVs (and their corresponding reference pictures). The 5 candidates are traversed, and the one with the minimum rate-distortion cost is selected as the optimal MV. Since the encoder and the decoder build the candidate list in the same way, the encoder only needs to transmit the index of the optimal MV in the candidate list. Note that the MV prediction technique of HEVC also has a skip mode, which is a special case of the Merge mode: after the optimal MV is found in Merge mode, if the current block is substantially identical to its reference block, no residual data needs to be transmitted; only the index of the MV and a skip flag are transmitted.
The MV candidate list established in Merge mode covers both the spatial and the temporal case, and for B slices (B-frame pictures) also includes a combined-list mode. The spatial case provides at most 4 candidate MVs, established as shown in part (a) of fig. 3. The spatial list is built in the order A1 → B1 → B0 → A0 → B2, where B2 is a fallback: the motion information of B2 is used only when one or more of A1, B1, B0 and A0 are unavailable. The temporal case provides at most 1 candidate MV, established as shown in part (b) of fig. 3 by scaling the MV of the co-located PU according to the following formula:
curMV=td*colMV/tb;
where curMV denotes the MV of the current PU, colMV denotes the MV of the co-located PU, td denotes the distance between the current picture and its reference picture, and tb denotes the distance between the co-located picture and its reference picture. If the PU at position D0 of the co-located block is unavailable, the co-located PU at position D1 is used instead. For a PU in a B slice, since it has two MVs, its MV candidate list also needs to provide two MVPs (Motion Vector Predictors). HEVC generates the combined list for B slices by pairwise combining the first 4 candidate MVs in the MV candidate list.
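The temporal scaling above can be sketched in Python as follows. This is a minimal sketch: `scale_temporal_mv` is a hypothetical helper name, and HEVC's exact rounding and clipping of the scaled MV are omitted for simplicity.

```python
def scale_temporal_mv(col_mv, td, tb):
    """Scale the co-located PU's MV by the ratio of picture distances.

    col_mv: (x, y) MV of the co-located PU
    td: distance between the current picture and its reference picture
    tb: distance between the co-located picture and its reference picture
    """
    # curMV = td * colMV / tb (HEVC additionally rounds and clips the
    # result; that detail is omitted in this sketch)
    return (td * col_mv[0] // tb, td * col_mv[1] // tb)
```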
Similarly, the AMVP mode uses the MV correlation of spatial and temporal neighboring blocks to build an MV candidate list for the current PU. Unlike the Merge mode, AMVP selects the optimal predicted MV from the MV candidate list and differentially encodes it against the optimal MV obtained by motion search for the current block to be coded, i.e. what is coded is MVD = MV - MVP, where MVD is the Motion Vector Difference. By building the same list, the decoding end can compute the MV of the current decoding block from only the MVD and the index of the MVP in the list. The MV candidate list of AMVP mode also covers the spatial and temporal cases, but its length is only 2.
As described above, the MVD needs to be encoded in the AMVP mode of HEVC. In HEVC, the resolution of the MVD is controlled by use_integer_mv_flag in the slice_header: when this flag is 0, the MVD is encoded at 1/4 (luma) pixel resolution; when it is 1, the MVD is encoded at full (luma) pixel resolution. VVC uses a method of Adaptive Motion Vector Resolution (AMVR), which allows the resolution of the coded MV to be selected adaptively per CU. In the normal AMVP mode, the candidate resolutions are 1/4, 1/2, 1 and 4 pixels. For a CU with at least one non-zero MVD component, a first flag is encoded to indicate whether quarter luma sample MVD precision is used for the CU. If this flag is 0, the MVD of the current CU is encoded at 1/4 pixel resolution. Otherwise, a second flag is encoded to indicate whether the CU uses 1/2 pixel resolution or another MVD resolution; if not, a third flag is encoded to indicate whether 1-pixel or 4-pixel resolution is used for the CU.
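The cascade of AMVR flags described above can be sketched as follows. The function and the flag-reading interface are hypothetical, and the mapping of flag values to resolutions is an assumption based on the description above, not a normative parsing routine.

```python
def parse_amvr_resolution(read_flag):
    """Map the cascaded AMVR flags to an MVD resolution in luma pixels.

    read_flag() returns the next flag bit (0 or 1); the candidate
    resolutions 1/4, 1/2, 1 and 4 follow the normal-AMVP case above.
    """
    if read_flag() == 0:      # first flag: quarter-pel MVD precision
        return 0.25
    if read_flag() == 0:      # second flag: half-pel vs. a coarser choice
        return 0.5
    # third flag: integer-pel vs. four-pel
    return 1 if read_flag() == 0 else 4
```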
Two, IBC prediction mode
IBC is an intra coding tool adopted in the HEVC Screen Content Coding (SCC) extension, and it significantly improves the coding efficiency of screen content. IBC techniques have also been adopted in AVS3 and VVC to improve the performance of screen content coding. IBC exploits the spatial correlation of screen content video: it uses already-coded pixels on the current picture to predict the pixels of the current block to be coded, thereby effectively saving the bits needed to code those pixels. As shown in fig. 4, the displacement between the current block and its reference block in IBC is called the BV (block vector). H.266/VVC employs a BV prediction technique similar to inter prediction to further save the bits needed to encode the BV. VVC predicts the BV using an AMVP mode similar to that in inter prediction and allows the BVD (block vector difference) to be encoded at 1- or 4-pixel resolution.
Third, ISC prediction mode
ISC techniques divide a coding block into a series of pixel strings or unmatched pixels according to some scan order (e.g. raster scan, round-trip scan or Zig-Zag scan). Similar to IBC, each string finds a reference string of the same shape in the coded region of the current picture and derives the prediction value of the current string from it; encoding the residual between the pixel values of the current string and the prediction values, instead of encoding the pixel values directly, effectively saves bits. Fig. 5 shows a schematic diagram of intra string copy: the dark gray region is the coded region, the 28 white pixels are string 1, the 35 light gray pixels are string 2, and the 1 black pixel represents an unmatched pixel. The displacement between string 1 and its reference string is string vector 1 in fig. 5; the displacement between string 2 and its reference string is string vector 2 in fig. 5.
The intra-frame string copy technique needs to encode the SV corresponding to each string in the current coding block, the string length, and a flag indicating whether there is a matching string. Where SV represents the displacement of the string to be encoded to its reference string. The string length represents the number of pixels that the string contains.
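As a minimal illustration of the string copy mechanism described above, the following sketch derives the prediction values of one string from its reference string via the SV. The raster-scan traversal and the dictionary representation of the reconstructed area are simplifying assumptions of this sketch.

```python
def predict_string(recon, start, length, sv, width):
    """Derive the prediction values of a pixel string from the reference
    string its string vector points to in the coded region.

    recon: dict mapping (x, y) -> reconstructed pixel value (coded area)
    start: (x, y) of the string's first pixel in raster-scan order
    length: number of pixels the string contains
    sv: (dx, dy) string vector to the reference string
    width: picture (or block) width used by the raster scan
    """
    preds = []
    x, y = start
    for _ in range(length):
        # each pixel is predicted from the same offset in the reference
        preds.append(recon[(x + sv[0], y + sv[1])])
        x += 1
        if x == width:  # advance in raster-scan order
            x, y = 0, y + 1
    return preds
```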
Four, equal value string mode
The equal value string mode is a sub-mode of intra string copy. As in intra string copy, a codec block in this mode is divided into a series of pixel strings according to a certain scan order; its distinguishing feature is that all pixels in a pixel string share the same prediction value. The equal value string mode requires encoding the length and the prediction value of each string in the current codec block.
The prediction value can be encoded by the following methods:
1) directly coding the predicted value;
2) constructing a reference pixel candidate list L1, and coding the index of a predicted value in L1;
3) constructing a reference pixel prediction list L0, deriving a reference pixel candidate list L1 from L0 based on a reuse flag (reuse_flag), and coding the reuse_flag and the index of the predicted value in L1.
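Method 3) can be sketched as follows. The ordering chosen here (reused entries of L0 first, then new prediction values appended) is an assumption of this sketch.

```python
def derive_candidate_list(l0, reuse_flags, new_pixels):
    """Build the reference pixel candidate list L1 from the prediction
    list L0, as in method 3).

    l0: reference pixel prediction list
    reuse_flags: one flag per entry of L0; 1 means the entry is reused
    new_pixels: prediction values not covered by L0, appended after the
                reused entries (ordering is an assumption of this sketch)
    """
    l1 = [p for p, f in zip(l0, reuse_flags) if f]
    l1.extend(new_pixels)
    return l1
```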
In the current equal value string implementation, method 3) above is used to encode the prediction values.
The current equivalent string is implemented as follows:
Equal value strings often appear in screen content images, and their pixel values themselves occur with high frequency. Exploiting this property, the equal value string coding technique of intra string copy prediction records and saves, when such pixels first appear in an LCU row, their position in the current LCU row; this position is called the frequently-occurring position, and the pixel at it is called the frequently-occurring-position pixel.
These frequently-occurring positions and the pixels at them are then repeatedly fetched for use as reference pixels via a candidate list of historical point vector predictions (HpvpCandList, the counterpart of HmvpCandList in inter prediction and HbvpCandList in IBC, holding at most 15 point vectors). A frequently-occurring position is represented by its coordinates with the top-left position of the current LCU row as the origin, called the Point Vector (PV) of point prediction.
Before a CU starts point prediction in the equal value string and unit basis vector string sub-modes, the pixels inside the CU are clustered to obtain the K pixel values with the highest frequency of occurrence. If one of these pixel values is the same as the value of the frequently-occurring-position pixel corresponding to some PV stored in the PrevHpvpCandList array, or differs from it by less than a certain threshold, that PV is placed directly into the HpvpCandList of the current CU, yielding the initial HpvpCandList of the current CU.
The HpvpCandList is continually expanded while the equal value strings in the current CU are encoded one by one: whenever a new frequently-occurring position occurs, the PV of that position is added to the HpvpCandList.
PrevHpvpCandList is initially empty. After the string copy intra prediction coding of a CU is completed, PrevHpvpCandList needs to be updated:
first, duplicates of the current HpvpCandList are deleted from PrevHpvpCandList, and the PVs in the HpvpCandList of the CU whose coding has just been completed are filled into PrevHpvpCandList from the head. PrevHpvpCandList stores at most 28 PVs; any excess is removed.
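The update rule above (remove duplicates, fill the finished CU's PVs from the head, cap the list at 28 entries) can be sketched as follows, representing each PV as an opaque hashable value:

```python
MAX_PREV_PV = 28  # PrevHpvpCandList stores at most 28 PVs

def update_prev_list(prev_list, cu_list):
    """Update PrevHpvpCandList after a CU finishes string copy intra coding.

    Duplicates of the current CU's HpvpCandList are removed from
    PrevHpvpCandList, the CU's PVs are filled in from the head, and
    entries beyond the capacity are dropped.
    """
    kept = [pv for pv in prev_list if pv not in cu_list]
    return (list(cu_list) + kept)[:MAX_PREV_PV]
```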
The encoding flow of the equivalent string prediction value is as follows:
s1, encoding reuse _ flag, indicating which PVs in PrevHpvpCandList are present in HpvpCandList;
s2, encoding the length of the equivalent string;
s3, encoding the predicted value: if the pixel value of the equal value string being encoded equals the value of the pixel pointed to by some PV in HpvpCandList, the index corresponding to that PV is written into the code stream; otherwise, the pixel value itself is written into the code stream, and the reference pixel candidate list is extended with this value.
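Step S3 can be sketched as follows. Storing escape pixel values directly in the same list as the PVs is a simplification of this sketch, and `pixel_at` (resolving a list entry to the pixel value it refers to) is a hypothetical helper.

```python
def encode_prediction(value, hpvp_list, pixel_at):
    """Encode one equal value string's prediction value (step S3).

    Returns ('index', i) if the value equals the pixel some entry of
    HpvpCandList points to; otherwise returns ('pixel', value) and
    extends the candidate list with the new value.
    """
    for i, entry in enumerate(hpvp_list):
        if pixel_at(entry) == value:
            return ('index', i)          # write the index into the stream
    hpvp_list.append(value)              # extend the list (simplified: raw value)
    return ('pixel', value)              # write the pixel value itself
```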
The decoding flow of the equivalent string prediction value is as follows:
s1, constructing PrevHpvpCandList in a manner similar to the encoding end;
s2, constructing an initial HpvpCandList according to reuse _ flag;
s3, decoding to obtain the length of the equivalent string;
s4, decoding the index idx: if idx is smaller than the length of HpvpCandList, the PV is taken from HpvpCandList according to idx, and the pixel value is then obtained from the frequently-occurring position specified by that PV; otherwise, a pixel value is decoded from the code stream, and the reference pixel candidate list is extended with this value.
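Step S4 on the decoder side mirrors the encoder: an in-range index fetches the pixel through the PV, and an out-of-range index means a pixel value follows in the stream and extends the list. A minimal sketch with hypothetical helpers `pixel_at` (resolve a list entry to its pixel value) and `read_pixel` (parse the next pixel value from the stream):

```python
def decode_prediction(idx, hpvp_list, pixel_at, read_pixel):
    """Recover one equal value string's prediction value (step S4)."""
    if idx < len(hpvp_list):
        # take the PV at idx and read the pixel it points to
        return pixel_at(hpvp_list[idx])
    value = read_pixel()     # escape: the pixel value is in the stream
    hpvp_list.append(value)  # extend the candidate list with it
    return value
```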
As shown in fig. 6, a simplified block diagram of a communication system provided by one embodiment of the present application is shown. Communication system 200 includes a plurality of devices that may communicate with each other over, for example, network 250. By way of example, the communication system 200 includes a first device 210 and a second device 220 interconnected by a network 250. In the embodiment of fig. 6, the first device 210 and the second device 220 perform unidirectional data transfer. For example, the first apparatus 210 may encode video data, such as a video picture stream captured by the first apparatus 210, for transmission over the network 250 to the second apparatus 220. The encoded video data is transmitted in the form of one or more encoded video streams. The second device 220 may receive the encoded video data from the network 250, decode the encoded video data to recover the video data, and display a video picture according to the recovered video data. Unidirectional data transmission is common in applications such as media services.
In another embodiment, the communication system 200 includes a third device 230 and a fourth device 240 that perform bi-directional transmission of encoded video data, which may occur, for example, during a video conference. For bi-directional data transfer, each of the third device 230 and the fourth device 240 may encode video data (e.g., a stream of video pictures captured by the devices) for transmission over the network 250 to the other of the third device 230 and the fourth device 240. Each of third apparatus 230 and fourth apparatus 240 may also receive encoded video data transmitted by the other of third apparatus 230 and fourth apparatus 240, and may decode the encoded video data to recover the video data, and may display video pictures on an accessible display device according to the recovered video data.
In the embodiment of fig. 6, the first device 210, the second device 220, the third device 230, and the fourth device 240 may be computer devices such as a server, a personal computer, and a smart phone, but the principles disclosed herein may not be limited thereto. The embodiment of the application is suitable for a Personal Computer (PC), a mobile phone, a tablet Computer, a media player and/or a special video conference device. Network 250 represents any number of networks that communicate encoded video data between first device 210, second device 220, third device 230, and fourth device 240, including, for example, wired and/or wireless communication networks. The communication network 250 may exchange data in circuit-switched and/or packet-switched channels. The network may include a telecommunications network, a local area network, a wide area network, and/or the internet. For purposes of this application, the architecture and topology of network 250 may be immaterial to the operation of the present disclosure, unless explained below.
By way of example, fig. 7 illustrates the placement of a video encoder and a video decoder in a streaming environment. The subject matter disclosed herein is equally applicable to other video-enabled applications including, for example, video conferencing, Digital TV (television), storing compressed video on Digital media including CD (Compact Disc), DVD (Digital Versatile Disc), memory stick, and the like.
The streaming system may include an acquisition subsystem 313, which may include a video source 301, such as a digital camera, that creates an uncompressed video picture stream 302. In an embodiment, the video picture stream 302 includes samples taken by a digital camera. The video picture stream 302 is depicted as a thick line to emphasize a high data amount video picture stream compared to the encoded video data 304 (or encoded video code stream), the video picture stream 302 may be processed by an electronic device 320, the electronic device 320 comprising a video encoder 303 coupled to a video source 301. The video encoder 303 may comprise hardware, software, or a combination of hardware and software to implement or embody aspects of the disclosed subject matter as described in greater detail below. The encoded video data 304 (or encoded video codestream 304) is depicted as a thin line compared to the video picture stream 302 to emphasize the lower data amount of the encoded video data 304 (or encoded video codestream 304), which may be stored on the streaming server 305 for future use. One or more streaming client subsystems, such as client subsystem 306 and client subsystem 308 in fig. 7, may access streaming server 305 to retrieve copies 307 and 309 of encoded video data 304. The client subsystem 306 may include, for example, a video decoder 310 in an electronic device 330. Video decoder 310 decodes incoming copies 307 of the encoded video data and generates an output video picture stream 311 that may be presented on a display 312, such as a display screen, or another presentation device (not depicted). In some streaming systems, encoded video data 304, video data 307, and video data 309 (e.g., video streams) may be encoded according to certain video encoding/compression standards.
It should be noted that electronic devices 320 and 330 may include other components (not shown). For example, the electronic device 320 may include a video decoder (not shown), and the electronic device 330 may also include a video encoder (not shown). Wherein the video decoder is configured to decode the received encoded video data; a video encoder is used to encode video data.
It should be noted that the technical solution provided in the embodiment of the present application may be applied to the h.266/VVC standard, the h.265/HEVC standard, the AVS (e.g., AVS3), or the next-generation video codec standard, and the embodiment of the present application does not limit this.
In current equal value string prediction, the construction of the reference pixel candidate list uses only the information of historical codec blocks and ignores the correlation between neighboring pixels in a video image. Because the information of historical codec blocks is insufficient, efficient reference pixel prediction cannot be achieved, which affects coding efficiency.
The method derives the reference pixel candidate list of the equal value string with the help of spatial neighboring pixels. That is, in response to video coding/decoding being performed in the equal value string mode, the spatial neighboring pixels of the current codec block are determined, the spatial neighboring pixels being reconstructed pixels whose distance from the current codec block is within a specified distance range; a spatial neighboring pixel list of the current codec block is constructed based on the pixel information of target reference pixels among the spatial neighboring pixels of the current codec block; and a preselected reference pixel candidate list of the current codec block is constructed based on the spatial neighboring pixel list.
With this scheme, coding efficiency is improved when the prediction value of the equal value string is encoded based on the preselected reference pixel candidate list. The scheme can be applied to video codecs or video compression products that use equal value strings.
In the method provided by the embodiment of the present application, the execution main body of each step may be a decoding-side device or an encoding-side device. In the process of video decoding and video encoding, the technical scheme provided by the embodiment of the application can be adopted to reconstruct the image. The decoding end device and the encoding end device can be computer devices, and the computer devices refer to electronic devices with data calculation, processing and storage capabilities, such as PCs, mobile phones, tablet computers, media players, special video conference devices, servers and the like.
In addition, the methods provided herein can be used alone or in any order in combination with other methods. The encoder and decoder based on the methods provided herein may be implemented by 1 or more processors or 1 or more integrated circuits.
Referring to fig. 8, a flowchart of a reference pixel candidate list construction method according to an embodiment of the present application is shown. For convenience of explanation, only the steps executed by the computer device will be described. The method may include the steps of:
step 801, in response to video encoding and decoding through the equivalent string mode, determining the spatial domain adjacent pixels of the current encoding and decoding block.
The spatial neighboring pixels are reconstructed pixels whose distance from the current coding and decoding block is within a specified distance range.
For example, assuming the codec block above the current codec block and/or the codec block on its left is a reconstructed codec block, the spatial neighboring pixels may be the pixels in the row immediately above the current codec block and/or the pixels in the column immediately to its left; or, the spatial neighboring pixels may be the pixels in the second row above the current codec block and/or the pixels in the second column to its left; or, the spatial neighboring pixels may be the pixels in the two rows immediately above the current codec block and/or the pixels in the two columns immediately to its left.
For example, please refer to fig. 9, which illustrates a schematic diagram of spatial neighborhood pixel positions according to an embodiment of the present application. As shown in fig. 9, there are a reconstructed codec block 92, a reconstructed codec block 93, and a reconstructed codec block 94 above and to the left of the current codec block 91, respectively. The region formed by the reconstructed codec block 92, the reconstructed codec block 93 and the reconstructed codec block 94 includes a first pixel row 91a and a second pixel row 91b spatially adjacent to the current codec block 91, where the first pixel row 91a is a first row of pixels above the current codec block 91, and the second pixel row 91b is a second row of pixels above the current codec block 91; the region formed by the reconstructed codec block 92, the reconstructed codec block 93 and the reconstructed codec block 94 includes a first pixel column 91c and a second pixel column 91d which are spatially adjacent to the current codec block 91, the first pixel column 91c is a first column of pixels on the left of the current codec block 91, and the second pixel column 91d is a second column of pixels on the left of the current codec block 91; in the image shown in fig. 9, the codec may determine each pixel in the first pixel row 91a and the first pixel column 91c as a spatial neighboring pixel of the current codec block 91; alternatively, the codec may determine each pixel in the second pixel row 91b and the second pixel column 91d as a spatial neighboring pixel of the current codec block 91; alternatively, the codec may determine each pixel in the first pixel row 91a, the first pixel column 91c, the second pixel row 91b, and the second pixel column 91d as a spatial neighboring pixel of the current codec block 91.
The scheme shown in fig. 9 uses only some or all of the pixels in the two rows above and two columns to the left of the current codec block as the spatial neighboring pixels of the current codec block 91. In other possible implementations, there may be more or fewer spatial neighboring pixels of the current codec block 91; for example, some or all of the pixels in the three rows above and three columns to the left of the current codec block may be used as its spatial neighboring pixels.
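The neighbor positions discussed above can be enumerated as in the following sketch. The function name and the rows/cols parameters are illustrative; they cover the one-row/one-column and two-row/two-column variants of figs. 9.

```python
def spatial_neighbor_positions(x0, y0, w, h, rows=1, cols=1):
    """Enumerate spatial neighboring pixel positions of the current block.

    (x0, y0) is the block's top-left corner, w/h its width and height.
    rows/cols select how many reconstructed rows above and columns to
    the left are used (1 or 2 in the examples above).
    """
    above = [(x0 + dx, y0 - r) for r in range(1, rows + 1) for dx in range(w)]
    left = [(x0 - c, y0 + dy) for c in range(1, cols + 1) for dy in range(h)]
    return above + left
```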
Step 802, constructing a spatial domain neighboring pixel list based on the pixel information of the target reference pixel in the spatial domain neighboring pixels of the current coding and decoding block.
In one possible implementation, the pixel information of the target reference pixel includes at least one of position information of the target reference pixel and pixel value information of the target reference pixel;
the position information comprises the coordinates of the pixel in the image in which the current codec block is located;
or, the position information comprises the coordinates of the pixel in the current largest coding unit (LCU) row;
or, the position information comprises the coordinates of the pixel on the luma image.
In the embodiment of the present application, the pixel information of a spatial neighboring pixel includes the position of the pixel, such as its coordinates in the image, or its coordinates in the LCU row; for a YUV420 image, the pixel information of a spatial neighboring pixel may also be the coordinates of the pixel on the luma image.
Alternatively, the pixel information of the spatial domain neighboring pixels includes a pixel value of the pixel.
Or, the pixel information of the spatial domain neighboring pixels includes both the position and the pixel value of the pixel.
In one possible implementation, the target reference pixels are all of the spatial neighboring pixels of the current codec block;
or, the target reference pixels are the pixels at specified positions among the spatial neighboring pixels of the current codec block;
or, the target reference pixels are the pixels, among the spatial neighboring pixels of the current codec block, at positions determined based on the size of the current codec block.
In the embodiment of the present application, the target reference pixel may be from the following selectable positions in the spatial neighboring pixels:
1) the target reference pixels comprise reconstructed pixels that are immediately adjacent to the current codec block.
For example, taking the current codec block to have width W and height H (i.e. W pixels wide and H pixels high), the W reconstructed pixels in the row immediately above and the H reconstructed pixels in the column immediately to the left of the current codec block are added to the spatial neighboring pixel list (denoted list L1, for example).
2) The target reference pixels comprise reconstructed pixels that are not immediately adjacent to the current codec block.
For example, the pixel information of the W reconstructed pixels in the second row above the current codec block and the H reconstructed pixels in the second column to the left are added to the list L1.
3) The target reference pixels comprise partially reconstructed pixels that are directly and/or indirectly adjacent to the current codec block.
For example, referring to fig. 10, which shows a numbering diagram of spatial neighboring pixels according to an embodiment of the present application, taking an example that a target reference pixel includes a portion of reconstructed pixels directly adjacent to a current codec block, pixel information with position numbers TL, T0, T [ W/2] (or T [ W/2-1]), T [ W ] (or T [ W-1]), and L0, L [ H/2] (or L [ H/2-1]), L [ H ] (or L [ H-1]) may be added to a spatial neighboring pixel list.
4) The target reference pixels are all or some of the spatially neighboring pixels determined based on the size (e.g., height and/or width) of the current codec block.
In the embodiment of the present application, the codec may also select the target reference pixels according to the size of the current codec block.
For example, taking fig. 10 as an example, if the width of the current codec block is less than 32, W pieces of pixel information with pixel positions { T [0], T [1], …, T [ W-1] } are added to the list L1, and if the height of the current codec block is less than 32, H pieces of pixel information with pixel positions { L [0], L [1], …, L [ H-1] } are added to the list L1.
Or, taking fig. 10 as an example, if the width of the current codec block is less than 32, add W/2 pixels (assuming that W is an even number) with pixel positions { T [0], T [2], …, T [ W-2] } to the list L1, and if the height of the current codec block is greater than 32, add H/2 pixels (assuming that H is an even number) with pixel positions { L [0], L [2], …, L [ H-2] } to the list L1.
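The size-dependent selection of the T[·]/L[·] positions (numbering as in fig. 10, T for the above row and L for the left column) can be sketched as follows. The step parameters generalize the "every pixel" and "every other pixel" examples above; the comparison against the threshold 32 is left to the caller, and the function name is illustrative.

```python
def select_target_positions(w, h, step_w=1, step_h=1):
    """Pick target reference pixel positions T[i] / L[j] for list L1.

    w, h: width and height of the current codec block
    step 1 takes every pixel; step 2 takes every other pixel, as in the
    W/2 and H/2 example above (W and H assumed even there)
    """
    top = [f"T[{i}]" for i in range(0, w, step_w)]
    left = [f"L[{j}]" for j in range(0, h, step_h)]
    return top + left
```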
In a possible implementation manner, when the spatial neighboring pixel list is constructed based on the pixel information of the target reference pixel in the spatial neighboring pixels of the current coding and decoding block, the pixel information of the target reference pixel is filled into the spatial neighboring pixel list according to a specified filling order.
In the embodiment of the present application, when constructing the spatial neighboring pixel list L1, the codec may fill the pixel information of each target reference pixel into the list L1 according to a certain priority order.
In one possible implementation, when the codec fills the pixel information of the target reference pixel into the spatial neighboring pixel list according to the specified filling order, the following operations may be performed:
filling the pixel information of each target reference pixel above the current codec block into the spatial neighboring pixel list, and then filling the pixel information of each target reference pixel to the left of the current codec block into the spatial neighboring pixel list;
or, filling the pixel information of each target reference pixel to the left of the current codec block into the spatial neighboring pixel list, and then filling the pixel information of each target reference pixel above the current codec block into the spatial neighboring pixel list;
or, alternately filling the pixel information of the target reference pixels above the current codec block and of the target reference pixels to the left of the current codec block into the spatial neighboring pixel list.
In the embodiment of the present application, the codec may fill the pixel information of each target reference pixel into the list L1 in the following selectable order:
1) filling target reference pixels above the current coding and decoding block, and filling target reference pixels on the left of the current coding and decoding block;
for example, the codec fills the list L1 with the pixel information of each target reference pixel above the current codec block in sequence from left to right, and fills the list L1 with the pixel information of each target reference pixel on the left of the current codec block in sequence from top to bottom after the pixel information of each target reference pixel above the current codec block is filled.
2) Filling target reference pixels on the left side of the current coding and decoding block, and filling target reference pixels above the current coding and decoding block;
for example, the codec fills the list L1 with the pixel information of each target reference pixel on the left of the current codec block in sequence from top to bottom, and fills the list L1 with the pixel information of each target reference pixel on the top of the current codec block in sequence from left to right after the pixel information of each target reference pixel on the left of the current codec block is filled.
3) Alternately filling the target reference pixels to the left of and above the current codec block; for example, taking fig. 10 as an example, the codec may sequentially fill the pixel information of the target reference pixels into the list L1 in the order L[0], T[0], L[1], T[1], …, L[H-1], T[W-1].
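The three fill orders can be sketched as follows (a hypothetical helper: `top` is assumed to hold T[0..W-1] left-to-right and `left` to hold L[0..H-1] top-to-bottom):

```python
def fill_order(top, left, mode):
    """Return pixel information entries in one of the three fill orders.

    mode: 'top_first'  -> all of top, then all of left
          'left_first' -> all of left, then all of top
          'alternate'  -> L[0], T[0], L[1], T[1], ...; any leftover
                          entries of the longer side follow at the end
    """
    if mode == 'top_first':
        return top + left
    if mode == 'left_first':
        return left + top
    order = []
    for l_px, t_px in zip(left, top):   # interleave pairwise
        order += [l_px, t_px]
    longer = left if len(left) > len(top) else top
    order += longer[min(len(left), len(top)):]
    return order
```

For a non-square block the alternating order is only fully specified in the text up to min(W, H) pairs; appending the remainder of the longer side is an assumption of this sketch.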
In one possible implementation, when the pixel information of the target reference pixel is filled into the spatial neighboring pixel list according to a specified filling order, the codec performs the following operations:
for a first reference pixel, obtaining the absolute value of the difference between the pixel value of the first reference pixel and the pixel value of each existing reference pixel in the spatial neighboring pixel list; the first reference pixel is any one of the target reference pixels;
in response to the absolute value of the difference being greater than a first absolute value threshold, filling pixel information of the first reference pixel into the spatial neighborhood pixel list.
In a possible implementation manner, the difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial neighboring pixel list includes a difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial neighboring pixel list in the luminance component and the chrominance component;
or,
the difference between the pixel values of the first reference pixel and the existing reference pixels in the spatial neighboring pixel list comprises the difference between the pixel values of the first reference pixel and the existing reference pixels in the spatial neighboring pixel list in the luminance component.
In the embodiment of the present application, for a first reference pixel in each target reference pixel, when the pixel information of the first reference pixel needs to be filled into the list L1 according to the filling order, the codec may perform the following duplication checking policy:
1) corresponding pixel information is directly filled into a list L1 without duplicate checking;
2) comparing against the Y, U and V component pixel values of the existing pixels in the list L1: the pixel information of the first reference pixel is filled into the list L1 when the absolute value of its difference from any pixel in the list, in each component, is greater than a preset threshold (a preset threshold of 0 means that repeated component pixel values are excluded).
3) comparing against only the Y component pixel values of the existing pixels in the list L1: the pixel information of the first reference pixel is filled into the list L1 when the absolute value of its difference from any pixel in the list, in the Y component, is greater than the preset threshold.
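The three duplication-checking policies can be sketched as follows (illustrative only: pixels are assumed to be (Y, U, V) tuples, and the policy names are invented for this example):

```python
def try_fill(pixel, l1, policy='luma', threshold=0):
    """Attempt to append `pixel` (a (Y, U, V) tuple) to list l1.

    policy 'none'           -> no duplicate check, always append
    policy 'all_components' -> duplicate if some existing pixel matches
                               in Y, U and V within the threshold
    policy 'luma'           -> duplicate if some existing pixel matches
                               in Y within the threshold
    Returns True if the pixel was appended.
    """
    if policy == 'none':
        l1.append(pixel)
        return True
    if policy == 'all_components':
        dup = any(all(abs(pixel[c] - q[c]) <= threshold for c in range(3))
                  for q in l1)
    else:  # 'luma'
        dup = any(abs(pixel[0] - q[0]) <= threshold for q in l1)
    if not dup:
        l1.append(pixel)
    return not dup
```

With `threshold=0`, the Y/U/V policy rejects only exact triples, while the Y-only policy rejects any pixel whose luma already appears in the list.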
In one possible implementation, when the first reference pixel is not available, the pixel value of the nearest available reference pixel of the first reference pixel is taken as the pixel value of the first reference pixel; for example, using a rule like intra prediction reference pixel extension, the pixel value of the first reference pixel is set to the value of the nearest available reference pixel.
Alternatively, when a first reference pixel is not available, the pixel value of the first reference pixel is set to a default value.
Alternatively, when the first reference pixel is unavailable, the first reference pixel is skipped.
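The three ways of handling an unavailable reference pixel can be sketched as follows (an illustrative sketch: `None` marks an unavailable pixel, the default value is an assumption, and "nearest available" is approximated by the most recently scanned available pixel, in the spirit of intra prediction reference pixel extension):

```python
def resolve_unavailable(pixels, default=(128, 128, 128), mode='nearest'):
    """Resolve None entries in a scan-ordered list of reference pixels.

    mode 'nearest' -> substitute the most recent available pixel value
    mode 'default' -> substitute a fixed default value
    mode 'skip'    -> drop the unavailable pixel entirely
    An unavailable pixel with no prior available pixel is dropped.
    """
    out, last = [], None
    for p in pixels:
        if p is not None:
            out.append(p)
            last = p
        elif mode == 'nearest' and last is not None:
            out.append(last)
        elif mode == 'default':
            out.append(default)
        # mode 'skip' (or 'nearest' with no prior pixel): drop
    return out
```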
Step 803: constructing a preselected reference pixel candidate list of the current codec block based on the spatial neighboring pixel list.
In one possible implementation, in constructing the pre-selected candidate list of reference pixels for the current codec block based on the spatial neighboring pixel list, the codec may perform the following operations:
acquiring the spatial domain adjacent pixel list as a preselected reference pixel candidate list of the current coding and decoding block;
or,
merging the spatial adjacent pixel list and the historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current coding and decoding block; the historical reference pixel candidate list is constructed based on a reference pixel candidate list of a reconstructed coding and decoding block;
or,
filtering and sorting the historical reference pixel candidate list based on the spatial neighboring pixel list to obtain a preselected reference pixel candidate list of the current codec block.
In this embodiment, the codec may derive a preselected reference pixel candidate list of the current codec block, such as a list L, based on the spatial neighboring pixel list, where the length of the list L may be set to N, and the maximum value of N is smaller than a preset threshold N_T. The codec may derive the list L in the following alternative ways:
1) the list L consists of the list L1, i.e. N = N1, where N1 is the length of the list L1;
2) the list L consists of the list L1 and a list L2, where the list L2 records the equivalent string reference pixel information of historically decoded blocks and has a length of N2.
In one possible implementation, the merging the spatial neighboring pixel list and the historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current codec block includes:
sequentially filling each pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list, and sequentially filling each pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list until the filling of each pixel information in the historical reference pixel candidate list is completed, or until the quantity of the pixel information in the preselected reference pixel candidate list reaches a quantity threshold;
or,
filling each pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list in sequence, and then filling each pixel information in the spatial neighboring pixel list into the preselected reference pixel candidate list in sequence, until the filling of each pixel information in the spatial neighboring pixel list is completed, or until the number of pixel information entries in the preselected reference pixel candidate list reaches a number threshold.
For example, when list L consists of list L1 and list L2, the codec may derive list L in the following alternative ways:
1) filling each pixel information in the list L1 into the list L, and then filling each pixel information in the list L2 into the list L; if the length of the list L reaches N_T during the filling process, or the pixel information in both the list L1 and the list L2 has been completely filled, the filling is ended and the list L is obtained.
2) filling each pixel information in the list L2 into the list L, and then filling each pixel information in the list L1 into the list L; if the length of the list L reaches N_T during the filling process, or the pixel information in both the list L1 and the list L2 has been completely filled, the filling is ended and the list L is obtained.
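The merge with the N_T length cap can be sketched as follows (illustrative; list entries stand for pixel information, and duplicate checking is omitted here since it is described separately below):

```python
def merge_lists(l1, l2, n_t, l1_first=True):
    """Merge spatial list l1 and history list l2 into the preselected
    list L, in either order, stopping once both sources are exhausted
    or the length of L reaches the cap n_t."""
    first, second = (l1, l2) if l1_first else (l2, l1)
    merged = []
    for px in first + second:
        if len(merged) >= n_t:   # length reached N_T: end the filling
            break
        merged.append(px)
    return merged
```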
In one possible implementation, the merging the spatial neighboring pixel list and the historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current codec block includes:
for first pixel information, acquiring an absolute value of a difference value between a pixel value corresponding to the first pixel information and a pixel value corresponding to each piece of pixel information existing in the preselected reference pixel candidate list; the first pixel information is any one of the spatial neighboring pixel list and the historical reference pixel candidate list;
in response to the absolute value of the difference being greater than a second absolute value threshold, the first pixel information is populated into the preselected reference pixel candidate list.
In the embodiment of the present application, when the codec fills one pixel information in the list L1 or the list L2 into the list L in the process of constructing the list L, the codec may perform a duplicate checking on the pixel information through a duplicate checking policy, that is, it is checked whether there is pixel information in the list L where the current pixel information corresponds to the same or similar pixel value.
The duplication checking strategy in the process of filling the list L with information of one pixel in the list L1 or the list L2 is similar to the duplication checking strategy in the process of filling the list L1 with information of the first reference pixel, and is not described herein again.
In one possible implementation, when the historical reference pixel candidate list is filtered and sorted based on the spatial neighboring pixel list to obtain the preselected reference pixel candidate list of the current codec block, the codec may perform the following operations:
for second pixel information, obtaining the absolute value of the difference between the pixel value corresponding to the second pixel information and the pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the second pixel information is any one of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the second pixel information and the pixel value corresponding to fourth pixel information being less than a third absolute value threshold, populating the second pixel information into the preselected reference pixel candidate list; the fourth pixel information is any one piece of pixel information in the spatial neighboring pixel list.
In the embodiment of the present application, the codec may perform filtering sorting on the list L2 according to the list L1 to obtain a list L3, where the list L may be composed of the list L3. In an exemplary scenario, the above procedure for obtaining the list L3 may be as follows:
assuming that the length of the list L2 is N2, the pixel values of each piece of pixel information in the list L2 are sequentially compared with the pixel values of each piece of pixel information in the list L1 in a specified order (e.g., forward/reverse order), and if the absolute value of the difference between the pixel value of one piece of pixel information in the list L2 and the pixel value of any one piece of pixel information in the list L1 is less than or equal to a third absolute value threshold, the pixel information is filled into the list L3.
When the pixel value of one pixel information in the list L2 is sequentially compared with the pixel values of the pixel information in the list L1, Y, U, V component pixel values of the two pixel information may be compared, and if the absolute value of the difference between the pixel values of the components of the two pixel information is less than or equal to the third absolute value threshold, the pixel information in the list L2 is filled into the list L3.
Alternatively, when the pixel value of one pixel information in the list L2 is sequentially compared with the pixel values of the respective pixel information in the list L1, the Y component pixel values of the two pixel information may be compared, and if the absolute value of the difference between the Y component pixel values of the two pixel information is less than or equal to the third absolute value threshold, the pixel information in the list L2 is filled into the list L3.
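The filtering of the list L2 against the list L1 described above can be sketched as follows (illustrative; pixels are assumed to be (Y, U, V) tuples, and `luma_only` selects between the Y-only and Y/U/V comparison variants):

```python
def filter_history(l2, l1, threshold, luma_only=True):
    """Build list L3: keep the entries of history list l2 whose value is
    within `threshold` (per compared component) of some entry of the
    spatial neighboring list l1."""
    def close(p, q):
        comps = (0,) if luma_only else (0, 1, 2)   # Y only, or Y/U/V
        return all(abs(p[c] - q[c]) <= threshold for c in comps)
    return [p for p in l2 if any(close(p, q) for q in l1)]
```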
In one possible implementation, when the historical reference pixel candidate list is filtered and sorted based on the spatial neighboring pixel list to obtain the preselected reference pixel candidate list of the current codec block, the codec performs the following operations:
for the fifth pixel information, obtaining the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the fifth pixel information is any one of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to sixth pixel information being less than or equal to a fourth absolute value threshold, filling the fifth pixel information into a first candidate list; the sixth pixel information is any one piece of pixel information in the spatial neighboring pixel list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to the sixth pixel information being greater than the fourth absolute value threshold, filling the fifth pixel information into a second candidate list;
filling each pixel information in the first candidate list and each pixel information in the second candidate list into the preselected reference pixel candidate list in turn; the position of each pixel information in the first candidate list in the preselected reference pixel candidate list precedes the position of each pixel information in the second candidate list in the preselected reference pixel candidate list.
In the embodiment of the present application, the codec may perform filtering sorting on the list L2 according to the list L1 to obtain a list L3 and a list L4, where the list L may be composed of a list L3 and a list L4. In an exemplary scenario, the above procedure for obtaining the list L may be as follows:
Assuming that the length of the list L2 is N2, the pixel values of each piece of pixel information in the list L2 are sequentially compared, in a specified order (e.g., forward/reverse order), with the pixel values of each piece of pixel information in the list L1. If the absolute value of the difference between the pixel value of a piece of pixel information in the list L2 and the pixel value of any piece of pixel information in the list L1 is less than or equal to the fourth absolute value threshold, that pixel information is filled into the list L3; if the absolute value of the difference is greater than the fourth absolute value threshold, that pixel information is filled into the list L4. The list L is then composed of the list L3 followed by the list L4, so that the pixel information belonging to the list L3 is ranked ahead of the pixel information belonging to the list L4.
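The L3/L4 partition and concatenation can be sketched as follows (illustrative; it reuses the same closeness test as the filtering variant, with `luma_only` again choosing the compared components):

```python
def partition_history(l2, l1, threshold, luma_only=True):
    """Split history list l2 into l3 (entries close to some entry of the
    spatial list l1, within `threshold` per compared component) and l4
    (the rest), then return the preselected list L = l3 + l4."""
    def close(p, q):
        comps = (0,) if luma_only else (0, 1, 2)
        return all(abs(p[c] - q[c]) <= threshold for c in comps)
    l3, l4 = [], []
    for p in l2:
        (l3 if any(close(p, q) for q in l1) else l4).append(p)
    return l3 + l4   # l3 entries are ranked ahead of l4 entries
```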
Here, the preselected reference pixel candidate list corresponds to the list PrevHpvpCandList.
Step 804, encoding/decoding the current encoding/decoding block based on the preselected reference pixel candidate list.
In the embodiment of the present application, after the codec constructs the preselected reference pixel candidate list, the codec may encode or decode the current codec block based on the preselected reference pixel candidate list.
The decoding method of the current codec block may be as follows:
1) deriving the reference pixel candidate list for the current codec block may be done as follows:
mode a: without decoding reuse_flag, use the list L as the initial reference pixel candidate list;
mode b: decode reuse_flag, and derive the initial reference pixel candidate list using a subset of the list L.
2) Decoding the length of the equivalent string in the current coding and decoding block;
3) decoding the reference pixels of the equivalent string in the current codec block, where the decoding process includes:
and taking the position of the reference pixel from the reference pixel candidate list according to the decoded index idx, and then deriving the pixel value from the position to be used as a predicted value of the current string.
And if idx is larger than the length of the reference pixel candidate list, directly decoding from the code stream to obtain the numerical value of the reference pixel as the predicted value of the current string, and expanding the reference pixel candidate list by using the numerical value.
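Step 3) above, deriving the predicted value from the index idx with the escape case, can be sketched as follows (illustrative; reading the value from the bitstream is abstracted into the `bitstream_value` parameter, and the sketch treats any out-of-range idx as the escape case):

```python
def predict_string(cand_list, idx, bitstream_value=None):
    """Derive the current string's predicted value from index idx.

    If idx addresses an entry of the candidate list, use that entry;
    otherwise (escape case) take the value decoded from the bitstream
    as the prediction and extend the candidate list with it.
    """
    if idx < len(cand_list):
        return cand_list[idx]
    cand_list.append(bitstream_value)   # extend the list with the new value
    return bitstream_value
```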
In summary, according to the scheme shown in the embodiment of the present application, before the current codec block is encoded/decoded, the preselected reference pixel candidate list of the current codec block is constructed from the information of the reconstructed pixels adjacent to the current codec block. In the subsequent encoding/decoding of the current codec block, the spatially neighboring pixels are thus introduced as references for equivalent string prediction, which expands the reference pixel selection range in the equivalent string mode and thereby improves the coding/decoding efficiency of the equivalent string mode.
Referring to fig. 11, a flow chart of the equivalent string prediction process provided in an embodiment of the present application is shown. As shown in fig. 11, before the codec encodes/decodes the current codec block 1101, a list L1 is constructed based on the positions of the spatially neighboring pixels of the current codec block 1101 in the reconstructed codec blocks (step S1). An initial reference pixel candidate list 1102 of the current codec block 1101 is then derived by combining the list L1 with a list L2 (e.g., PrevHpvpCandList) constructed and updated in the process of encoding/decoding the previous codec blocks (step S2). The equivalent strings in the current codec block 1101 are then encoded/decoded based on the reference pixel candidate list 1102, during which the reference pixel candidate list 1102 may be updated (step S3). After the encoding/decoding of the current codec block 1101 is finished, the list L2 is updated from the reference pixel candidate list 1102.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 12, a block diagram of a reference pixel candidate list construction apparatus according to an embodiment of the present application is shown. The device has the functions of realizing the method examples, and the functions can be realized by hardware executing corresponding software. The device may be the computer device described above, or may be provided on a computer device. The apparatus may include:
a pixel determining module 1201, configured to determine, in response to video encoding/decoding being performed in the equivalent string mode, the spatial neighboring pixels of the current codec block, where the spatial neighboring pixels are reconstructed pixels whose distance from the current codec block is within a specified distance range;
a neighboring pixel list constructing module 1202, configured to construct a spatial neighboring pixel list of the current codec block based on the pixel information of target reference pixels among the spatial neighboring pixels of the current codec block;
a reference pixel list constructing module 1203, configured to construct a preselected reference pixel candidate list of the current codec block based on the spatial neighboring pixel list.
In one possible implementation, the target reference pixels are all of spatially adjacent pixels of the current coding and decoding block;
or, the target reference pixel is a pixel at a specified position in the spatial domain neighboring pixels of the current coding and decoding block;
or, the target reference pixel is a pixel at a position determined based on the size of the current coding and decoding block in a spatial neighborhood of the current coding and decoding block.
In one possible implementation, the pixel information of the target reference pixel includes at least one of position information of the target reference pixel and pixel value information of the target reference pixel;
the position information includes the coordinates of the corresponding pixel in the image containing the current image block;
or, the position information comprises coordinates of the corresponding pixel in a row of largest coding units, LCUs;
alternatively, the position information includes coordinates of the corresponding pixel on the luminance image.
In a possible implementation manner, the neighboring pixel list constructing module 1202 is configured to fill the pixel information of the target reference pixel into the spatial neighboring pixel list according to a specified filling order.
In one possible implementation, the neighbor list construction module 1202 is configured to,
filling the pixel information of each target reference pixel above the current codec block into the spatial neighboring pixel list, and then filling the pixel information of each target reference pixel to the left of the current codec block into the spatial neighboring pixel list;
or, filling the pixel information of each target reference pixel to the left of the current codec block into the spatial neighboring pixel list, and then filling the pixel information of each target reference pixel above the current codec block into the spatial neighboring pixel list;
or, alternately filling the pixel information of the target reference pixels above the current codec block and the target reference pixels to the left of the current codec block into the spatial neighboring pixel list.
In one possible implementation, the neighbor list construction module 1202 is configured to,
for a first reference pixel, obtaining the absolute value of the difference between the pixel value of the first reference pixel and the pixel value of each existing reference pixel in the spatial neighboring pixel list; the first reference pixel is any one of the target reference pixels;
in response to the absolute value of the difference being greater than a first absolute value threshold, filling pixel information of the first reference pixel into the spatial neighborhood pixel list.
In a possible implementation manner, the difference between the pixel values of the first reference pixel and each existing reference pixel in the spatial neighboring pixel list includes a difference between the pixel values of the first reference pixel and each existing reference pixel in the spatial neighboring pixel list in a luminance component and a chrominance component;
or,
the difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain neighboring pixel list includes a difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain neighboring pixel list in the luminance component.
In one possible implementation, the neighbor list construction module 1202 is further configured to,
when the first reference pixel is unavailable, taking a pixel value of a nearest available reference pixel of the first reference pixel as a pixel value of the first reference pixel;
or, when the first reference pixel is not available, setting the pixel value of the first reference pixel to a default value;
alternatively, when the first reference pixel is unavailable, skipping the first reference pixel.
In a possible implementation manner, the reference pixel list constructing module 1203 includes: a first list acquisition unit, a second list acquisition unit or a third list acquisition unit,
the first list acquisition unit is used for acquiring the spatial domain adjacent pixel list as a preselected reference pixel candidate list of the current coding and decoding block;
the second list acquisition unit is used for merging the spatial domain adjacent pixel list and a historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current coding and decoding block; the historical reference pixel candidate list is constructed based on a reference pixel candidate list of a reconstructed codec block;
and the third list acquisition unit is configured to filter and sort the historical reference pixel candidate list based on the spatial neighboring pixel list to obtain a preselected reference pixel candidate list of the current codec block.
In a possible implementation manner, the second list obtaining unit is configured to,
sequentially filling each pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list, and sequentially filling each pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list until the filling of each pixel information in the historical reference pixel candidate list is completed, or until the quantity of the pixel information in the preselected reference pixel candidate list reaches a quantity threshold;
or,
and sequentially filling each pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list, and sequentially filling each pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list until the filling of each pixel information in the spatial domain adjacent pixel list is completed, or until the number of the pixel information in the preselected reference pixel candidate list reaches a number threshold.
In a possible implementation manner, the second list obtaining unit is configured to,
for first pixel information, obtaining an absolute value of a difference value between a pixel value corresponding to the first pixel information and a pixel value corresponding to each piece of pixel information existing in the preselected reference pixel candidate list; the first pixel information is any one of the spatial neighboring pixel list and the historical reference pixel candidate list;
in response to the absolute value of the difference being greater than a second absolute value threshold, populating the first pixel information into the preselected reference pixel candidate list.
In a possible implementation manner, the third list obtaining unit is configured to,
for second pixel information, obtaining an absolute value of a difference value between a pixel value corresponding to the second pixel information and a pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the second pixel information is any one piece of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the second pixel information and the pixel value corresponding to fourth pixel information being less than a third absolute value threshold, populating the second pixel information into the preselected reference pixel candidate list; the fourth pixel information is any one piece of pixel information in the spatial neighboring pixel list.
In a possible implementation manner, the third list obtaining unit is configured to,
for fifth pixel information, obtaining an absolute value of a difference value between a pixel value corresponding to the fifth pixel information and a pixel value corresponding to each pixel information in the spatial domain adjacent pixel list; the fifth pixel information is any one of pixel information sequentially selected from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to sixth pixel information being less than or equal to a fourth absolute value threshold, filling the fifth pixel information into a first candidate list; the sixth pixel information is any one piece of pixel information in the spatial neighboring pixel list;
in response to the pixel value corresponding to the fifth pixel information and the absolute value of the difference between the pixel values corresponding to the sixth pixel information being greater than a fourth absolute value threshold, populating the fifth pixel information into a second candidate list;
sequentially filling each pixel information in the first candidate list and each pixel information in the second candidate list into the preselected reference pixel candidate list; the position of each pixel information in the first candidate list in the preselected reference pixel candidate list is located before the position of each pixel information in the second candidate list in the preselected reference pixel candidate list.
In summary, according to the scheme shown in the embodiment of the present application, before the current codec block is encoded/decoded, the preselected reference pixel candidate list of the current codec block is constructed from the information of the reconstructed pixels adjacent to the current codec block. In the subsequent encoding/decoding of the current codec block, the spatially neighboring pixels are thus introduced as references for equivalent string prediction, which expands the reference pixel selection range in the equivalent string mode and thereby improves the coding/decoding efficiency of the equivalent string mode.
Referring to fig. 13, a block diagram of a computer device according to an embodiment of the present application is shown. The computer device may be the encoding side device described above, or may be the decoding side device described above. The computer device 130 may include: processor 131, memory 132, communication interface 133, encoder/decoder 134, and bus 135.
The processor 131 includes one or more processing cores, and performs various functional applications and information processing by running software programs and modules.
The memory 132 may be used to store a computer program for execution by the processor 131 to implement the reference pixel candidate list construction method described above.
The communication interface 133 may be used to communicate with other devices, such as to transmit and receive audio-visual data.
The encoder/decoder 134 may be used to perform encoding and decoding functions, such as encoding and decoding audio-visual data.
The memory 132 is connected to the processor 131 by a bus 135.
Further, the memory 132 may be implemented by any type or combination of volatile or non-volatile storage devices, including, but not limited to: magnetic or optical disk, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM (Erasable Programmable Read-Only Memory), SRAM (Static Random-Access Memory), ROM (Read-Only Memory), magnetic Memory, flash Memory, PROM (Programmable Read-Only Memory).
Those skilled in the art will appreciate that the configuration shown in FIG. 13 is not intended to be limiting of the computer device 130, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
In an exemplary embodiment, there is also provided a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions which, when executed by a processor, implements the above-described reference pixel candidate list construction method.
In an exemplary embodiment, a computer program product or computer program is also provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the reference pixel candidate list construction method.
It should be understood that reference to "a plurality" herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method for constructing a candidate list of reference pixels, the method comprising:
in response to video coding and decoding being performed in an equivalent string mode, determining spatial domain adjacent pixels of a current coding and decoding block, wherein the spatial domain adjacent pixels are reconstructed pixels whose distance from the current coding and decoding block is within a specified distance range;
constructing a spatial domain adjacent pixel list of the current coding and decoding block based on the pixel information of the target reference pixel in the spatial domain adjacent pixels of the current coding and decoding block;
and constructing a preselected reference pixel candidate list of the current coding and decoding block based on the spatial adjacent pixel list.
2. The method of claim 1,
the target reference pixels are all pixels in the spatial domain adjacent pixels of the current coding and decoding block;
or, the target reference pixel is a pixel at a specified position in the spatial domain neighboring pixels of the current coding and decoding block;
or, the target reference pixel is a pixel at a position determined based on the size of the current coding and decoding block in a spatial neighborhood of the current coding and decoding block.
3. The method according to claim 1, wherein the pixel information of the target reference pixel comprises at least one of position information of the target reference pixel and pixel value information of the target reference pixel;
the position information comprises coordinates of the corresponding pixel in the image in which the current image block is located;
or, the position information comprises coordinates of the corresponding pixel in a row of largest coding units, LCUs;
alternatively, the position information includes coordinates of the corresponding pixel on the luminance image.
4. The method of claim 1, wherein constructing the list of spatially neighboring pixels based on pixel information of a target reference pixel of the spatially neighboring pixels of the current codec block comprises:
and filling the pixel information of the target reference pixel into the spatial adjacent pixel list according to the specified filling sequence.
5. The method according to claim 4, wherein the filling the pixel information of the target reference pixel into the spatial neighboring pixel list according to the specified filling order comprises:
filling pixel information of each target reference pixel located above the current coding and decoding block into the spatial domain adjacent pixel list, and then filling pixel information of each target reference pixel located to the left of the current coding and decoding block into the spatial domain adjacent pixel list;
or, after filling the pixel information of each target reference pixel located to the left of the current coding and decoding block into the spatial domain adjacent pixel list, filling the pixel information of each target reference pixel located above the current coding and decoding block into the spatial domain adjacent pixel list;
or, alternately filling the pixel information of each target reference pixel above the current coding and decoding block and the pixel information of each target reference pixel to the left of the current coding and decoding block into the spatial domain adjacent pixel list.
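The three fill orders in this claim (above-first, left-first, and alternating) can be sketched as follows; the `above` and `left` argument names and the `order` keyword are illustrative assumptions, not terms from the patent.

```python
from itertools import chain, zip_longest

def fill_spatial_list(above, left, order="above_first"):
    """Build the spatial domain adjacent pixel list in one of three orders."""
    if order == "above_first":      # above pixels first, then left pixels
        return list(above) + list(left)
    if order == "left_first":       # left pixels first, then above pixels
        return list(left) + list(above)
    if order == "alternate":        # interleave above/left; longer side's tail last
        _sentinel = object()
        pairs = zip_longest(above, left, fillvalue=_sentinel)
        return [p for p in chain.from_iterable(pairs) if p is not _sentinel]
    raise ValueError(f"unknown fill order: {order}")
```

For example, with two above pixels and one left pixel, the alternating order interleaves them and appends the remaining above pixel at the end.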
6. The method according to claim 4, wherein the filling the pixel information of the target reference pixel into the spatial neighboring pixel list according to the specified filling order comprises:
for the first reference pixel, obtaining the absolute value of the difference value between the first reference pixel and the pixel value of each existing reference pixel in the spatial domain adjacent pixel list; the first reference pixel is any one of the target reference pixels;
in response to the absolute value of the difference being greater than a first absolute value threshold, filling pixel information of the first reference pixel into the spatial neighborhood pixel list.
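A minimal sketch of the pruning rule in this claim: a target reference pixel is appended only when its value differs from every pixel already in the list by more than the first absolute value threshold. Names are hypothetical and pixel values are simplified to single integers.

```python
def fill_with_pruning(target_pixels, threshold1):
    """Fill the spatial list, skipping near-duplicates of existing entries."""
    spatial_list = []
    for pixel in target_pixels:  # each "first reference pixel", in fill order
        # keep only if sufficiently different from every existing entry
        if all(abs(pixel - existing) > threshold1 for existing in spatial_list):
            spatial_list.append(pixel)
    return spatial_list
```

With `threshold1 = 2`, the pixel 11 is dropped as a near-duplicate of 10, and 21 is dropped as a near-duplicate of 20.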
7. The method of claim 6,
the difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain adjacent pixel list comprises differences between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain adjacent pixel list in the luminance component and the chrominance components;
or,
the difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain neighboring pixel list includes a difference between the pixel values of the first reference pixel and each of the existing reference pixels in the spatial domain neighboring pixel list in the luminance component.
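The two difference definitions in this claim can be sketched as follows, with a pixel value modeled as a hypothetical `(Y, Cb, Cr)` tuple. How the luma and chroma differences are combined is not specified by the claim; summing the per-component absolute differences is an assumption made here for illustration.

```python
def pixel_diff(a, b, luma_only=False):
    """Absolute difference between two (Y, Cb, Cr) pixel values.

    luma_only=True  -> compare on the luminance component only
    luma_only=False -> combine luma and chroma differences (here: their sum)
    """
    if luma_only:
        return abs(a[0] - b[0])
    return sum(abs(x - y) for x, y in zip(a, b))
```

The luma-only variant is cheaper and is the kind of simplification a codec might use when chroma is subsampled.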
8. The method of claim 6, further comprising:
when the first reference pixel is unavailable, taking a pixel value of a nearest available reference pixel of the first reference pixel as a pixel value of the first reference pixel;
or, when the first reference pixel is not available, setting the pixel value of the first reference pixel to a default value;
alternatively, when the first reference pixel is unavailable, skipping the first reference pixel.
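The three fallbacks for unavailable reference pixels can be sketched as follows. This sketch marks unavailable pixels with `None`, takes "nearest available" to mean the nearest previously visited available pixel, and uses 128 as an illustrative default value; all of these are assumptions, not details fixed by the claim.

```python
def resolve_unavailable(pixels, policy="nearest", default=128):
    """Apply one of the three fallbacks for unavailable reference pixels.

    nearest -> reuse the value of the nearest preceding available pixel
    default -> substitute a fixed default value
    skip    -> drop unavailable pixels altogether
    """
    out, last = [], None
    for p in pixels:
        if p is not None:           # available: keep and remember it
            out.append(p)
            last = p
        elif policy == "nearest" and last is not None:
            out.append(last)        # reuse nearest available value
        elif policy == "default":
            out.append(default)     # substitute the default value
        # policy == "skip" (or no available neighbor yet): drop the pixel
    return out
```

For the input `[5, None, 7]` the three policies yield `[5, 5, 7]`, `[5, 128, 7]`, and `[5, 7]` respectively.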
9. The method of claim 1, wherein constructing a pre-selected candidate list of reference pixels for the current codec block based on the spatial neighborhood pixel list comprises:
acquiring the spatial domain adjacent pixel list as a preselected reference pixel candidate list of the current coding and decoding block;
or,
merging the spatial adjacent pixel list and a historical reference pixel candidate list to obtain a preselected reference pixel candidate list of the current coding and decoding block; the historical reference pixel candidate list is constructed based on a reference pixel candidate list of a reconstructed codec block;
or,
and arranging the historical reference pixel candidate list based on the spatial adjacent pixel list to obtain a preselected reference pixel candidate list of the current coding and decoding block.
10. The method of claim 9, wherein said merging said spatial neighborhood pixel list with a historical reference pixel candidate list to obtain a preselected reference pixel candidate list for said current codec block comprises:
sequentially filling each piece of pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list, and then sequentially filling each piece of pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list, until the filling of each piece of pixel information in the historical reference pixel candidate list is completed, or until the number of pieces of pixel information in the preselected reference pixel candidate list reaches a number threshold;
or,
and sequentially filling each pixel information in the historical reference pixel candidate list into the preselected reference pixel candidate list, and sequentially filling each pixel information in the spatial domain adjacent pixel list into the preselected reference pixel candidate list until the filling of each pixel information in the spatial domain adjacent pixel list is completed, or until the number of the pixel information in the preselected reference pixel candidate list reaches a number threshold.
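The sequential merge with a count cap can be sketched as follows; `max_count` stands in for the claim's number threshold, and the function names are illustrative. Passing the spatial list first gives the first option of the claim; swapping the arguments gives the history-first option.

```python
def merge_lists(primary, secondary, max_count):
    """Fill `primary` entries first, then `secondary`, capped at max_count."""
    merged = []
    for pixel in primary + secondary:
        if len(merged) >= max_count:  # number threshold reached: stop filling
            break
        merged.append(pixel)
    return merged
```

For example, merging a two-entry spatial list with a three-entry history list under a cap of 4 keeps both spatial entries and the first two history entries.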
11. The method of claim 10, wherein said merging said spatial neighborhood pixel list with a historical reference pixel candidate list to obtain a preselected reference pixel candidate list for said current codec block comprises:
for first pixel information, obtaining an absolute value of a difference between a pixel value corresponding to the first pixel information and a pixel value corresponding to each piece of pixel information existing in the preselected reference pixel candidate list; the first pixel information is any piece of pixel information in the spatial domain adjacent pixel list or the historical reference pixel candidate list;
in response to the absolute value of the difference being greater than a second absolute value threshold, populating the first pixel information into the preselected reference pixel candidate list.
12. The method of claim 9, wherein said ranking said historical reference pixel candidate list based on said spatial neighborhood pixel list to obtain a preselected reference pixel candidate list for said current codec block comprises:
for second pixel information, obtaining an absolute value of a difference between a pixel value corresponding to the second pixel information and a pixel value corresponding to each piece of pixel information in the spatial domain adjacent pixel list; the second pixel information is any piece of pixel information selected sequentially from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the second pixel information and a pixel value corresponding to fourth pixel information being less than a third absolute value threshold, filling the second pixel information into the preselected reference pixel candidate list; the fourth pixel information is any piece of pixel information in the spatial domain adjacent pixel list.
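A sketch of this filtering step: a history entry is admitted to the preselected list only when it is close (difference below the third threshold) to at least one spatial neighbor. The names, the plain-integer pixel values, and the "at least one neighbor" reading of "any one piece of pixel information" are assumptions for illustration.

```python
def filter_history_by_spatial(history_list, spatial_list, threshold3):
    """Keep only history entries close to some spatial neighbor."""
    preselected = []
    for pixel in history_list:  # "second pixel information", in order
        if any(abs(pixel - neighbor) < threshold3 for neighbor in spatial_list):
            preselected.append(pixel)
    return preselected
```

With `threshold3 = 3` and a single spatial neighbor of 11, the history entries 10 and 12 survive while 100 is discarded.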
13. The method of claim 9, wherein said ranking said historical reference pixel candidate list based on said spatial neighborhood pixel list to obtain a preselected reference pixel candidate list for said current codec block comprises:
for fifth pixel information, obtaining an absolute value of a difference between a pixel value corresponding to the fifth pixel information and a pixel value corresponding to each piece of pixel information in the spatial domain adjacent pixel list; the fifth pixel information is any piece of pixel information selected sequentially from the historical reference pixel candidate list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and a pixel value corresponding to sixth pixel information being less than or equal to a fourth absolute value threshold, filling the fifth pixel information into a first candidate list; the sixth pixel information is any piece of pixel information in the spatial domain adjacent pixel list;
in response to the absolute value of the difference between the pixel value corresponding to the fifth pixel information and the pixel value corresponding to the sixth pixel information being greater than the fourth absolute value threshold, filling the fifth pixel information into a second candidate list;
sequentially filling each piece of pixel information in the first candidate list and each piece of pixel information in the second candidate list into the preselected reference pixel candidate list; the position of each piece of pixel information from the first candidate list in the preselected reference pixel candidate list precedes the position of each piece of pixel information from the second candidate list.
14. An apparatus for constructing a candidate list of reference pixels, the apparatus comprising:
the pixel determination module is used for determining, in response to video coding and decoding being performed in an equivalent string mode, spatial domain adjacent pixels of a current coding and decoding block, wherein the spatial domain adjacent pixels are reconstructed pixels whose distance from the current coding and decoding block is within a specified distance range;
the adjacent pixel list construction module is used for constructing a spatial domain adjacent pixel list of the current coding and decoding block based on the spatial domain adjacent pixels of the current coding and decoding block;
and the reference pixel list construction module is used for constructing a preselected reference pixel candidate list of the current coding and decoding block based on the spatial domain adjacent pixel list.
15. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the reference pixel candidate list construction method according to any one of claims 1 to 13.
CN202011114074.XA 2020-10-18 2020-10-18 Reference pixel candidate list construction method, device, equipment and storage medium Pending CN114390289A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011114074.XA CN114390289A (en) 2020-10-18 2020-10-18 Reference pixel candidate list construction method, device, equipment and storage medium
PCT/CN2021/123328 WO2022078339A1 (en) 2020-10-18 2021-10-12 Reference pixel candidate list constructing method and apparatus, device and storage medium


Publications (1)

Publication Number Publication Date
CN114390289A true CN114390289A (en) 2022-04-22




