CN115190312A - Cross-component chromaticity prediction method and device based on neural network - Google Patents

Cross-component chromaticity prediction method and device based on neural network

Info

Publication number
CN115190312A
CN202110363526.6A · CN115190312A · CN115190312B
Authority
CN
China
Prior art keywords
prediction
predicted
point
points
neural network
Prior art date
Legal status
Granted
Application number
CN202110363526.6A
Other languages
Chinese (zh)
Other versions
CN115190312B (en)
Inventor
马彦卓
霍俊彦
万帅
杨付正
王丹妮
冉启宏
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110363526.6A
Priority claimed from CN202110363526.6A
Publication of CN115190312A
Application granted
Publication of CN115190312B
Status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: … using adaptive coding
    • H04N19/169: … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186: … the unit being a colour or a chrominance component
    • H04N19/17: … the unit being an image region, e.g. an object
    • H04N19/176: … the region being a block, e.g. a macroblock
    • H04N19/70: … characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a cross-component chroma prediction method and device based on a neural network. The method comprises the following steps: acquiring an adjacent region of a coding block, wherein the coding block comprises a plurality of points to be predicted and the adjacent region comprises a plurality of reference points; performing data preprocessing on the reference points in the adjacent region to obtain a plurality of prediction reference points; and inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to predict the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of its luma value and chroma value, and the related information of the point to be predicted comprises at least one of its luma value and the luma difference between the point to be predicted and the prediction reference points. According to the invention, after data strongly correlated with the pixel to be predicted is selected through data preprocessing, the neural network yields a more accurate cross-component prediction, and a unified prediction method is used for inter-component prediction in video encoding and decoding, reducing coding and decoding complexity.

Description

Cross-component chromaticity prediction method and device based on neural network
Technical Field
The invention belongs to the technical field of video coding, and particularly relates to a cross-component chroma prediction method and device based on a neural network.
Background
H.266/Versatile Video Coding (VVC) is a new-generation video coding standard built specifically for 4K and 8K streaming media; it helps users store more high-definition video on their devices and reduces data usage on networks. H.266/VVC represents the state of the art across four generations of international video coding standards, and the dramatic increase in its coding efficiency will further increase video usage on a global scale.
In the new-generation video coding standard H.266/VVC there are 8 chroma intra prediction modes in total, among which Cross-Component Linear Model prediction (CCLM) is a coding tool added for the chroma components. Since the components of the same color space are strongly correlated, removing redundancy by exploiting inter-component correlation is an effective way to improve compression efficiency. The existing inter-component linear model prediction in H.266/VVC constructs the chroma prediction from the reconstructed luma values of the same coding block by the formula Pred_C(i,j) = α·Rec_L(i,j) + β, where Pred_C(i,j) denotes the chroma prediction of the coding block, Rec_L(i,j) denotes the reconstructed luma value at the same position in the coding block, and α and β are linear model parameters derived from the reconstructed luma and chroma values of adjacent reference points. H.266/VVC includes three inter-component linear model prediction modes: INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM. Each mode picks at most 4 adjacent reference points for deriving the linear model parameters α and β; the three modes differ in the selection region of the adjacent reference points used for the derivation. Once the selection region of a mode is determined, the adjacent reference points for linear model parameter derivation are selected from it, the parameters are derived according to the number of valid adjacent reference points, and the current coding block is then predicted with Pred_C(i,j) = α·Rec_L(i,j) + β.
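As a rough illustration of this kind of linear mapping, the sketch below derives α and β from the minimum and maximum luma reference samples and applies the formula. It is a simplified floating-point sketch: the actual H.266/VVC derivation averages the two smallest and two largest luma samples among at most 4 reference points and works in integer arithmetic, and the function name cclm_predict is illustrative, not from the standard.

```python
# A minimal floating-point sketch of CCLM-style prediction; illustrative only.
import numpy as np

def cclm_predict(rec_luma, ref_luma, ref_chroma):
    """Predict chroma as alpha * Rec_L + beta from neighboring reference samples."""
    ref_luma = np.asarray(ref_luma, dtype=np.float64)
    ref_chroma = np.asarray(ref_chroma, dtype=np.float64)
    lo, hi = ref_luma.argmin(), ref_luma.argmax()
    if ref_luma[hi] == ref_luma[lo]:
        # Flat reference luma: fall back to the mean reference chroma.
        alpha, beta = 0.0, ref_chroma.mean()
    else:
        alpha = (ref_chroma[hi] - ref_chroma[lo]) / (ref_luma[hi] - ref_luma[lo])
        beta = ref_chroma[lo] - alpha * ref_luma[lo]
    return alpha * np.asarray(rec_luma, dtype=np.float64) + beta
```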
However, the conventional inter-component linear model prediction method applies the same simple linear function Pred_C(i,j) = α·Rec_L(i,j) + β, with the same α and β, to every pixel of the coding block when predicting its chroma value. First, the linear function is fast to compute but too simple, so its error is large; second, using the same function with the same parameters for all predicted points of one coding block introduces further error. In other words, the prediction blocks obtained by the conventional inter-component linear model prediction method have low accuracy.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a cross-component chroma prediction method and apparatus based on a neural network.
One embodiment of the present invention provides a cross-component chroma prediction method based on a neural network, including:
acquiring an adjacent region of a coding block, wherein the coding block comprises a plurality of points to be predicted, and the adjacent region comprises a plurality of reference points;
carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of points to be predicted;
inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to realize prediction of the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of the luma value of the prediction reference point and the chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of the luma value of the point to be predicted and the luma difference value between the point to be predicted and the prediction reference point.
In an embodiment of the present invention, the obtained adjacent regions of the coding block include at least one of an upper adjacent region, an upper right adjacent region, a left adjacent region, and a lower left adjacent region.
In an embodiment of the present invention, the obtaining a plurality of prediction reference points of the point to be predicted by performing data preprocessing on the reference points in the adjacent area includes:
storing the reference points in the adjacent areas;
and acquiring a plurality of prediction reference points of the points to be predicted from all the stored reference points.
In one embodiment of the present invention, obtaining a plurality of prediction reference points of the point to be predicted from all the stored reference points comprises:
selecting N reference points near a preset reference luma value as prediction reference points, wherein N is an integer greater than 0;
and if the number of the selected prediction reference points is less than N, calculating a padding reference chroma value and a padding reference luma value, and supplementing the number of prediction reference points to N by using the padding reference chroma value and the padding reference luma value.
In an embodiment of the present invention, obtaining a plurality of prediction reference points of the point to be predicted from all the stored reference points further includes:
sorting all the stored reference points according to their reference luma values;
and acquiring a plurality of prediction reference points of the point to be predicted from all the sorted reference points.
In an embodiment of the present invention, before acquiring the adjacent area of the coding block, the method further includes:
dividing the coding block to obtain a plurality of sub-coding blocks;
acquiring an adjacent area of the sub-coding block, wherein the sub-coding block comprises a plurality of points to be predicted, and the adjacent area comprises a plurality of reference points;
and carrying out data preprocessing on the reference points in the adjacent areas of the sub-coding blocks to obtain a plurality of prediction reference points of the points to be predicted in the sub-coding blocks.
In an embodiment of the present invention, the first neural network model used for training is different for each of the sub-coding blocks.
In one embodiment of the present invention, further comprising:
the related information of the prediction reference point also comprises position information of the prediction reference point;
the related information of the point to be predicted also comprises the position information of the point to be predicted.
In one embodiment of the present invention, further comprising:
the neural network model comprises a plurality of sub neural network models;
and respectively inputting the related information of the prediction reference points and/or the related information of the point to be predicted into the plurality of sub-neural-network models to realize prediction of the chroma value of the point to be predicted.
Another embodiment of the present invention provides a neural network-based cross-component chroma prediction apparatus, including:
the device comprises a data acquisition module, a prediction module and a prediction module, wherein the data acquisition module is used for acquiring an adjacent region of a coding block, the coding block comprises a plurality of points to be predicted, and the adjacent region comprises a plurality of reference points;
the data processing module is used for carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of the points to be predicted;
and the data prediction module is used for inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to realize prediction of the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of the luma value of the prediction reference point and the chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of the luma value of the point to be predicted and the luma difference value between the point to be predicted and the prediction reference point.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a cross-component chroma prediction method based on a neural network, which is characterized in that after data with strong correlation with current pixels are selected through data preprocessing, a neural network model is used for processing the preprocessed data to obtain the cross-component prediction of more accurate predicted values.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flowchart of a cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an adjacent area of a coding block in a neural network-based cross-component chroma prediction method according to an embodiment of the present invention;
fig. 3 (a) to fig. 3 (h) are schematic diagrams illustrating selection conditions of adjacent areas of several coding blocks in a cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a network structure in a cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 5 (a) -5 (e) are schematic structural diagrams of cross-component chroma prediction based on a neural network under different input information provided by an embodiment of the present invention;
fig. 6 is a schematic network structure diagram in another cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a network structure in a further cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 8 is a schematic network structure diagram in another cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
fig. 9 is a schematic diagram of a network structure in a cross-component chroma prediction method based on a neural network according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an encoder according to an embodiment of the present invention;
FIG. 11 is a block diagram of a decoder according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of a cross-component chroma prediction apparatus based on a neural network according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
Referring to fig. 1, fig. 1 is a schematic flowchart of a cross-component chroma prediction method based on a neural network according to an embodiment of the present invention. This embodiment provides a cross-component chroma prediction method based on a neural network, which comprises the following steps:
step 1, obtaining an adjacent area of a coding block, wherein the coding block comprises a plurality of points to be predicted, and the adjacent area comprises a plurality of reference points.
Specifically, an input video includes a plurality of coding blocks, and each coding block includes a plurality of points to be predicted. Referring to fig. 2, a schematic diagram of the adjacent regions of a coding block in the method: for each coding block, the adjacent regions comprise an upper adjacent region, an upper-right adjacent region, a left adjacent region and a lower-left adjacent region, each containing a plurality of reference points. The adjacent region acquired in this embodiment includes at least one of these four regions. Figs. 3 (a) to 3 (h) show several possible selections of the adjacent regions of a coding block, and this embodiment acquires the adjacent regions of the coding block as shown there.
And 2, carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of the points to be predicted.
Specifically, in this embodiment the valid parts of all the adjacent regions obtained in step 1 are used as the selection region for the network input, and data preprocessing is performed on all reference points of the selection region; the preprocessing consists, in order, of storing and truncating. Step 2 specifically includes steps 2.1 and 2.2:
and 2.1, storing the reference points in the adjacent areas.
Specifically, in this embodiment the reference points of the adjacent regions of the coding block acquired in step 1 are first stored; the storage order is not limited. For example, for the coding block and adjacent regions shown in fig. 3 (h), the reference points of the upper and upper-right adjacent regions may be stored first, followed by those of the left and lower-left adjacent regions; the reference points of all adjacent regions include a reference luma value and a reference chroma value. Alternatively, the reference points of the left and lower-left adjacent regions may be stored first, followed by those of the upper and upper-right adjacent regions. The reference points may also be stored alternately; for the case of fig. 3 (h) this could mean storing first a reference point of the upper region, then one of the left region, then one of the upper-right region, then one of the lower-left region. Other storage orders beyond these three are equally possible. The storage for the cases of figs. 3 (a) to 3 (g) is analogous and is not repeated here. An adjacent region may also consist of multiple reference rows/columns: the reference points of the upper/upper-right adjacent regions may occupy one or more rows, and those of the left/lower-left adjacent regions one or more columns.
And 2.2, acquiring a plurality of prediction reference points of the points to be predicted from all the stored reference points.
Specifically, in this embodiment N reference points whose luma values are nearest a preset reference luma value Y are truncated from the result stored in step 2.1 and used as prediction reference points, where N is an integer greater than 0. The preset reference luma value is chosen according to the actual design requirements; for example, this embodiment may use the luma value of the point to be predicted. During truncation, the absolute value |ΔY| of the reference luma difference between the point to be predicted and each adjacent reference point is computed, and the N reference points with the smallest |ΔY| are taken as prediction reference points. If the number of truncated candidate prediction reference points is less than N, a padding reference chroma value and a padding reference luma value are calculated and used to supplement the number of prediction reference points to N, specifically:
if the number of the selected prediction reference points is less than N, the number of the prediction reference points may be N, for example, 1023 is used for filling the reference luminance values for a depth of 10 bits, and meanwhile, the corresponding reference chrominance values may be filled with a median 512 or 511 of chrominance effective values, etc., the number of the prediction reference points is N, or all the effective reference luminance values may be summed and averaged, the average value is used as a default filling value of the reference luminance values, the filling reference chrominance values are also the average values of all the effective reference chrominance values as default filling values, or the reference luminance value and the reference chrominance value of the last effective reference point of the point to be predicted may be directly used as a default filling reference luminance value and a filling reference chrominance value, respectively. If the number of the intercepted prediction reference points is 0, directly taking a chroma default value as a chroma prediction value without neural network prediction, specifically taking the default value in the H.266/VVC as a chroma intermediate value, wherein the chroma intermediate value is calculated by a method of 1< < (BitDepthC-1), bitDepthC is the bit depth of chroma, when the current video is an 8-bit video, the range of the chroma value is 0-255, the chroma intermediate value is 128, and the current video is a 10-bit video, the range of the chroma value is 0-1023, and the chroma intermediate value is 512.
In this embodiment, acquiring a plurality of prediction reference points of the point to be predicted from all the stored reference points in step 2.2 further includes: sorting all the stored reference points according to their reference luma values, and acquiring a plurality of prediction reference points of the point to be predicted from the sorted reference points.
Specifically, in this embodiment all the reference points stored in step 2.1 are sorted according to their reference luma values; the sorting method is not limited. For example, the reference luma values may be sorted uniformly in ascending order, or uniformly in descending order. Sorting can also be realized by building an index table: taking the reference luma values of the current coding block as indices, two index tables are established, a reference luma value index table and the corresponding reference chroma value index table, and reference points in the required order are subsequently obtained by looking up and comparing the reference luma value index table. Alternatively, the absolute value |ΔY| of the reference luma difference between the point to be predicted and each adjacent reference point is calculated, and the points of all adjacent regions are sorted in ascending order of |ΔY|. Finally, the truncation of the N prediction reference points in step 2.2 may be performed on the unsorted reference points or after sorting.
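The index-table sorting can be sketched in a few lines, assuming numpy; argsort plays the role of the reference luma value index table, and the chroma table is obtained by reusing the same indices (the sample values are made up):

```python
import numpy as np

ref_luma = np.array([523, 498, 511, 530, 502])    # stored reference luma values
ref_chroma = np.array([240, 233, 236, 244, 231])  # corresponding reference chroma values

idx = np.argsort(ref_luma)       # ascending reference luma index table
sorted_luma = ref_luma[idx]      # luma in sorted order
sorted_chroma = ref_chroma[idx]  # chroma follows the same index table
```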
This embodiment preprocesses the input video coding block for three reasons. First, preprocessing strengthens the correlation between the reference luma difference magnitude |ΔY| and the contribution degree, where the contribution degree refers to the contribution of the chroma value of a reference point to the chroma value of the point to be predicted; this reduces the interference information when the subsequent neural network fits the function, accelerates convergence, and makes it easier for the neural network to fit a suitable general nonlinear function. Second, owing to the preprocessing, all points to be predicted of a coding block can share one network structure and one set of network parameter weights, greatly reducing the required storage space. Finally, the truncation operation in the preprocessing, i.e., selecting a fixed number N of the smallest reference luma difference magnitudes |ΔY| and their corresponding reference chroma values, allows the network structure to handle coding blocks of any size and to serve as a general module, which is simple and efficient.
Taking N = 8 as an example, with the adjacent regions comprising the upper, upper-right, left and lower-left adjacent regions, data preprocessing yields from these 4 selection regions the first 8 prediction reference points (including reference luma and chroma values) with the smallest reference luma difference magnitude |ΔY|. First, the 8 prediction reference points best matching the characteristics of the point to be predicted are selected from the rich adjacent regions, so that the network input for each point to be predicted is strongly correlated with that point, which helps reduce the prediction error. Second, most of the interference from spatial position information is eliminated: the |ΔY| at each position of the adjacent regions carries different information for different points to be predicted, and sorting the |ΔY| values together with their corresponding reference chroma values makes the characteristics at the same input position approximately consistent across points to be predicted, greatly improving the convergence speed and prediction accuracy of the subsequent neural network.
Take the prediction reference selection for the adjacent regions corresponding to figs. 3 (c), 3 (d) and 3 (e) as an example. For a W × H coding block, define W' as the width of the upper adjacent region (its height defaults to 1) and H' as the height of the left selection region (its width defaults to 1). In fig. 3 (c), prediction reference points are selected in the upper and left adjacent regions, with W' = W and H' = H. In fig. 3 (d), the upper and upper-right adjacent regions are selected, with W' = W + H and H' = 0. In fig. 3 (e), selection is performed in the left and lower-left adjacent regions, with H' = W + H and W' = 0; in this embodiment the range of the upper plus upper-right adjacent regions and that of the left plus lower-left adjacent regions are both capped at W + H. After the selection regions of the three modes are determined, the prediction reference points are selected within the chosen adjacent regions.
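A small helper makes the three selection-region settings concrete; this is an illustrative sketch (the function name selection_region and the mode labels are ours, not the patent's):

```python
def selection_region(mode, w, h):
    """Return (W', H') for a W x H block under the three selection modes."""
    if mode == "lt":        # upper + left regions
        return w, h
    if mode == "t":         # upper + upper-right regions, capped at W + H
        return w + h, 0
    if mode == "l":         # left + lower-left regions, capped at W + H
        return 0, w + h
    raise ValueError(f"unknown mode: {mode}")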
And 3, inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to realize prediction of the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of the luma value of the prediction reference point and the chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of the luma value of the point to be predicted and the luma difference value between the point to be predicted and the prediction reference point.
Specifically, referring to fig. 4, a schematic diagram of a network structure used in the method, this embodiment performs the prediction with a neural network model. The data input to the network during prediction may include related information of the prediction reference points and/or related information of the point to be predicted. The related information of a prediction reference point includes at least one of its luma value and its chroma value; the related information of the point to be predicted includes at least one of its luma value and the luma difference between the point to be predicted and the prediction reference points. That is, the network input may contain only the related information of the prediction reference points, only the related information of the point to be predicted, or both. Likewise, the related information of the point to be predicted may include only its luma value, only the luma difference between it and the prediction reference points, or both. Figs. 5 (a) to 5 (e) show the structures for neural-network-based cross-component chroma prediction under different input information, specifically:
taking N =8 as an example, please refer to fig. 5 (a), in this embodiment, the absolute value | [ Δ ] Y | of the difference between the luminance value of the prediction reference point obtained in step 2 and the luminance value of the point to be predicted is used as the input data of the neural network model, and the output data of the corresponding neural network model is the colorimetric value Pred _ C of the point to be measured. In this embodiment, the absolute value Δ Y of 8 luminance difference values after data preprocessing is used as a network input, and the colorimetric value Pred _ C corresponding to the point to be measured is directly obtained through prediction by a neural network model. The absolute value of the reference luminance difference after data preprocessing is used as input in the network, and the absolute value of the luminance difference is divided by the intermediate value of the luminance effective value to perform normalization, so that the absolute value of the input luminance difference can be normalized to be (0,2), for example, the intermediate value 511 or 512 is selected when the luminance is 10 bits deep, and the purpose is to improve the convergence speed of the network. The normalization layer Softmax is used for final output, and the Softmax is considered to be more sensitive to smaller gaps.
In this embodiment the network structure of the neural network model is not limited: it may be formed by stacking fully connected layers, ReLU activation layers and a Softmax normalization layer as shown in fig. 5 (a), by stacking convolutional layers and fully connected layers, or by stacking convolutional layers only. The ReLU activation may also be replaced by other activation functions such as LeakyReLU or PReLU, or combined with BN layers; the Softmax normalization function may likewise be replaced by another normalizing function such as Sigmoid. The neural network model structure of this embodiment is therefore not unique and may be stacked in other ways; the number of neurons, the number of layers, the normalization method, the activation function and so on can all be adjusted to the specific target. This embodiment can use a small-scale, i.e., lightweight, neural network model, with low computational complexity and easy convergence.
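As a concrete but purely illustrative instance of one such stacking, the sketch below builds a small fully connected network in PyTorch that takes the N normalized luma difference magnitudes |ΔY| and regresses the chroma value directly, as in fig. 5 (a); the layer widths, depth and the class name DirectChromaNet are assumptions, since the patent fixes none of them:

```python
import torch
import torch.nn as nn

class DirectChromaNet(nn.Module):
    def __init__(self, n_ref=8, hidden=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_ref, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),     # direct chroma output Pred_C
        )

    def forward(self, abs_dy, luma_mid=512.0):
        x = abs_dy / luma_mid         # normalize |dY| into (0, 2)
        return self.body(x)
```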
The related information of the prediction reference points in the network input of this embodiment may also include their position information, and the related information of the point to be predicted may also include its position information; see figs. 5 (b) to 5 (e). The input of the neural network model may thus also be: the luma difference |ΔY| combined with the chroma values of the prediction reference points, as in fig. 5 (b); the luma values of the prediction reference points, the luma value of the point to be predicted and the chroma values of the prediction reference points, as in fig. 5 (c); the luma difference |ΔY|, the position information R of the prediction reference points / point to be predicted and the chroma values of the prediction reference points, as in fig. 5 (d); or the luma values of the prediction reference points, the luma value of the point to be predicted, the position information R of the prediction reference points / point to be predicted and the chroma values of the prediction reference points, as in fig. 5 (e). The network input of this embodiment is also not limited to figs. 5 (a) to 5 (e); more related information of the prediction reference points and of the point to be predicted may be input according to actual needs.
Referring to fig. 6, a schematic diagram of a network structure in another cross-component chroma prediction method based on a neural network according to an embodiment of the present invention: this embodiment may optionally enhance the chroma prediction result through an enhancement network to improve the accuracy of the data output by the neural network.
Referring to figs. 7 and 8, schematic diagrams of network structures in two further neural-network-based cross-component chroma prediction methods according to embodiments of the present invention: after any combination of related information of the prediction reference points and related information of the point to be predicted is input into the neural network model, the network can directly predict the chroma value Pred_C of the point to be predicted, predict a single reference weight value as shown in fig. 7, or predict a set of reference weight values as shown in fig. 8. For example, when N is 8, the weights of the 8 reference chroma values are multiplied point-wise with the reference chroma values Ref_C of the corresponding 8 prediction reference points, and the products are summed to obtain the chroma value Pred_C of the point to be predicted.
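A minimal sketch of the weight-based variant of fig. 8 follows, again assuming PyTorch; the Softmax layer outputs one weight per prediction reference point, and Pred_C is the weighted sum of the reference chroma values Ref_C. All sizes and sample values are illustrative:

```python
import torch
import torch.nn as nn

# Network predicting one Softmax-normalized weight per prediction reference point.
weight_net = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 8), nn.Softmax(dim=-1),
)

abs_dy = torch.rand(1, 8) * 2                   # normalized |dY| inputs, shape (batch, N)
ref_c = torch.randint(0, 1024, (1, 8)).float()  # reference chroma values Ref_C

weights = weight_net(abs_dy)                    # one weight per reference point
pred_c = (weights * ref_c).sum(dim=-1)          # point-wise multiply then sum -> Pred_C
```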
Referring to fig. 9, a schematic diagram of a network structure in a further cross-component chroma prediction method based on a neural network according to an embodiment of the present invention: a set of reference weight values is computed as in fig. 8 and then fused by a fusion network to obtain a fused reference weight for each prediction reference point, and the chroma value of the point to be predicted in the coding block is predicted from the reference chroma values Ref_C of the prediction reference points and the corresponding fused reference weights.
In this embodiment, for the multi-input cases of figs. 5 (b) to 5 (e), the neural network model may further comprise a plurality of sub-neural-network models, and the different inputs may be fed to different sub-networks. For example, if the information input to the neural network model includes the chroma values of the prediction reference points and the luma differences |ΔY|, these may be input to two different sub-neural-network models, which may be identical or different lightweight networks. The sub-neural-network models each output a set of reference weight values; a data-fusion network then produces the final fused reference weights, and the point-wise product of the fused reference weights with the reference chroma values Ref_C of the prediction reference points realizes the prediction of the chroma value Pred_C of the point to be predicted in the coding block.
Further, in this embodiment, before the adjacent region of the coding block is acquired in step 1, the method may also include: dividing the coding block to obtain a plurality of sub-coding blocks; acquiring the adjacent regions of the sub-coding blocks, where each sub-coding block comprises a plurality of points to be predicted and each adjacent region comprises a plurality of reference points; and performing data preprocessing on the reference points in the adjacent regions of the sub-coding blocks to obtain the prediction reference points of the points to be predicted in the sub-coding blocks. That is, either all points to be predicted in one coding block are predicted with the same network structure, parameters and network weights, or the coding block is divided into several sub-coding blocks and each sub-coding block is processed in the same way as the complete coding block above. Each sub-coding block is trained separately and may adopt a network structure identical to or different from the first neural network model, preferring the network structure, parameters and network weights best suited to the points to be predicted in that sub-coding block; a single point to be predicted may even be treated as one sub-coding block with network weights belonging to it alone. The data preprocessing of the sub-coding blocks is the same as that of the complete coding block described above.
Referring to figs. 10 and 11, schematic diagrams of an encoder structure and a decoder structure according to an embodiment of the present invention: the neural-network-based cross-component chroma prediction method provided by this embodiment can be applied in a video codec chip, performing inter-component prediction with a unified prediction method and reducing the coding and decoding complexity. The prediction method affects the intra prediction part of the video coding and decoding framework; specifically, it can be applied to the chroma value prediction part of intra prediction and acts at both the encoding end and the decoding end.
To verify the effectiveness of the cross-component chroma prediction method based on the neural network, the following test was performed: the new chroma prediction mode was added to the H.266/VVC reference software VTM 10.0 and competes with the original traditional chroma prediction modes such as LT_LM, L_LM and T_LM. Under the All Intra (AI) configuration, the test sequences required by JVET (all in YUV 4:2:0 format) were used with a temporal subsampling interval of 8, and the average BD-rate changes on the Y, U and V components are -0.27%, -1.54% and -1.84% respectively (negative values indicate gain).
In summary, in the cross-component chroma prediction method based on the neural network provided by this embodiment, after data strongly correlated with each current pixel is selected through data preprocessing, a lightweight neural network processes the preprocessed data to obtain a more accurate cross-component prediction. Because the data is preprocessed and the data participating in the computation is strongly correlated, the neural network can be small in scale, with low computational complexity and easy convergence; in a video codec chip, a unified prediction method is used for inter-component prediction, reducing coding and decoding complexity. The method not only improves the prediction accuracy between color components, but the neural network can also process coding blocks of any size, and it can be used with various video coding technologies under the existing hybrid coding framework, such as H.266/VVC, VC1 and AVS3, improving video coding efficiency. From another perspective, it can be used for inter-component prediction in the YUV color space as well as in other color spaces, such as RGB.
Example two
On the basis of the first embodiment, referring to fig. 12, a schematic structural diagram of a cross-component chroma prediction apparatus based on a neural network according to an embodiment of the present invention: this embodiment provides a cross-component chroma prediction apparatus based on a neural network, which includes:
the data acquisition module is used for acquiring an adjacent area of the coding block, wherein the coding block comprises a plurality of points to be predicted, and the adjacent area comprises a plurality of reference points.
Specifically, the adjacent area of the coding block acquired by the data acquisition module of this embodiment includes at least one of an upper adjacent area, an upper right adjacent area, a left adjacent area, and a lower left adjacent area.
And the data processing module is used for carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of the points to be predicted.
Specifically, in this embodiment, the obtaining of the plurality of prediction reference points of the point to be predicted by performing data preprocessing on the reference points in the adjacent area includes:
storing reference points in adjacent areas;
and acquiring a plurality of prediction reference points of the points to be predicted from all the stored reference points.
Further, obtaining a plurality of prediction reference points of the points to be predicted from all the stored reference points further comprises:
sorting all the stored reference points according to their reference luma values;
and acquiring a plurality of prediction reference points of the point to be predicted from all the sorted reference points.
Further, acquiring a plurality of prediction reference points of the point to be predicted from all the sorted reference points comprises:
selecting N reference points near a preset reference luma value as prediction reference points, wherein N is an integer greater than 0;
and if the number of the selected prediction reference points is less than N, calculating a padding reference chroma value and a padding reference luma value, and supplementing the number of prediction reference points to N by using the padding reference chroma value and the padding reference luma value.
And the data prediction module is used for inputting related information of the prediction reference points and/or related information of the point to be predicted into the neural network model to realize prediction of the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of the luma value of the prediction reference point and the chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of the luma value of the point to be predicted and the luma difference value between the point to be predicted and the prediction reference point.
Specifically, the relevant information of the prediction reference point in the data prediction module of the present embodiment further includes the position information of the prediction reference point; the related information of the point to be predicted also includes the position information of the point to be predicted.
Further, still include:
the neural network model comprises a plurality of sub neural network models;
and respectively inputting the related information of the prediction reference points and/or the related information of the point to be predicted into a plurality of sub-neural-network models to realize prediction of the chroma value of the point to be predicted.
Further, before acquiring the adjacent area of the coding block, the method further includes:
dividing the coding block to obtain a plurality of sub-coding blocks;
acquiring adjacent regions of the sub-coding blocks, wherein the sub-coding blocks comprise a plurality of points to be predicted, and the adjacent regions comprise a plurality of reference points;
and carrying out data preprocessing on the reference points in the adjacent areas of the sub-coding blocks to obtain a plurality of prediction reference points of the points to be predicted in the sub-coding blocks.
Furthermore, the neural network models adopted for predicting the chroma values of the points to be predicted of each sub-coding block are different.
The cross-component chroma prediction apparatus based on the neural network provided in this embodiment may implement the embodiment of the cross-component chroma prediction method based on the neural network described in the first embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
The foregoing describes the invention in further detail in combination with specific preferred embodiments, but the specific implementation of the invention is not limited to these details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions may be made without departing from the spirit of the invention, and all of them shall be considered as falling within the protection scope of the invention.

Claims (10)

1. A cross-component chroma prediction method based on a neural network is characterized by comprising the following steps:
acquiring an adjacent region of a coding block, wherein the coding block comprises a plurality of points to be predicted, and the adjacent region comprises a plurality of reference points;
carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of points to be predicted;
inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to realize prediction of a chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of a luma value of the prediction reference point and a chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of a luma value of the point to be predicted and a luma difference value between the point to be predicted and the prediction reference point.
2. The method of claim 1, wherein the obtained neighboring regions of the coding block comprise at least one of an upper neighboring region, an upper right neighboring region, a left neighboring region, and a lower left neighboring region.
3. The cross-component chroma prediction method based on the neural network as claimed in claim 1, wherein the data preprocessing of the reference points in the adjacent region to obtain a plurality of prediction reference points of the points to be predicted comprises:
storing the reference points in the adjacent areas;
and acquiring a plurality of prediction reference points of the points to be predicted from all the stored reference points.
4. The neural network-based cross-component chroma prediction method of claim 3, wherein obtaining a number of prediction reference points of points to be predicted from all stored reference points comprises:
selecting N reference points near a preset reference luma value as prediction reference points, wherein N is an integer greater than 0;
and if the number of the selected prediction reference points is less than N, calculating a padding reference chroma value and a padding reference luma value, and supplementing the number of prediction reference points to N by using the padding reference chroma value and the padding reference luma value.
5. The neural network-based cross-component chroma prediction method of claim 3, wherein obtaining a number of prediction reference points of points to be predicted from all stored reference points further comprises:
sorting all the stored reference points according to their reference luma values;
and acquiring a plurality of prediction reference points of the points to be predicted from all the sorted reference points.
6. The method of claim 1, wherein obtaining the neighboring region of the coding block further comprises:
dividing the coding block to obtain a plurality of sub-coding blocks;
acquiring an adjacent area of the sub-coding block, wherein the sub-coding block comprises a plurality of points to be predicted, and the adjacent area comprises a plurality of reference points;
and carrying out data preprocessing on the reference points in the adjacent areas of the sub-coding blocks to obtain a plurality of prediction reference points of the points to be predicted in the sub-coding blocks.
7. The method according to claim 6, wherein the neural network models used for predicting the chroma value of the point to be predicted for each sub-coding block are different.
8. The neural network-based cross-component chroma prediction method according to claim 1, wherein the related information of the prediction reference point further includes position information of the prediction reference point;
and the related information of the point to be predicted further includes position information of the point to be predicted.
9. The neural network-based cross-component chroma prediction method of claim 1, further comprising:
the neural network model comprises a plurality of sub neural network models;
and respectively inputting the related information of the prediction reference points and/or the related information of the point to be predicted into the plurality of sub-neural-network models to realize prediction of the chroma value of the point to be predicted.
10. A cross-component chroma prediction apparatus based on a neural network, comprising:
the device comprises a data acquisition module, a prediction module and a prediction module, wherein the data acquisition module is used for acquiring an adjacent region of a coding block, the coding block comprises a plurality of points to be predicted, and the adjacent region comprises a plurality of reference points;
the data processing module is used for carrying out data preprocessing on the reference points in the adjacent areas to obtain a plurality of prediction reference points of the points to be predicted;
and the data prediction module is used for inputting related information of the prediction reference points and/or related information of the point to be predicted into a neural network model to realize prediction of the chroma value of the point to be predicted, wherein the related information of a prediction reference point comprises at least one of a luma value of the prediction reference point and a chroma value of the prediction reference point, and the related information of the point to be predicted comprises at least one of a luma value of the point to be predicted and a luma difference value between the point to be predicted and the prediction reference point.
CN202110363526.6A (filed 2021-04-02): Cross-component chromaticity prediction method and device based on neural network. Status: Active. Granted as CN115190312B (en).

Priority Applications (1)

Application Number: CN202110363526.6A · Priority Date: 2021-04-02 · Title: Cross-component chromaticity prediction method and device based on neural network · Granted as: CN115190312B (en)

Applications Claiming Priority (1)

Application Number: CN202110363526.6A · Priority Date: 2021-04-02 · Title: Cross-component chromaticity prediction method and device based on neural network · Granted as: CN115190312B (en)

Publications (2)

Publication Number Publication Date
CN115190312A (en) 2022-10-14
CN115190312B (en) 2024-06-07


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422986A (en) * 2022-11-07 2022-12-02 深圳传音控股股份有限公司 Processing method, processing apparatus, and storage medium
CN116593408A (en) * 2023-07-19 2023-08-15 四川亿欣新材料有限公司 Method for detecting chromaticity of heavy calcium carbonate powder

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833925A (en) * 2018-07-19 2018-11-16 哈尔滨工业大学 Inter-frame prediction method based on deep neural network in a kind of mixed video coding/decoding system
WO2019184934A1 (en) * 2018-03-30 2019-10-03 杭州海康威视数字技术股份有限公司 Intra prediction method and device for chroma
KR20190117352A (en) * 2018-04-06 2019-10-16 에스케이텔레콤 주식회사 Apparatus and method for video encoding or decoding
CN110602491A (en) * 2019-08-30 2019-12-20 中国科学院深圳先进技术研究院 Intra-frame chroma prediction method, device and equipment and video coding and decoding system
US20200098144A1 (en) * 2017-05-19 2020-03-26 Google Llc Transforming grayscale images into color images using deep neural networks
US20220116591A1 (en) * 2019-07-10 2022-04-14 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Colour component prediction method, encoder, decoder, and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200098144A1 (en) * 2017-05-19 2020-03-26 Google Llc Transforming grayscale images into color images using deep neural networks
WO2019184934A1 (en) * 2018-03-30 2019-10-03 杭州海康威视数字技术股份有限公司 Intra prediction method and device for chroma
KR20190117352A (en) * 2018-04-06 2019-10-16 에스케이텔레콤 주식회사 Apparatus and method for video encoding or decoding
CN108833925A (en) * 2018-07-19 2018-11-16 哈尔滨工业大学 Inter-frame prediction method based on deep neural network in a kind of mixed video coding/decoding system
US20220116591A1 (en) * 2019-07-10 2022-04-14 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Colour component prediction method, encoder, decoder, and storage medium
CN110602491A (en) * 2019-08-30 2019-12-20 中国科学院深圳先进技术研究院 Intra-frame chroma prediction method, device and equipment and video coding and decoding system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yue Li; Li Li; Zhu Li; Jianchao Yang; Ning Xu; Dong Liu; Houqiang Li, "A Hybrid Neural Network for Chroma Intra Prediction", IEEE International Conference on Image Processing, 10 October 2018 (2018-10-10) *
张新峰 (Zhang Xinfeng), "基于深度学习的视频编码发展现状与未来展望" [Development status and future prospects of deep-learning-based video coding], 信息通信技术 (Information and Communications Technology), no. 02, 15 April 2020 (2020-04-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422986A (en) * 2022-11-07 2022-12-02 深圳传音控股股份有限公司 Processing method, processing apparatus, and storage medium
CN115422986B (en) * 2022-11-07 2023-08-22 深圳传音控股股份有限公司 Processing method, processing apparatus, and storage medium
CN116593408A (en) * 2023-07-19 2023-08-15 四川亿欣新材料有限公司 Method for detecting chromaticity of heavy calcium carbonate powder
CN116593408B (en) * 2023-07-19 2023-10-17 四川亿欣新材料有限公司 Method for detecting chromaticity of heavy calcium carbonate powder

Similar Documents

Publication Publication Date Title
CN107071416B (en) HEVC intra-frame prediction mode rapid selection method
Lei et al. Fast intra prediction based on content property analysis for low complexity HEVC-based screen content coding
TW202015409A (en) Video data decoding method, video data decoder, video data encoding method and video data encoder
CN105120292A (en) Video coding intra-frame prediction method based on image texture features
CN108174204B (en) Decision tree-based inter-frame rapid mode selection method
CN111866521A (en) Video image compression artifact removing method combining motion compensation and generation type countermeasure network
CN104853191B (en) A kind of HEVC fast encoding method
CN109743570A (en) A kind of compression method of screen content video
Zhu et al. Deep learning-based chroma prediction for intra versatile video coding
US20230252684A1 (en) Attribute information prediction method, encoder, decoder and storage medium
CN106162176A (en) Method for choosing frame inner forecast mode and device
CN107087171A (en) HEVC integer pixel motion estimation methods and device
CN103702122A (en) Coding mode selection method, device and coder
Huang et al. Modeling acceleration properties for flexible intra hevc complexity control
CN104333755B (en) The CU based on SKIP/Merge RD Cost of B frames shifts to an earlier date terminating method in HEVC
CN103888763A (en) Intra-frame coding method based on HEVC
CN105681812A (en) HEVC (high efficiency video coding) intra-frame coding processing method and device
JP2011010297A (en) System and method for estimating sum of absolute differences
WO2021036462A1 (en) Parameter acquisition method, pixel point pair selection method, and related devices
CN115190312A (en) Cross-component chromaticity prediction method and device based on neural network
CN115190312B (en) Cross-component chromaticity prediction method and device based on neural network
CN107454425A (en) A kind of SCC intraframe codings unit candidate modes reduction method
CN110049339B (en) Prediction direction selection method and device in image coding and storage medium
Schaefer et al. A hybrid color quantization algorithm incorporating a human visual perception model
Gao et al. Two-Step Fast Mode Decision for Intra Coding of Screen Content

Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant