CN117557807B - Convolutional neural network image prediction method based on weighted filtering enhancement - Google Patents

Convolutional neural network image prediction method based on weighted filtering enhancement

Info

Publication number
CN117557807B
CN117557807B
Authority
CN
China
Prior art keywords
image
pixel
prediction
block
texture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410041032.XA
Other languages
Chinese (zh)
Other versions
CN117557807A (en)
Inventor
马宾
段泓韬
段培永
舒明磊
刘兆伟
方崇荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology
Priority to CN202410041032.XA
Publication of CN117557807A
Application granted
Publication of CN117557807B
Legal status: Active
Anticipated expiration

Abstract

The invention relates to the technical field of image processing, in particular to a convolutional neural network image prediction method based on weighted filtering enhancement. An original image is divided into a smooth block group and a texture block group according to its texture complexity, and each group is further divided into four pixel sets according to pixel position to generate preprocessed images; the method exploits the multiple receptive fields and global optimization capability provided by convolution kernels of different sizes in the convolutional neural network, improves the predictor used for reversible information hiding, improves the accuracy and efficiency of image prediction, and increases the embedding capacity.

Description

Convolutional neural network image prediction method based on weighted filtering enhancement
Technical Field
The invention relates to the technical field of image processing, in particular to a convolutional neural network image prediction method based on weighted filtering enhancement.
Background
Reversible data hiding (RDH) is an important branch of information security technology in which the secret information can be correctly extracted from the marked carrier and the original carrier can be recovered without distortion. Owing to its excellent performance, it has been widely used in sensitive fields such as medicine and the military.
At present, reversible data hiding algorithms have developed mainly along two directions. One direction is to optimize the data embedding process and reduce the distortion of the marked image, such as the RDH method based on difference expansion proposed by Tian et al., the RDH method based on histogram shifting proposed by Ni et al., the RDH method based on two-dimensional histogram mapping proposed by Thodi et al., the RDH method based on prediction error expansion proposed by Thodi et al., and the reversible data hiding algorithm based on Code Division Multiple Access (CDMA) proposed by Ma. The other direction is to improve the prediction of the original image and establish a highly sparse prediction error plane, thereby improving the data embedding performance of the RDH scheme, such as the Differential Predictor (DP) proposed by Tian, the Gradient Adaptive Predictor (GAP) proposed by Fallahpour, the Median Edge Direction Predictor (MEDP) proposed by Weinberger et al., and the Bilinear Interpolation Predictor (BIP) proposed by Sachnev. Recently, He et al. proposed a pairwise Prediction Error Expansion (PEE) strategy, in which a new set of error pairs is obtained by considering every two adjacent sequence errors simultaneously. An error prediction algorithm based on multiple linear regression achieves accurate prediction of the target pixel by constructing a multiple linear regression function according to the consistency of the pixel distribution in a local area of a natural image; Wang et al. proposed a ridge-regression-based high-precision error prediction algorithm for RDH, which improves prediction accuracy by minimizing the squared residual between the predicted pixel and the target pixel under an L2-norm constraint.
A common shortcoming of these improved prediction approaches is that only one or a few adjacent pixels can be used to predict a pixel value, so the correlation among a larger number of neighboring pixels is not exploited and the reversible data hiding capacity remains low.
In recent years, convolutional neural networks (CNNs) have attracted wide attention for their successful application in fields such as image classification, video classification, object detection, face recognition and image super-resolution. Pixel prediction likewise aims to build an estimate of the current pixel from its context, and a CNN can be built and trained to predict pixels accurately. However, the distribution of image content has a large influence on the performance of a CNN-based predictor and easily degrades the accuracy of pixel-value prediction; moreover, conventional methods suffer from low pixel utilization and are prone to prediction deviation.
Disclosure of Invention
In view of the above, the present invention provides a convolutional neural network image prediction method based on weighted filtering enhancement, which is used for improving a predictor for reversible information hiding, improving the accuracy and efficiency of image prediction, and improving the embedded capacity.
In a first aspect, the present invention provides a convolutional neural network image prediction method based on weighted filtering enhancement, the method comprising:
step one, dividing an original image into a smooth block group and a texture block group according to texture complexity of the original image;
dividing the smooth block group and the texture block group in the first step into four pixel sets according to pixel positions, predicting all pixels of each pixel set from the 8 pixels around them to generate a preprocessed image, and taking the preprocessed image as a training set, wherein the preprocessed image comprises a smooth block preprocessed image and a texture block preprocessed image;
and thirdly, training the weighted-filtering-enhanced convolutional neural network with the smooth block preprocessed image and the texture block preprocessed image obtained in the second step, adjusting the weights of the pixel values around the target pixel by weighted filtering, and expanding the receptive field of the network by hybrid dilated convolution (HDC) to obtain the pixel prediction and generate a predicted image.
Optionally, the step one includes:
s1, calculating MSE of each image sub-block in an original image through a mean square error MSE formula, and setting a threshold T to measure texture complexity;
step S2, judging whether each image sub-block is a smooth area or not according to the threshold T set in the step S1;
step S3, according to the judgment of the step S2, when the MSE of the image sub-block is larger than the threshold T, the image sub-block is a texture area; when the MSE of the image sub-block is smaller than or equal to the threshold T, the image sub-block is a smooth area;
and S4, dividing the smooth block group and the texture block group according to the texture areas and the smooth areas in the step S3.
Optionally, the second step includes:
dividing the smooth block group and the texture block group into red, yellow, blue and green pixel sets according to pixel position; when the red pixels are predicted, the image input to the predictor is the combination of the yellow, blue and green pixel sets; when the yellow pixels are predicted, the image input to the predictor is the combination of the red, blue and green pixel sets; when the blue pixels are predicted, the image input to the predictor is the combination of the red, yellow and green pixel sets; when the green pixels are predicted, the image input to the predictor is the combination of the red, yellow and blue pixel sets; when the convolution kernel size is 3×3, at most 8 of the 9 pixel values are used for pixel point prediction; when the convolution kernel size is 5×5, at most 16 of the 25 pixel values are used for pixel point prediction.
Optionally, the convolution neural network with weighted filtering enhancement is composed of three networks, namely a feature extraction network, a weighted hybrid expansion convolution network WHDCnet and a pixel prediction network.
Optionally, the feature extraction network is used for extracting features of different layers of the image, and consists of three parallel convolution kernels, wherein the sizes of the convolution kernels are 3×3, 5×5 and 7×7 respectively; the regional features of different layers of the image are extracted from different receptive fields, and after three parallel convolution kernels are processed, a latent feature matrix with 96 dimensions is generated by superposing three feature matrices with 32 dimensions.
Optionally, the weighted hybrid expansion convolution network is used for expanding the receptive field range to assist in extracting image features, and is composed of a weighted filter kernel and three serial expansion convolution kernels, wherein the weighted filter kernel is used for balancing weight attributes of pixels with different distances, the three serial expansion convolution kernels have a size of 3×3, and expansion rates are 1,2 and 3 respectively; the images are input into a weighted mixed expansion convolution network, and a 32-channel feature map is obtained through wide-area feature extraction.
Optionally, the pixel prediction network is configured to optimize the feature matrix to generate a predicted image; the network is composed of four convolution blocks, each of which is composed of three convolution layers and two LeakyReLU activation functions; the kernel size of each convolution layer is 3×3, the channels of the first three blocks are set to 128, and the output of the last block is the predicted image; meanwhile, the outputs of different convolution blocks are connected by residual connections, and the low-dimensional features and the high-dimensional features are concatenated to aggregate the image features; and a mean square error loss is calculated from the predicted image and one quarter of the preprocessed image.
Optionally, the loss function loss in the training process is expressed as:
loss = (1/N) Σ_{i=1}^{N} ‖y_i − F(x_i; W, b)‖²
wherein N is the number of training data, y_i is the target image, F(x_i; W, b) is the predicted image produced by the network, b is the bias, and W denotes all the weight values in the network.
Optionally, embedding of data is also included;
the embedding of the data comprises:
Step V1, determining the secret information S to be embedded and dividing it into four equally sized parts S1, S2, S3 and S4, which correspond to the red, yellow, blue and green pixel sets respectively;
Step V2, adaptively allocating predictors to the smooth block preprocessed image and the texture block preprocessed image according to the division in step V1, wherein the smooth block preprocessed image uses the smooth region predictor and the texture block preprocessed image uses the texture region predictor;
step V3, calculating the prediction errors of the predicted image and the pixel set according to the allocation in the step V2, obtaining a prediction error plane and generating a prediction error histogram;
and V4, embedding the secret information S1, S2, S3 and S4 to be embedded into a prediction error plane in the step V3 by utilizing a histogram translation technology, wherein a subset part for embedding the secret information replaces an original subset part, so that a secret-loaded image after embedding the secret information is obtained.
In a second aspect, the present invention provides an electronic device comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions that, when executed by the apparatus, cause the apparatus to perform the convolutional neural network image prediction method based on weighted filtering enhancement in the first aspect or any of the possible implementations of the first aspect.
According to the technical scheme, the original image is divided into a smooth block group and a texture block group according to its texture complexity; the smooth block group and the texture block group are each divided into four pixel sets according to pixel position, all pixels of each pixel set are predicted from the 8 surrounding pixels to generate preprocessed images, and the preprocessed images are taken as the training set, wherein the preprocessed images comprise smooth block preprocessed images and texture block preprocessed images; the method exploits the multiple receptive fields and global optimization capability provided by convolution kernels of different sizes in the convolutional neural network, improves the predictor used for reversible information hiding, improves the accuracy and efficiency of image prediction, and increases the embedding capacity.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a convolutional neural network image prediction method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of smooth blocks and texture blocks provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a pixel image according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another pixel image according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another pixel image according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another pixel image according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of 3×3 convolution kernel pixel utilization provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram of 5×5 convolution kernel pixel utilization provided by an embodiment of the present invention;
FIG. 9 is a flowchart of a data embedding method according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a convolutional neural network with weighted filter enhancement provided by an embodiment of the present invention;
FIG. 11 is a schematic diagram of a weighted filter kernel according to an embodiment of the present invention;
FIG. 12 is a graph showing a comparison of two predictors according to an embodiment of the present invention;
fig. 13 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this embodiment of the invention, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "a and/or b" may represent three cases: a alone, both a and b, and b alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined" or "in response to determining" or "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)", depending on the context.
Fig. 1 is a flowchart of a convolutional neural network image prediction method provided in an embodiment of the present invention, as shown in fig. 1, where the method includes:
step one, dividing an original image into a smooth block group and a texture block group according to texture complexity of the original image.
In an embodiment of the present invention, step one includes:
and S1, calculating the MSE of each image sub-block in the original image through a Mean Square Error (MSE) formula, and setting a threshold T to measure texture complexity.
In the embodiment of the invention, the mean square error (MSE) can be used to evaluate the degree of variation of an image, so the MSE is used to calculate the texture complexity of the image. Let m and n denote the number of rows and columns of the image block respectively, let I denote the original image block, and let IAVE denote the average pixel value of the image block; the MSE of each image sub-block is then calculated according to the formula MSE = (1/(m·n)) Σ_{i=1}^{m} Σ_{j=1}^{n} (I(i,j) − IAVE)².
Step S2, judging whether each image sub-block is a smooth area according to the threshold T set in the step S1.
Step S3, according to the judgment of the step S2, when the MSE of the image sub-block is larger than the threshold T, the image sub-block is a texture area; when the MSE of the image sub-block is smaller than or equal to the threshold T, the image sub-block is a smooth area;
and S4, dividing the smooth block group and the texture block group according to the texture areas and the smooth areas in the step S3.
FIG. 2 is a schematic diagram of smooth blocks and texture blocks provided by an embodiment of the present invention. As shown in FIG. 2, an image block whose pixel values vary greatly is set as a texture area, and an image block whose pixel values vary little is set as a smooth area; the image is divided into 16 blocks, wherein the gray blocks are texture blocks and the white blocks are smooth blocks.
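As an illustration of this block classification step, the following sketch (in Python with NumPy) computes the MSE of each non-overlapping sub-block and labels it as texture or smooth; the block size and the threshold T used here are hypothetical values chosen for illustration rather than values prescribed by the invention.
import numpy as np

def classify_blocks(image, block_size=16, threshold=20.0):
    """Label each non-overlapping block as texture or smooth by its MSE.

    A block whose mean square error around its own mean exceeds the
    threshold T is a texture block, otherwise a smooth block.
    Block size and threshold are illustrative assumptions.
    """
    h, w = image.shape
    labels = {}
    for top in range(0, h - block_size + 1, block_size):
        for left in range(0, w - block_size + 1, block_size):
            block = image[top:top + block_size, left:left + block_size].astype(np.float64)
            mse = np.mean((block - block.mean()) ** 2)  # MSE relative to the block average IAVE
            labels[(top, left)] = "texture" if mse > threshold else "smooth"
    return labels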
Dividing the smooth block group and the texture block group in the first step into four pixel sets according to pixel positions, predicting all pixels of each pixel set from the 8 pixels around them to generate a preprocessed image, and taking the preprocessed image as a training set, wherein the preprocessed image comprises a smooth block preprocessed image and a texture block preprocessed image.
In the embodiment of the invention, the second step comprises the following steps:
as shown in fig. 3 to 6, the smooth block group and the texture block group are respectively divided into red, yellow, blue and green pixel sets according to pixel positions; when the red pixels are predicted, the image input to the predictor is the combination of the yellow, blue and green pixel sets; when the yellow pixels are predicted, the image input to the predictor is the combination of the red, blue and green pixel sets; when the blue pixels are predicted, the image input to the predictor is the combination of the red, yellow and green pixel sets; when the green pixels are predicted, the image input to the predictor is the combination of the red, yellow and blue pixel sets. As shown in fig. 7, when the convolution kernel size is 3×3, pixel point prediction is performed using at most 8 of the 9 pixel values, where pixels 1 to 8 represent available pixels and 0 represents the pixel to be predicted; as shown in fig. 8, when the convolution kernel size is 5×5, pixel point prediction is performed using at most 16 of the 25 pixel values, where pixels 1 to 16 represent available pixels and 0 represents the pixel to be predicted.
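A minimal sketch of this pixel set preprocessing is given below; it assumes that the four sets tile the image block in a 2x2 periodic (checkerboard-like) pattern, which is one plausible reading of the pixel-position division described above, and the set names and helper functions are only illustrative.
import numpy as np

def split_into_four_sets(block):
    """Return four boolean masks selecting the red/yellow/blue/green pixel sets,
    assuming the sets tile the block in a 2x2 periodic pattern (an assumption
    made for illustration)."""
    h, w = block.shape
    rows, cols = np.indices((h, w))
    return {
        "red":    (rows % 2 == 0) & (cols % 2 == 0),
        "yellow": (rows % 2 == 0) & (cols % 2 == 1),
        "blue":   (rows % 2 == 1) & (cols % 2 == 0),
        "green":  (rows % 2 == 1) & (cols % 2 == 1),
    }

def predictor_input(block, masks, target_set):
    """Zero out the target set so the predictor only sees the other three sets."""
    x = block.astype(np.float32).copy()
    x[masks[target_set]] = 0.0
    return x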
In the embodiment of the invention, the method also comprises the embedding of data;
fig. 9 is a flowchart of a data embedding method according to an embodiment of the present invention, as shown in fig. 9, taking data hiding of a yellow image set as an example, the embedding of data includes:
Step V1, determining the secret information S to be embedded, and dividing the secret information S into four parts of the same size, namely S1, S2, S3 and S4, which correspond to the red, yellow, blue and green pixel sets respectively;
Step V2, adaptively allocating predictors to the smooth block preprocessed image and the texture block preprocessed image according to the division in step V1, wherein the smooth block preprocessed image uses the smooth region predictor and the texture block preprocessed image uses the texture region predictor;
step V3, calculating the prediction errors of the predicted image and the pixel set according to the allocation in the step V2, obtaining a prediction error plane and generating a prediction error histogram;
and V4, embedding the secret information S1, S2, S3 and S4 to be embedded into a prediction error plane in the step V3 by utilizing a histogram translation technology, wherein a subset part for embedding the secret information replaces an original subset part, so that a secret-loaded image after embedding the secret information is obtained.
Note that the data extraction process is the reverse of the data embedding process.
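To make the embedding step concrete, the following sketch applies conventional prediction-error histogram shifting to a flattened prediction-error plane; the choice of the peak bin and the +1 shifting convention are common practice in the RDH literature and are used here as assumptions rather than the exact rule mandated by the invention.
import numpy as np

def embed_by_histogram_shifting(errors, bits):
    """Embed a bit sequence into a 1-D prediction-error array by histogram shifting.

    Errors equal to the peak value carry one bit each (peak or peak + 1);
    errors greater than the peak are shifted by +1 to create an empty bin.
    Illustrative sketch only; the peak value must be transmitted for extraction.
    """
    errors = errors.astype(np.int64)
    peak = np.bincount(errors - errors.min()).argmax() + errors.min()  # most frequent error
    out = errors.copy()
    out[errors > peak] += 1                     # shift to make room next to the peak
    bit_iter = iter(bits)
    for idx in np.flatnonzero(errors == peak):  # embed one bit per peak-valued error
        try:
            out[idx] = peak + next(bit_iter)
        except StopIteration:
            break
    return out, peak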
And thirdly, respectively training the weighted-filtering-enhanced convolutional neural network with the smooth block preprocessed image and the texture block preprocessed image obtained in the second step, adjusting the weights of the pixel values around the target pixel by weighted filtering, and expanding the receptive field of the network by hybrid dilated convolution (HDC) to obtain the pixel prediction and generate a predicted image.
In the embodiment of the invention, after the image preprocessing is finished, the texture blocks and the smooth blocks after pixel processing are respectively used as training samples, block classification is optimized, and HDC is used to improve the convolutional neural network pixel predictor, so that the predictor can obtain more suitable coefficients and achieve higher-precision pixel prediction.
As one of the representative algorithms of deep learning, the convolutional neural network has two main characteristics, multiple receptive fields and global optimization capability, and is very suitable for image feature extraction, so it can also be used for pixel prediction. The invention designs an enhanced convolutional neural network pixel predictor to realize accurate pixel prediction.
Fig. 10 is a schematic diagram of a convolutional neural network with weighted filtering enhancement provided in an embodiment of the present invention, and as shown in fig. 10, the convolutional neural network with weighted filtering enhancement is composed of three networks, which are a feature extraction network, a weighted hybrid expansion convolutional network (WHDCnet), and a pixel prediction network, respectively.
In the embodiment of the invention, the feature extraction network is used for extracting features of different layers of images, and consists of three parallel convolution kernels, wherein the sizes of the convolution kernels are 3×3, 5×5 and 7×7 respectively; convolution kernels with different sizes can extract the relation among pixels with different space sizes, so that accurate pixel prediction is realized; the regional features of different layers of the image are extracted from different receptive fields, and after three parallel convolution kernels are processed, a latent feature matrix with 96 dimensions is generated by superposing three feature matrices with 32 dimensions.
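A possible PyTorch rendering of this feature extraction network is sketched below; the padding values and the absence of activations between the parallel branches are assumptions made so that the three 32-channel outputs stay spatially aligned and can be concatenated into a 96-channel latent feature matrix.
import torch
import torch.nn as nn

class FeatureExtraction(nn.Module):
    """Three parallel convolutions (3x3, 5x5, 7x7), each producing 32 channels,
    concatenated into a 96-channel latent feature map."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.conv3 = nn.Conv2d(in_channels, 32, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_channels, 32, kernel_size=5, padding=2)
        self.conv7 = nn.Conv2d(in_channels, 32, kernel_size=7, padding=3)

    def forward(self, x):
        return torch.cat([self.conv3(x), self.conv5(x), self.conv7(x)], dim=1)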
Although the traditional expanded (dilated) convolution can enlarge the receptive field, it may cause partial loss of pixel values and discontinuity of the receptive field; the weighted mixed expansion convolution network designed by the invention remedies these defects.
In the embodiment of the invention, the weighted mixed expansion convolution network is used for expanding the receptive field range to assist in extracting image features so as to improve network efficiency and pixel prediction accuracy, and the weighted mixed expansion convolution network consists of a weighted filter kernel and three serial expansion convolution kernels, wherein the weighted filter kernel is used for balancing weight attributes of pixels at different distances; as shown in fig. 11, the numbers represent the utilization weights of different pixel positions, the higher weight is set at the position nearer to the target pixel, and the lower weight is set at the pixel value farther from the target pixel, so that surrounding pixels are reasonably utilized, and the pixel prediction performance is improved. Three serial expansion convolution kernels, wherein the size of each convolution kernel is 3 multiplied by 3, and the expansion rates are 1,2 and 3 respectively; the expansion rate is set in such a way that the receptive field range can be enlarged, and the problem of related pixel information loss can not be caused. The images are input into a weighted mixed expansion convolution network, and a 32-channel feature map is obtained through wide-area feature extraction.
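One way such a weighted mixed expansion convolution network could be realized in PyTorch is sketched below: a fixed distance-based weighting kernel followed by three serial 3x3 dilated convolutions with dilation rates 1, 2 and 3, yielding a 32-channel feature map. The specific weighting values (Fig. 11 is not reproduced here) and the activations between the dilated convolutions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedHDC(nn.Module):
    """Weighted filter kernel followed by three serial dilated convolutions."""
    def __init__(self, in_channels=96, out_channels=32):
        super().__init__()
        # Illustrative 3x3 weighting kernel: positions nearer the target pixel
        # receive larger weights; the centre (target) position is excluded.
        w = torch.tensor([[0.5, 1.0, 0.5],
                          [1.0, 0.0, 1.0],
                          [0.5, 1.0, 0.5]])
        w = w / w.sum()
        self.register_buffer("weight_kernel", w.expand(in_channels, 1, 3, 3).clone())
        self.in_channels = in_channels
        self.dilated = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, padding=1, dilation=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=2, dilation=2),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=3, dilation=3),
        )

    def forward(self, x):
        # Depthwise weighted filtering that rebalances neighbouring pixel values.
        x = F.conv2d(x, self.weight_kernel, padding=1, groups=self.in_channels)
        return self.dilated(x)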
In the embodiment of the invention, the pixel prediction network is used to optimize the feature matrix and generate the predicted image. The network consists of four convolution blocks, each of which consists of three convolution layers and two LeakyReLU activation functions; the kernel size of each convolution layer is 3×3, the channels of the first three blocks are set to 128, and the output of the last block is the predicted image. The outputs of different convolution blocks are connected by residual connections in order to enrich the image features and reduce network gradient degradation, thereby providing sufficient loss for network optimization. The low-dimensional features and the high-dimensional features are concatenated to aggregate the image features, which accelerates network optimization and improves pixel prediction accuracy. A mean square error loss is then calculated from the predicted image and one quarter of the preprocessed image.
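A corresponding PyTorch sketch of the pixel prediction network is given below; the input channel count (assumed here to be 128, e.g. the 96-channel and 32-channel feature maps concatenated) and the exact placement of the residual connections are assumptions, since the text only specifies four convolution blocks of three 3x3 convolution layers and two LeakyReLU each, 128 channels for the first three blocks, and residual connections between block outputs.
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Three 3x3 convolution layers with two LeakyReLU activations between them."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.LeakyReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.LeakyReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
    )

class PixelPrediction(nn.Module):
    def __init__(self, in_channels=128, hidden=128):
        super().__init__()
        self.block1 = conv_block(in_channels, hidden)
        self.block2 = conv_block(hidden, hidden)
        self.block3 = conv_block(hidden, hidden)
        self.block4 = conv_block(hidden, 1)   # the last block outputs the predicted image

    def forward(self, x):
        f1 = self.block1(x)
        f2 = self.block2(f1) + f1             # residual connections between block outputs
        f3 = self.block3(f2) + f2
        return self.block4(f3)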
To optimize the performance of the proposed predictor, 3000 images were randomly selected from the ImageNet dataset to train the improved pixel prediction scheme based on convolutional neural networks. The selected images are first converted to grayscale images of size 512×512. These images are then segmented according to the method described above and divided into two classes, a texture set and a smooth set, according to the complexity of the image blocks. The smooth set and the texture set are then used to train the convolutional neural network-based predictors respectively. The present invention employs the Adam optimizer to optimize the proposed pixel prediction algorithm, with the batch size set to 4.
In the embodiment of the invention, the loss function loss in the training process has the following formula:
loss = (1/N) Σ_{i=1}^{N} ‖y_i − F(x_i; W, b)‖²
wherein N is the number of training data, y_i is the target image, F(x_i; W, b) is the predicted image produced by the network, b is the bias, and W denotes all the weight values in the network.
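Under the training configuration described above (Adam optimizer, batch size 4, mean square error loss between the predicted image and one quarter of the preprocessed image), the training loop could look like the following sketch; the model and data loader objects, as well as the learning rate, are placeholders rather than values given by the invention.
import torch
import torch.nn.functional as F

def train(model, loader, epochs=1, lr=1e-4):
    """Train the predictor with Adam and a mean square error loss (sketch only)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # lr is an assumed value
    model.train()
    for _ in range(epochs):
        for masked_input, target_quarter in loader:   # batches of 4 as described above
            optimizer.zero_grad()
            prediction = model(masked_input)
            loss = F.mse_loss(prediction, target_quarter)
            loss.backward()
            optimizer.step()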
The invention first provides a two-stage image preprocessing strategy. The image is segmented into non-overlapping blocks, which are further divided into a smooth class and a texture class according to the complexity of the pixel distribution; each sub-block is then divided into four parts according to pixel position, and the pixels to be predicted are predicted from the eight surrounding pixels. The texture set and the smooth set obtained by the two-stage preprocessing are used to train the proposed convolutional neural network predictor respectively, so that pixel predictors adapted to different region characteristics are obtained with optimal coefficients, further improving the accuracy of pixel prediction. In addition, a predictor enhanced by weighted filtering and mixed expansion convolution (HDC) is carefully designed: because pixels at different distances deviate to different degrees from the characteristics of the current pixel to be predicted, weighted filtering is used to adjust the utilization weights of the pixel values around the target pixel, and the utilization weight of singular points can be reduced so that the surrounding pixels are used effectively. HDC can enlarge the receptive field of the network without increasing the computational cost and extract the deep correlation of adjacent pixels, thereby improving the pixel prediction accuracy.
In the invention, the image is divided into texture blocks and smooth blocks, and 8 of the 9 pixel points can be used for prediction, so that the convolutional neural network-based pixel predictor is well optimized. Meanwhile, the pixel values of the feature image are optimized by the weighting filter, which improves the global optimization capability; the correlation between adjacent pixels is further exploited by mixed expansion convolution (HDC), which further enlarges the receptive field and promotes high-precision pixel prediction. The pixel prediction performance for reversible information hiding is improved, and the embedding capacity of the carrier is increased. Table 1 compares the prediction performance of the convolutional neural network error predictor and the median edge predictor on the Lena image.
TABLE 1
As shown in table 1, the mean square error, the mean and the variance are selected as the evaluation criteria of the predictors' prediction performance, and it can be seen that the prediction performance of the convolutional neural network-based predictor is far higher than that of the traditional predictor.
Fig. 12 is a schematic diagram comparing the two predictors according to an embodiment of the present invention. As shown in fig. 12, on the Lena image the peak of the prediction error histogram of the convolutional neural network predictor is far higher than the peak of the difference image between the prediction image of the median edge predictor and the target pixels.
Compared with the prior art, the invention has the following beneficial effects:
1. The traditional pixel prediction strategy has a low utilization rate of pixel points and performs linear prediction on the target pixel, so global features cannot be perceived and the prediction accuracy is poor. The invention provides a novel image pixel preprocessing strategy in which up to 7/8 of the pixels can be utilized for prediction; the increased number of usable pixels is beneficial to improving the prediction accuracy.
2. The conventional prediction algorithm mechanically utilizes pixels at different distances in an image and cannot reasonably allocate pixel utilization weights, so pixel values that differ greatly from the target pixel are easily over-utilized, which degrades the prediction accuracy. The invention uses the weighted filter kernel, which is beneficial to reasonably allocating the weights of pixels in different ranges.
In the embodiment of the invention, each step can be executed by the electronic equipment. For example, electronic devices include, but are not limited to, tablet computers, portable PCs, desktop PCs, and the like.
According to the technical scheme, the original image is divided into a smooth block group and a texture block group according to its texture complexity; the smooth block group and the texture block group are each divided into four pixel sets according to pixel position, all pixels of each pixel set are predicted from the 8 surrounding pixels to generate preprocessed images, and the preprocessed images are taken as the training set, wherein the preprocessed images comprise smooth block preprocessed images and texture block preprocessed images; the method exploits the multiple receptive fields and global optimization capability provided by convolution kernels of different sizes in the convolutional neural network, improves the predictor used for reversible information hiding, improves the accuracy and efficiency of image prediction, and increases the embedding capacity.
The embodiment of the invention provides a computer readable storage medium, which comprises a stored program, wherein the electronic equipment where the computer readable storage medium is located is controlled to execute the embodiment of the convolutional neural network image prediction method based on weighted filtering enhancement when the program runs.
Fig. 13 is a schematic diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 13, an electronic device 21 includes: the processor 211, the memory 212, and the computer program 213 stored in the memory 212 and executable on the processor 211, the computer program 213 when executed by the processor 211 implements the convolutional neural network image prediction method based on weighted filtering enhancement in the embodiment, and is not described herein in detail to avoid repetition.
The electronic device 21 includes, but is not limited to, a processor 211, a memory 212. It will be appreciated by those skilled in the art that fig. 13 is merely an example of an electronic device 21 and is not meant to be limiting as to the electronic device 21, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., an electronic device may also include an input-output device, a network access device, a bus, etc.
The processor 211 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 212 may be an internal storage unit of the electronic device 21, such as a hard disk or a memory of the electronic device 21. The memory 212 may also be an external storage device of the electronic device 21, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 21. Further, the memory 212 may also include both internal storage units and external storage devices of the electronic device 21. The memory 212 is used to store computer programs and other programs and data required by the network device. The memory 212 may also be used to temporarily store data that has been output or is to be output.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The foregoing description is merely of preferred embodiments of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement or the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (9)

1. A convolutional neural network image prediction method based on weighted filtering enhancement, the method comprising:
step one, dividing an original image into a smooth block group and a texture block group according to texture complexity of the original image;
dividing the smooth block group and the texture block group in the first step into four pixel sets according to pixel positions, predicting all pixels of each pixel set from the 8 pixels around them to generate a preprocessed image, and taking the preprocessed image as a training set, wherein the preprocessed image comprises a smooth block preprocessed image and a texture block preprocessed image;
the second step comprises the following steps:
dividing the smooth block group and the texture block group into red, yellow, blue and green pixel sets according to pixel position; when the red pixels are predicted, the image input to the predictor is the combination of the yellow, blue and green pixel sets; when the yellow pixels are predicted, the image input to the predictor is the combination of the red, blue and green pixel sets; when the blue pixels are predicted, the image input to the predictor is the combination of the red, yellow and green pixel sets; when the green pixels are predicted, the image input to the predictor is the combination of the red, yellow and blue pixel sets; when the convolution kernel size is 3×3, at most 8 of the 9 pixel values are used for pixel point prediction; when the convolution kernel size is 5×5, at most 16 of the 25 pixel values are used for pixel point prediction;
and thirdly, training the weighted-filtering-enhanced convolutional neural network with the smooth block preprocessed image and the texture block preprocessed image obtained in the second step, adjusting the weights of the pixel values around the target pixel by weighted filtering, and expanding the receptive field of the network by hybrid dilated convolution (HDC) to obtain the pixel prediction and generate a predicted image.
2. The method of claim 1, wherein the step one includes:
s1, calculating MSE of each image sub-block in an original image through a mean square error MSE formula, and setting a threshold T to measure texture complexity;
step S2, judging whether each image sub-block is a smooth area or not according to the threshold T set in the step S1;
step S3, according to the judgment of the step S2, when the MSE of the image sub-block is larger than the threshold T, the image sub-block is a texture area; when the MSE of the image sub-block is smaller than or equal to the threshold T, the image sub-block is a smooth area;
and S4, dividing the smooth block group and the texture block group according to the texture areas and the smooth areas in the step S3.
3. The method of claim 1, wherein the weighted filter enhanced convolutional neural network consists of three networks, a feature extraction network, a weighted hybrid dilation convolutional network WHDCnet, and a pixel prediction network, respectively.
4. A method according to claim 3, wherein the feature extraction network is used to extract features of different levels of the image, consisting of three parallel convolution kernels of 3 x 3, 5 x 5 and 7 x 7, respectively; the regional features of different layers of the image are extracted from different receptive fields, and after three parallel convolution kernels are processed, a latent feature matrix with 96 dimensions is generated by superposing three feature matrices with 32 dimensions.
5. A method according to claim 3, wherein the weighted hybrid dilation convolutional network is used to expand the receptive field range to assist in extracting image features, and is composed of a weighted filter kernel and three serial dilation convolutional kernels, wherein the weighted filter kernel is used to balance the weight attributes of pixels at different distances, the three serial dilation convolutional kernels have a size of 3 x 3, and the dilation rates are 1,2, and 3, respectively; the images are input into a weighted mixed expansion convolution network, and a 32-channel feature map is obtained through wide-area feature extraction.
6. A method according to claim 3, wherein the pixel prediction network is configured to optimize the feature matrix to generate a predicted image; the network consists of four convolution blocks, each of which consists of three convolution layers and two LeakyReLU activation functions; the kernel size of each convolution layer is 3×3, the channels of the first three blocks are set to 128, and the output of the last block is the predicted image; meanwhile, the outputs of different convolution blocks are connected by residual connections, and the low-dimensional features and the high-dimensional features are concatenated to aggregate the image features; and a mean square error loss is calculated from the predicted image and one quarter of the preprocessed image.
7. The method of claim 1, wherein the loss function loss during training is formulated as:
loss = (1/N) Σ_{i=1}^{N} ‖y_i − F(x_i; W, b)‖²
wherein N is the number of training data, y_i is the target image, F(x_i; W, b) is the predicted image produced by the network, b is the bias, and W denotes all the weight values in the network.
8. The method of claim 1, further comprising embedding data;
the embedding of the data comprises:
Step V1, determining the secret information S to be embedded, and dividing the secret information S into four parts of the same size, namely S1, S2, S3 and S4, which correspond to the red, yellow, blue and green pixel sets respectively;
Step V2, adaptively allocating predictors to the smooth block preprocessed image and the texture block preprocessed image according to the division in step V1, wherein the smooth block preprocessed image uses the smooth region predictor and the texture block preprocessed image uses the texture region predictor;
step V3, calculating the prediction errors of the predicted image and the pixel set according to the allocation in the step V2, obtaining a prediction error plane and generating a prediction error histogram;
and V4, embedding the secret information S1, S2, S3 and S4 to be embedded into a prediction error plane in the step V3 by utilizing a histogram translation technology, wherein a subset part for embedding the secret information replaces an original subset part, so that a secret-loaded image after embedding the secret information is obtained.
9. An electronic device, comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions that, when executed by the apparatus, cause the apparatus to perform the weighted filter enhancement based convolutional neural network image prediction method of any one of claims 1-8.
CN202410041032.XA 2024-01-11 2024-01-11 Convolutional neural network image prediction method based on weighted filtering enhancement Active CN117557807B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410041032.XA CN117557807B (en) 2024-01-11 2024-01-11 Convolutional neural network image prediction method based on weighted filtering enhancement


Publications (2)

Publication Number Publication Date
CN117557807A CN117557807A (en) 2024-02-13
CN117557807B true CN117557807B (en) 2024-04-02

Family

ID=89823659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410041032.XA Active CN117557807B (en) 2024-01-11 2024-01-11 Convolutional neural network image prediction method based on weighted filtering enhancement

Country Status (1)

Country Link
CN (1) CN117557807B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10475165B2 (en) * 2017-04-06 2019-11-12 Disney Enterprises, Inc. Kernel-predicting convolutional neural networks for denoising

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107492070A (en) * 2017-07-10 2017-12-19 华北电力大学 A kind of single image super-resolution computational methods of binary channels convolutional neural networks
KR20190119261A (en) * 2018-04-12 2019-10-22 가천대학교 산학협력단 Apparatus and method for segmenting of semantic image using fully convolutional neural network based on multi scale image and multi scale dilated convolution
CN110619613A (en) * 2019-09-23 2019-12-27 云南电网有限责任公司电力科学研究院 Image sharpening method, electronic device and computer-readable storage medium
CN113949876A (en) * 2020-07-16 2022-01-18 三星电子株式会社 Image sensor module, image processing system, and image compression method
CN113052006A (en) * 2021-02-19 2021-06-29 中南大学 Image target detection method and system based on convolutional neural network and readable storage medium
CN113382126A (en) * 2021-05-31 2021-09-10 山东师范大学 Image reversible information hiding method and system based on attention guidance
CN113744113A (en) * 2021-09-09 2021-12-03 中国科学技术大学 Image reversible information hiding method driven by convolutional neural network
CN115131189A (en) * 2022-06-30 2022-09-30 齐鲁工业大学 Image reversible information hiding method and system based on convolutional neural network
CN115766963A (en) * 2022-11-11 2023-03-07 辽宁师范大学 Encrypted image reversible information hiding method based on self-adaptive predictive coding
CN116258867A (en) * 2022-11-24 2023-06-13 中国人民解放军陆军工程大学 Method for generating countermeasure sample based on low-perceptibility disturbance of key region
CN116740121A (en) * 2023-06-15 2023-09-12 吉林大学 Straw image segmentation method based on special neural network and image preprocessing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Single-Shot Refinement Neural Network for Object Detection; Shifeng Zhang et al.; arXiv; 20171118; pp. 1-14 *
An improved adaptive non-local means image denoising method; 涂刚 et al.; 《控制工程》 (Control Engineering of China); 20160630; pp. 839-843 *

Also Published As

Publication number Publication date
CN117557807A (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN110889852B (en) Liver segmentation method based on residual error-attention deep neural network
CN112116605A (en) Pancreas CT image segmentation method based on integrated depth convolution neural network
US11354797B2 (en) Method, device, and system for testing an image
CN110866938B (en) Full-automatic video moving object segmentation method
CN110223304B (en) Image segmentation method and device based on multipath aggregation and computer-readable storage medium
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN110084309B (en) Feature map amplification method, feature map amplification device, feature map amplification equipment and computer readable storage medium
WO2010043954A1 (en) Method, apparatus and computer program product for providing pattern detection with unknown noise levels
WO2019065703A1 (en) Information processing device
Oyama et al. Influence of image classification accuracy on saliency map estimation
JP2024508867A (en) Image clustering method, device, computer equipment and computer program
CN114724218A (en) Video detection method, device, equipment and medium
CN112749576B (en) Image recognition method and device, computing equipment and computer storage medium
Abdulqader et al. Plain, edge, and texture detection based on orthogonal moment
CN117557807B (en) Convolutional neural network image prediction method based on weighted filtering enhancement
CN116542924A (en) Prostate focus area detection method, device and storage medium
CN114972155B (en) Polyp image segmentation method based on context information and reverse attention
CN115713769A (en) Training method and device of text detection model, computer equipment and storage medium
CN115512207A (en) Single-stage target detection method based on multipath feature fusion and high-order loss sensing sampling
CN109766924A (en) Image detecting method based on image information entropy Yu adaptive threshold DAISY characteristic point
CN115033721A (en) Image retrieval method based on big data
Wang et al. Face super-resolution via hierarchical multi-scale residual fusion network
CN114862763A (en) Gastric cancer pathological section image segmentation prediction method based on EfficientNet
CN114202694A (en) Small sample remote sensing scene image classification method based on manifold mixed interpolation and contrast learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant