CN110796622B - Image bit enhancement method based on multi-layer characteristics of series neural network - Google Patents


Info

Publication number
CN110796622B
CN110796622B (application number CN201911043280.3A)
Authority
CN
China
Prior art keywords
bit
image
images
layer
network
Prior art date
Legal status
Active
Application number
CN201911043280.3A
Other languages
Chinese (zh)
Other versions
CN110796622A (en)
Inventor
于洁潇
张春萍
刘婧
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201911043280.3A
Publication of CN110796622A
Application granted
Publication of CN110796622B

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T5/00 Image enhancement or restoration
            • G06T5/90 Dynamic range modification of images or parts thereof
          • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/20 Special algorithmic details
              • G06T2207/20081 Training; Learning
              • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image bit enhancement method based on the multi-layer features of a series neural network. The method comprises the following steps: constructing a training set, quantizing the high-bit images of the training set into low-bit images, taking the pixel difference between the high-bit and low-bit images to obtain residual images, and zero-padding the low-bit images to obtain zero-padded high-bit images; removing the random variables from the VAE network, feeding the feature map generated by the encoder directly into the decoder, and building a deep learning network model on this basis; adding multiple serial skip connections to the network model so that each layer's feature map is passed to all subsequent layers; feeding the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network with an Adam optimizer; and quantizing the high-bit images of the test set into low-bit images, feeding the zero-padded high-bit images into the network loaded with the trained model parameters to generate residual images, and adding the residual images and the low-bit images pixel-wise to obtain the reconstructed high-bit images.

Description

Image bit enhancement method based on multi-layer characteristics of series neural network
Technical Field
The invention relates to the field of deep neural networks, in particular to an image bit enhancement method based on multi-layer characteristics of a series neural network.
Background
With the development and spread of the visual information industry, people's requirements on the visual quality provided by displays keep rising. High-definition and HDR (High Dynamic Range) displays can greatly expand the displayed brightness range, show more bright and dark details, and bring richer colors and more vivid, natural detail to pictures, making them closer to what the human eye perceives. High-definition and HDR displays are therefore becoming mainstream devices in the market.
However, due to the limitations of current capture devices, each color channel of each pixel in most images and videos is stored with 8 bits, so each color channel can show at most 256 levels. Some webcams even use only 5, 6, and 5 bits to represent the red, green, and blue channels, respectively. In addition, high-bit images are often reduced to low bit depth when images and videos are compressed at high ratios.
When a low-bit image is simply converted and displayed on a high-bit display, obvious false-contour artifacts appear, and color distortion occurs in regions of high brightness [1]. Research on image bit-depth enhancement is therefore of great value.
Image Bit-depth Enhancement is a technique for improving image quality by overcoming the inherent limitations of imaging hardware such as image sensors, i.e., reconstructing a high-bit image from a low-bit image by means of an algorithm. Bit enhancement algorithms fall into 3 research directions: methods based on simple calculation, on interpolation, and on deep learning. Methods based on simple calculation can effectively raise the bit depth of an image but do not solve the false-contour problem well. Interpolation-based algorithms aim to reconstruct the bit information lost by the degraded image and can largely eliminate false contours, but they generally blur image details and light-colored contours and cannot reconstruct images with complex structures.
In recent years, convolutional neural networks have become a research hotspot in computer vision by virtue of their strong feature-learning and modeling capability. In tasks such as semantic segmentation [2,3], image super-resolution [4], object recognition and tracking [5,6], and style transfer [7], they obtain better results than traditional algorithms. Image bit enhancement based on a simple convolutional neural network has shown that a deep learning network can learn more features and can effectively blur false contours and reconstruct a high-bit image; however, in larger color-transition regions the false contours cannot be completely eliminated, leading to low visual quality.
Disclosure of Invention
The invention provides an image bit enhancement method based on the multi-layer features of a series neural network, which indirectly restores the high bit depth image by restoring a residual image structurally similar to the low bit depth image and adding the reconstructed residual image to the low bit depth image. In addition, the features of each layer are concatenated and fed into the subsequent convolutional layers, so that bottom-layer features are passed directly into the higher convolutional layers and a high-quality high-bit image is generated quickly and accurately, as detailed in the following description:
An image bit enhancement method based on the multi-layer features of a series neural network, the bit enhancement method comprising:
constructing a training set, quantizing high-bit images of the training set into low-bit images, obtaining a residual image by pixel difference between the high-bit images and the low-bit images, and obtaining zero-filling high-bit images by zero-filling the low-bit images;
removing random variables in the VAE network, directly inputting the characteristic diagram generated by the encoder into a decoder, and establishing a deep learning network model on the basis of the characteristic diagram;
adding a plurality of series jump connections in a network model, and transmitting each layer characteristic diagram to all the following layers;
inputting the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network by gradient descent on a perceptual loss function with an Adam optimizer;
and quantizing the high-bit images of the test set into low-bit images, inputting the zero-padding high-bit images into the network loaded with the parameters of the training model to generate residual images, and adding the residual images and the low-bit images in pixels to obtain reconstructed high-bit images.
The constructing of the training set further comprises:
the training set consisted of 1000 pictures randomly selected from the Sintel database.
The deep learning network model is specifically as follows:
The convolutional neural network takes a VAE network as the backbone and consists of 8 convolutional layers and 8 transposed convolutional layers, each layer followed by a batch normalization layer and an activation-function ReLU layer.
Wherein the series multi-layer feature is specifically:
the features of the layers are concatenated in the depth direction, i.e.

$$X_{i+1} = [X_{i-1}, X_i]$$

wherein $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.
The loss function is specifically as follows:

$$L_{perceptual} = \sum_{i=1}^{N_c} \frac{1}{W_{i,j} H_{i,j}} \left\| \phi_{i,j}(I_{Res}) - \phi_{i,j}(\hat{I}_{Res}) \right\|_2^2$$

wherein $\phi_{i,j}$ denotes the feature map output by the $j$th convolutional layer of the $i$th convolutional block of the pre-trained network, $W_{i,j}$ and $H_{i,j}$ denote the width and height of that feature map, $I_{Res}$ denotes the true residual image, and $\hat{I}_{Res}$ denotes the residual image reconstructed by the convolutional neural network; $N_c = 5$, and $\phi$ uses layer 2 of the 1st convolutional block, layer 2 of the 2nd convolutional block, layer 3 of the 3rd convolutional block, layer 4 of the 4th convolutional block, and layer 4 of the 5th convolutional block of the pre-trained VGG-19 network.
Wherein, the test set is specifically as follows:
The test set consists of 50 images randomly drawn from the Sintel database outside the training set and all images in the UST-HK and KODAK databases.
The technical scheme provided by the invention has the beneficial effects that:
1. The invention uses a VAE (Variational Auto-Encoder) network [8] as the backbone to generate the residual image between the high-bit image and the low-bit image, preserving more structural information while reducing network computation.
2. The invention transmits the characteristic graph of each layer to all the following layers through the serial jump connection, the serial jump connection can stabilize the image gradient and directly provide independent detail characteristics of the bottom layer for a high-level neural network, and the obtained high-bit image has higher visual quality and better objective evaluation result.
Drawings
FIG. 1 is a block diagram of an image bit enhancement method based on a series neural network multi-layer feature;
FIG. 2 is a convolutional layer of a convolutional neural network;
FIG. 3 is a transposed convolutional layer of a convolutional neural network.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
The embodiment of the invention provides a multi-feature-fusion convolutional neural network based on a variational auto-encoder for image bit enhancement; the network model is optimized by gradient descent on a perceptual loss function. The method comprises the following steps:
101: sintel for high bit lossless picture quality [9] 、UST-HK [10] 、KODAK [11] Preprocessing the images in the database, firstly quantizing the high-bit images into low-bit images, and then performing pixel difference between the high-bit images and the low-bit images to obtain a residual image.
The Sintel database comes from an animated short film with lossless image quality, while the UST-HK and KODAK databases are real photographs. 1000 pictures randomly selected from the Sintel database form the training set; all pictures in the UST-HK and KODAK databases, plus 50 Sintel pictures outside the training set, form the test set.
102: the present invention takes an improved VAE network as a backbone network. The improved VAE network consists of a Convolutional Layer and a Transposed Convolutional Layer (Transposed Convolutional Layer) [12] Two parts are formed. On the premise of ensuring the maximum information transmission between layers in the network, all the layers are directly connected in series. In order to ensure the feedforward characteristic, each layer splices the input of all the previous layers, and transmits the output characteristic diagram to all the subsequent layers, thereby ensuring the structural integrity of the generated residual diagram to the maximum extent.
103: in the training stage, the zero-padding high-bit image obtained by inverse quantization of the low-bit image of the training set is used as network input, andtaking gradient descent perception Loss (Perceptial Loss) between a residual image generated by a network and a real residual image as a Loss function, and passing through an Adam optimizer [13] The gradient descent loss function trains the network model parameters.
104: in the testing stage, the low-bit images of the testing set are inversely quantized to obtain zero-padding high-bit images, the high-bit images are used for generating a residual error map through an improved VAE network loaded with training model parameters, and the low-bit images and the generated residual error map are added according to pixels to obtain the high-bit images. The validity of the method is verified by calculating the similarity between the generated high-bit image and the true high-bit image using the relevant objective evaluation criterion.
In summary, through steps 101 to 104 the embodiment of the present invention designs a bit depth enhancement network based on multi-feature concatenation within a VAE network. The low-bit image is inversely quantized into a zero-padded high-bit image, which is fed to the network as the input image to generate the residual image between the high-bit and low-bit images; adding the residual image to the low-bit image yields a higher-quality high-bit image. The invention designs the network from the angle of optimizing the residual image, adds serial skip connections to strengthen the network's learning ability, and trains the network parameters by gradient descent on a perceptual loss function, so that the reconstructed high-bit image has high subjective visual quality.
Example 2
The scheme of Example 1 is described in further detail below:
201: because the Sintel database formed by the animation images is completely generated by computer software, and the images have no noise interference, the images in the Sintel database have smoother color gradient structures, and the edges and textures in the images are clearer. Such near-ideal structural features can help the neural network learn the features of smooth regions and edge structures, help the model to reconstruct color gradient structures in the image and keep the contours relatively sharp, so the deep neural network proposed herein is trained with Sintel animated images. The UST-HK, KODAK database and part of Sintel consisting of the real shot pictures are used as a test set to verify the effect of the invention.
Considering the image structure features at the same position in the low bit depth image and the corresponding high bit depth image, the key task of image bit depth enhancement is to remove the false-contour structures and restore smooth color transitions while preserving detail texture. When a neural network reconstructs the high bit depth image directly, its output pixel values span a very large range, and reconstructing the color-gradient trend requires handling the large false-contour structures around it, so direct reconstruction over such a large pixel range performs poorly. Bit enhancement can also be achieved by reconstructing, from the low bit depth image, the residual image between the high bit depth image and its linearly amplified version, and indirectly restoring the high bit depth image by adding the residual image to the linearly amplified image; this indirect restoration is easier. Here the linear amplification can be implemented by a Zero Padding algorithm, so the linearly amplified high bit depth image is represented by a zero-padded image. When a 16-bit image is reconstructed directly, the network must output pixel values in the range 0 to 65,535; when the high bit depth image is restored indirectly by reconstructing the residual image, the network outputs values in the range 0 to 4,095 (for example, enhancing a 4-bit image to 16 bits loses 12 bits, bounding the residual by 2^12 - 1 = 4,095), so reconstructing the residual image is easier for the neural network.
In addition, the structural features of false contours and real edges differ greatly in the residual image. Real edges and textures lose little information during quantization, so false contours appear faint in the residual image with low pixel values, whereas real edges retain sharp structures that differ greatly from false contours. This structural difference helps the neural network learn to distinguish false contours from real edges and reconstruct the corresponding structures as the task expects.
Based on the above analysis, the invention proposes to indirectly restore the high bit depth image by reconstructing the residual image between the high bit depth image and the zero-padded image, and adding the restored residual image to the zero-padded image pixel-wise. This accomplishes the same task as restoring the high bit depth image end to end, but is simpler than direct restoration and reconstructs the high bit depth image better. The preprocessing of the database images therefore comprises two steps, as sketched below: 1) converting the low-bit images into zero-padded high-bit images by the zero-padding algorithm; 2) subtracting the zero-padded high-bit images from the high-bit database images pixel-wise to obtain the residual images. The zero-padded high-bit image serves as the input to the improved VAE network, and the residual image is compared against the residual image generated by the network to compute the loss that trains the model.
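A minimal sketch of this preprocessing, assuming integer NumPy images and illustrative bit depths (16-bit originals quantized to 4 bits; the function name is ours):

```python
import numpy as np

def make_training_triple(high_bit_img, high_bits=16, low_bits=4):
    """Preprocessing for one database image: quantize to a low-bit image,
    zero-pad it back to the high bit depth, and take the pixel-wise
    difference as the residual target."""
    shift = high_bits - low_bits
    low_bit = high_bit_img >> shift        # 1) quantization: drop the low bits
    zero_padded = low_bit << shift         # 2) zero-padding inverse quantization
    residual = high_bit_img - zero_padded  # 3) residual in [0, 2**shift - 1]
    return zero_padded, residual
```

For these assumed depths the residual target falls in 0 to 4,095, matching the range discussed above.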
202: in a conventional VAE network, the potential distribution of encoders is randomly sampled and input to a decoder, thereby generating an image that is completely different from the input image. This runs contrary to the task of reconstructing a high bit image from a low bit image, which is mainly used to generate a completely new image different from the input image, and the image bit enhancement algorithm requires the generation of a high bit image from a low bit image with similar structural information and smooth color transition regions. Therefore, the invention adopts an improved VAE network as a base network, and the characteristic diagram generated by the encoder is directly input to the decoder to recover the high-bit image. This ensures that the reconstructed high bit image and low bit image have similar structural and content characteristics.
As shown in FIG. 1, the VAE convolutional network adopts a convolutional-layer and transposed-convolutional-layer structure. The convolutional layers extract the local structural features of the input image, retaining the main image content; the transposed convolutional layers take those local structural features as input to compensate for detail information. In addition, multiple skip connections are added between the two parts to splice features of different semantic levels, which helps the model generate a residual map with complete structural information.
Transposed convolution, also called deconvolution, is often used in CNNs to upsample feature maps; it can restore the size and structural information of an image and thus yields a higher-quality residual map. Let F be the image or feature map input to the transposed convolution, with M channels f_1, f_2, ..., f_M. Each channel f_m is the linear sum of N latent feature maps k_n convolved with convolution kernels g_{m,n}. Formulated as:

$$f_m = \sum_{n=1}^{N} g_{m,n} * k_n$$

where $*$ denotes the two-dimensional convolution operation.
Conventional VAE networks use strided convolution and deconvolution layers to downsample and upsample images, respectively. The sampling in strided convolution loses image detail, reducing the visual quality of the reconstructed image. Therefore, strided convolution is not used in the deep network of the present invention, so as to preserve as many structural features as possible. For the same reason, the widely used pooling and unpooling layers are not employed in the network.
In view of computational complexity, the network of the present invention consists of 8 convolutional layers and 8 transposed convolutional layers. As shown in FIGS. 2 and 3, each layer is followed by a BN layer (Batch Normalization) [14] and an activation-function ReLU layer [15] (Rectified Linear Unit). The BN layer greatly reduces computation cost and run time, and the ReLU layer alleviates the problems of gradient vanishing and overfitting; adding these two layers markedly improves the visual quality of the generated residual map. The 3x3 convolution kernel adopted in the invention exploits more of the feature map's local structure than a 1x1 kernel, while compared with a 9x9 kernel it both favors the generation of sharp edges and reduces model run time.
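A sketch of one such layer pair in PyTorch, under the constraints stated above (3x3 kernels, stride 1, no pooling, BN + ReLU after every layer); the channel arguments and helper names are our assumptions:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Encoder layer: 3x3 convolution (stride 1; no strided convolution
    or pooling, per the text), then batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def deconv_block(in_ch, out_ch):
    """Decoder layer: 3x3 transposed convolution with the same BN + ReLU."""
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```

With stride 1 and padding 1, both blocks preserve spatial size, consistent with the decision to avoid strided sampling.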
Furthermore, increasing network depth means that input and gradient information pass through many layers, which easily causes the gradient-vanishing problem and prevents the model from achieving satisfactory results. To solve this, multiple skip connections are added to the network: each layer concatenates the inputs of all preceding layers and passes its output feature map to all subsequent layers, so each layer can directly use the gradient of the loss function and the initial input information. The features of the layers are concatenated along the depth direction, formulated as:

$$X_{i+1} = [X_{i-1}, X_i]$$

where $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.

This connection pattern makes feature and gradient propagation more effective: the network suffers less from gradient vanishing and is easier to train. In addition, because each layer's output feature map is the input to all subsequent layers, shallow features from the convolutional layers can be fed directly into the deconvolutional layers; the network need not relearn redundant feature maps, has fewer parameters than a conventional convolutional neural network, and obtains the residual map faster.
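A minimal sketch of this serial concatenation pattern, reusing the `conv_block` helper from the previous sketch; the number of layers and channel counts are illustrative, not taken from the patent:

```python
import torch
import torch.nn as nn

class SeriesSkipStack(nn.Module):
    """Serial skip connections: every layer receives the depth-wise
    concatenation of the initial input and all earlier outputs, and its
    own output is forwarded to all later layers."""

    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        features = [x]
        for block in self.blocks:
            out = block(torch.cat(features, dim=1))  # concatenate along depth
            features.append(out)                     # pass to all later layers
        return torch.cat(features, dim=1)

# Example: four conv layers whose input channels grow as features accumulate.
stack = SeriesSkipStack([conv_block(1 + 32 * i, 32) for i in range(4)])
```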
203: the model is trained by an Adam Optimizer (Adaptive motion Estimation Optimizer) gradient descent perceptual loss function, and the optimization formula is shown as follows
Figure BDA0002253439330000071
Figure BDA0002253439330000072
Wherein, g t The gradient (vector, containing the corresponding partial derivatives of the respective parameters,
Figure BDA0002253439330000073
representing the partial derivative at time t of the ith parameter), ->
Figure BDA0002253439330000074
Represents the gradient squared at time step t. The Adam optimizer adds a denominator when calculating envelope step length: the square root of the gradient squared cumulative sum. This can be accumulated for various parameters>
Figure BDA0002253439330000075
The square of the historical gradient is calculated, the accumulated denominator item is gradually larger when the gradient is updated frequently, the updating step length is relatively smaller, and the sparse gradient is ledSo that the corresponding value in the accumulated denominator term is small, the step size of the update is relatively large. Therefore, parameters in the training process are stable, and the method is beneficial to keeping the structural information of the residual error map. />
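A hypothetical training step using PyTorch's built-in Adam implementation; `net`, `train_loader`, `perceptual_loss`, and the learning rate are stand-in assumptions, not values taken from the patent:

```python
import torch

optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)  # lr is an assumption

for zero_padded, residual_gt in train_loader:
    optimizer.zero_grad()
    residual_pred = net(zero_padded)           # network reconstructs the residual map
    loss = perceptual_loss(residual_pred, residual_gt)
    loss.backward()                            # gradients g_t of the perceptual loss
    optimizer.step()                           # Adam moment updates as in the formulas above
```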
Most classical image-to-image neural networks take MSE (Mean Squared Error) as the loss function. The mean squared error measures pixel-level similarity between the reconstructed image and the target image. Let the recovered high bit depth image be $\hat{I}_{HBD}$, the true high bit depth image be $I_{HBD}$, and the width and height of the image be W and H; the mean squared error loss function is defined as:

$$L_{MSE} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left( I_{HBD}(x, y) - \hat{I}_{HBD}(x, y) \right)^2$$
however, the mean square error simply calculates the similarity of corresponding pixels between two images, and does not consider the local and global structural similarity of the images, so that the detail part in the high bit depth image recovered by the model trained by the mean square error loss function is fuzzy, and the false contour is difficult to be completely eliminated. Therefore, in order to better measure the similarity between the output and the real image and guide the training of the deep neural network to the direction of higher structural similarity, the invention adopts perceptual Loss (Perception Loss) as a Loss function. The perception loss extracts the local features and high-level semantic features of the image through a pre-trained neural network, and the structural similarity of the input image is calculated according to the similarity of the extracted features. The perceptual loss function is defined as follows:
Figure BDA0002253439330000078
wherein the content of the first and second substances,
Figure BDA0002253439330000079
convolutional layer j of i convolutional block representing pre-training networkOutput characteristic graph, W i,j And H i,j Respectively representing the width and height of the characteristic diagram output by the ith convolutional layer of the pre-training network. I is Res Represents the true residual image, is>
Figure BDA00022534393300000710
Representing the residual image generated by the convolution algorithm.
In the present invention, N c =5,
Figure BDA00022534393300000711
Using pre-trained VGG-19 [16] Layer 2 of the 1 st volume block, layer 2 of the 2 nd volume block, layer 3 of the 3 rd volume block, layer 4 of the 4 th convolution and layer 4 of the 5 th volume block of the network.
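A sketch of this perceptual loss with torchvision's pre-trained VGG-19; the mapping from the five named layers to feature indices is our assumption about the torchvision layout, and `F.mse_loss` also averages over channels, not only over W x H as in the formula:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# ReLU outputs in torchvision's vgg19().features assumed to correspond to
# conv1_2, conv2_2, conv3_3, conv4_4, conv5_4 named above.
LAYER_IDS = (3, 8, 15, 26, 35)

class PerceptualLoss(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # older torchvision API; newer versions use weights=... instead
        self.vgg = vgg19(pretrained=True).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False  # fixed feature extractor

    def forward(self, pred, target):
        # pred/target: 3-channel residual maps scaled to VGG's expected range.
        loss, x, y = 0.0, pred, target
        for idx, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if idx in LAYER_IDS:
                loss = loss + F.mse_loss(x, y)  # squared feature difference
            if idx == LAYER_IDS[-1]:
                break
        return loss
```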
Example 3
The schemes of Examples 1 and 2 are evaluated below for efficacy in combination with specific experimental data:
301: data composition
The training set consists of 1000 pictures randomly drawn from the Sintel database.
The test set consists of 50 images randomly drawn from the Sintel database outside the training set, plus all images in the UST-HK and KODAK databases.
302: evaluation criterion
The invention mainly adopts two evaluation indexes to evaluate the quality of the reconstructed high-bit image:
PSNR (Peak Signal-to-Noise Ratio) is the most widely used objective criterion for evaluating image quality. It is ten times the base-10 logarithm of the ratio of $(2^n - 1)^2$ (the square of the maximum signal value, where $n$ is the number of bits) to the mean squared error between the original image and the compared image. The larger the PSNR between 2 images, the more similar they are.
SSIM (Structural Similarity Index) [17] measures the structural similarity of two images. From the perspective of image composition, it defines structural information as attributes that reflect the structure of objects in a scene independent of brightness and contrast, and it therefore compares image distortion at three levels: brightness (mean), contrast (variance), and structure, with structure as the dominant contributor. SSIM is a number between 0 and 1; a larger SSIM indicates a smaller difference between the original image and the compared image, i.e., better image quality.
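A sketch of the two evaluation criteria, assuming grayscale integer NumPy images; SSIM is delegated to scikit-image's `structural_similarity`:

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(original, reconstructed, n_bits=16):
    """PSNR in dB between two n-bit images, per the definition above."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((2 ** n_bits - 1) ** 2 / mse)

def ssim(original, reconstructed, n_bits=16):
    """SSIM via scikit-image; data_range tells it the n-bit value range."""
    return structural_similarity(original, reconstructed,
                                 data_range=2 ** n_bits - 1)
```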
303: comparison algorithm
The present invention was compared in experiments with eight methods: 7 traditional methods and 1 deep learning method.
The 7 traditional methods are: 1) ZP (Zero Padding); 2) MIG (Multiplication by an Ideal Gain); 3) BR (Bit Replication) [18]; 4) MRC (Minimum Risk Based Classification) [19]; 5) CRR (Contour Region Reconstruction) [20]; 6) CA (Content Adaptive image bit-depth expansion) [1]; 7) ACDC (Maximum a Posteriori Estimation of AC Signal) [21].
The deep learning method is BE-CNN (Bit-depth Enhancement via Convolutional Neural Network), which reconstructs the high-bit image directly with a simple end-to-end convolutional neural network and can suppress the generation of image false contours to some extent, but it does not restore wide smooth-transition regions well.
Tables 1-3 show the evaluation results of the proposed method and the other methods for reconstructed high-bit image quality on the Sintel, UST-HK, and KODAK databases, respectively (the best results are shown in bold). The results in Table 1 are based on 50 images randomly selected from the Sintel database, those in Table 2 on all 40 images of the UST-HK database, and those in Table 3 on all 24 pictures of the KODAK database. As the 3 tables show, the evaluation results of the deep learning method BE-CNN and of the proposed method are clearly higher than those of the traditional methods ZP, MIG, BR, MRC, CRR, CA, and ACDC. Compared with BE-CNN, the proposed method achieves higher results under multiple bit-enhancement settings on the 3 databases, which objectively demonstrates its effectiveness.
TABLE 1 (PSNR/SSIM results on 50 Sintel images; table reproduced as an image in the original document)
TABLE 2 (PSNR/SSIM results on the 40 UST-HK images; table reproduced as an image in the original document)
TABLE 3 (PSNR/SSIM results on the 24 KODAK images; table reproduced as an image in the original document)
References
[1] WAN P, AU O C, TANG K, et al. From 2D Extrapolation to 1D Interpolation: Content Adaptive Image Bit-Depth Expansion[J]. 2012: 170-175.
[2] TSOGKAS S, KOKKINOS I, PAPANDREOU G, et al. Deep Learning for Semantic Part Segmentation with High-Level Guidance[J]. Computer Science, 2015: 530-538.
[3] RADMAN A, ZAINAL N, SUANDI S A. Automated segmentation of iris images acquired in an unconstrained environment using HOG-SVM and GrowCut[J]. Digital Signal Processing, 2017, 64: 60-70.
[4] TSENG C W, SU H R, LAI S H, et al. Depth image super-resolution via multi-frame registration and deep learning[C]. Proceedings of the Signal and Information Processing Association Summit and Conference, 2017.
[5] MA C, HUANG J B, YANG X, et al. Hierarchical Convolutional Features for Visual Tracking[C]. Proceedings of the IEEE International Conference on Computer Vision, 2015.
[6] WANG N, LI S, GUPTA A, et al. Transferring Rich Feature Hierarchies for Robust Visual Tracking[J]. Computer Science, 2015.
[7] JOHNSON J, ALAHI A, LI F F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution[J]. 2016: 694-711.
[8] KINGMA D P, WELLING M. Auto-Encoding Variational Bayes[C]. International Conference on Learning Representations, 2014.
[9] Xiph.Org Foundation. https://www.xiph.org/, 2016.
[10] WAN P, CHEUNG G, FLORENCIO D, ZHANG C, AU O C. Image Bit-depth Enhancement via Maximum a Posteriori Estimation of AC Signal[J]. 2016: 2896-2909.
[11] KODAK. Kodak Lossless True Color Image Suite. http://r0k.us/graphics/kodak/
[12] ZEILER M D, KRISHNAN D, TAYLOR G W, et al. Deconvolutional networks[C]. Proceedings of the Computer Vision and Pattern Recognition, 2010.
[13] KINGMA D P, BA J. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980, 2014.
[14] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[J]. 2015: 448-456.
[15] NAIR V, HINTON G E. Rectified linear units improve restricted boltzmann machines[C]. Proceedings of the International Conference on Machine Learning, 2010.
[16] SIMONYAN K, ZISSERMAN A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. Computer Science, 2014.
[17] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[18] ULICHNEY R A, CHEUNG S. Bit-depth increase by bit replication[M]. CiteSeer, 2000: 232-241.
[19] MITTAL G, JAKHETIYA V, JAISWAL S P, et al. Bit-depth expansion using Minimum Risk Based Classification[C]. Proceedings of the Visual Communications and Image Processing, 2013.
[20] CHENG C H, AU O C, LIU C H, et al. Bit-depth expansion by contour region reconstruction[C]. Proceedings of the IEEE International Symposium on Circuits and Systems, 2009.
[21] WAN P, CHEUNG G, FLORENCIO D, et al. Image bit-depth enhancement via maximum-a-posteriori estimation of graph AC component[C]. Proceedings of the IEEE International Conference on Image Processing, 2015.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-mentioned serial numbers of the embodiments of the present invention are only for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. An image bit enhancement method based on a series neural network multi-layer feature is characterized in that the bit enhancement method comprises the following steps:
constructing a training set, quantizing high-bit images of the training set into low-bit images, obtaining a residual image by pixel difference between the high-bit images and the low-bit images, and obtaining zero-filling high-bit images by zero-filling the low-bit images;
removing random variables in the VAE network, directly inputting the characteristic diagram generated by the encoder into a decoder, and establishing a deep learning network model on the basis of the characteristic diagram;
adding a plurality of serial jump connections in a network model, and transmitting each layer of feature map to all the following layers;
inputting the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network by gradient descent on a perceptual loss function with an Adam optimizer;
quantizing the high-bit images of the test set into low-bit images, inputting the zero-padding high-bit images into the network loaded with the parameters of the training model to generate residual images, and adding the residual images and the low-bit images in pixels to obtain reconstructed high-bit images;
wherein the series jump connection is specifically:
the features of the layers are concatenated in the depth direction, i.e.

$$X_{i+1} = [X_{i-1}, X_i]$$

wherein $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.
2. The method of claim 1, wherein the constructing the training set further comprises:
the training set consisted of 1000 pictures randomly selected from the Sintel database.
3. The method according to claim 1, wherein the deep learning network model specifically comprises:
the convolutional neural network takes a VAE network as a main network and consists of 8 convolutional layers and 8 transposition convolutional layers, and each layer is followed by a batch normalization layer and an activation function ReLU layer.
4. The method as claimed in claim 1, wherein the loss function is specifically as follows:

$$L_{perceptual} = \sum_{i=1}^{N_c} \frac{1}{W_{i,j} H_{i,j}} \left\| \phi_{i,j}(I_{Res}) - \phi_{i,j}(\hat{I}_{Res}) \right\|_2^2$$

wherein $\phi_{i,j}$ denotes the feature map output by the $j$th convolutional layer of the $i$th convolutional block, $W_{i,j}$ and $H_{i,j}$ denote the width and height of the feature map output by the $j$th convolutional layer of the $i$th block of the pre-trained network, $I_{Res}$ denotes the true residual image, and $\hat{I}_{Res}$ denotes the residual image reconstructed by the convolutional neural network; $N_c = 5$, and $\phi$ uses layer 2 of the 1st convolutional block, layer 2 of the 2nd convolutional block, layer 3 of the 3rd convolutional block, layer 4 of the 4th convolutional block, and layer 4 of the 5th convolutional block of the pre-trained VGG-19 network.
5. The method according to claim 1, wherein the test set is specifically as follows:
the test set consisted of Sintel randomly extracting 50 images and all images in the UST-HK, KODAK database except the training set.
CN201911043280.3A 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network Active CN110796622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911043280.3A CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911043280.3A CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Publications (2)

Publication Number Publication Date
CN110796622A CN110796622A (en) 2020-02-14
CN110796622B true CN110796622B (en) 2023-04-18

Family

ID=69441995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911043280.3A Active CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Country Status (1)

Country Link
CN (1) CN110796622B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111681192B (en) * 2020-06-09 2022-08-02 天津大学 Bit depth enhancement method for generating countermeasure network based on residual image condition
CN111681293B (en) * 2020-06-09 2022-08-23 西南交通大学 SAR image compression method based on convolutional neural network
CN113066022B (en) * 2021-03-17 2022-08-16 天津大学 Video bit enhancement method based on efficient space-time information fusion
CN114022442B (en) * 2021-11-03 2022-11-29 武汉智目智能技术合伙企业(有限合伙) Unsupervised learning-based fabric defect detection algorithm
CN114663315B (en) * 2022-03-30 2022-11-22 天津大学 Image bit enhancement method and device for generating countermeasure network based on semantic fusion
CN114708180B (en) * 2022-04-15 2023-05-30 电子科技大学 Bit depth quantization and enhancement method for predistortion image with dynamic range preservation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886768A (en) * 2017-03-02 2017-06-23 杭州当虹科技有限公司 A kind of video fingerprinting algorithms based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018053340A1 (en) * 2016-09-15 2018-03-22 Twitter, Inc. Super resolution using a generative adversarial network

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886768A (en) * 2017-03-02 2017-06-23 杭州当虹科技有限公司 A kind of video fingerprinting algorithms based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Yisong, SUN Yugeng, HU Huadong, YU Jiexiao. Implementation of the shortest path for over-limit vehicles in MAPGIS[J]. Computer Engineering and Design, 2005, 26(09): 2335-2337. *

Also Published As

Publication number Publication date
CN110796622A (en) 2020-02-14

Similar Documents

Publication Publication Date Title
CN110796622B (en) Image bit enhancement method based on multi-layer characteristics of series neural network
CN106952228B (en) Super-resolution reconstruction method of single image based on image non-local self-similarity
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN111667424B (en) Unsupervised real image denoising method
CN111091503B (en) Image defocusing and blurring method based on deep learning
CN111028163A (en) Convolution neural network-based combined image denoising and weak light enhancement method
CN112288632B (en) Single image super-resolution method and system based on simplified ESRGAN
CN112164011B (en) Motion image deblurring method based on self-adaptive residual error and recursive cross attention
CN111127325B (en) Satellite video super-resolution reconstruction method and system based on cyclic neural network
CN111861894A (en) Image motion blur removing method based on generating type countermeasure network
CN112270654A (en) Image denoising method based on multi-channel GAN
CN112669214B (en) Fuzzy image super-resolution reconstruction method based on alternating direction multiplier algorithm
CN106169174B (en) Image amplification method
CN116051428B (en) Deep learning-based combined denoising and superdivision low-illumination image enhancement method
CN115170410A (en) Image enhancement method and device integrating wavelet transformation and attention mechanism
Chen et al. Image denoising via deep network based on edge enhancement
Yang et al. A survey of super-resolution based on deep learning
CN111681192B (en) Bit depth enhancement method for generating countermeasure network based on residual image condition
CN113421186A (en) Apparatus and method for unsupervised video super-resolution using a generation countermeasure network
CN117274059A (en) Low-resolution image reconstruction method and system based on image coding-decoding
CN117011357A (en) Human body depth estimation method and system based on 3D motion flow and normal map constraint
CN116188265A (en) Space variable kernel perception blind super-division reconstruction method based on real degradation
Cai et al. Real-time super-resolution for real-world images on mobile devices
Fan et al. Single image super resolution method based on edge preservation
CN112348745B (en) Video super-resolution reconstruction method based on residual convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant