CN110796622B - Image bit enhancement method based on multi-layer characteristics of series neural network - Google Patents


Info

Publication number
CN110796622B
CN110796622B (application number CN201911043280.3A)
Authority
CN
China
Prior art keywords
bit
image
images
layer
network
Prior art date
Legal status
Active
Application number
CN201911043280.3A
Other languages
Chinese (zh)
Other versions
CN110796622A (en)
Inventor
于洁潇
张春萍
刘婧
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201911043280.3A
Publication of CN110796622A
Application granted
Publication of CN110796622B

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T5/00 Image enhancement or restoration
            • G06T5/90 Dynamic range modification of images or parts thereof
          • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/20 Special algorithmic details
              • G06T2207/20081 Training; Learning
              • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image bit enhancement method based on the multi-layer features of a series neural network. The method comprises the following steps: constructing a training set, quantizing the high-bit images of the training set into low-bit images, taking the pixel difference between the high-bit and low-bit images to obtain residual images, and zero-padding the low-bit images to obtain zero-padded high-bit images; removing the random variables from the VAE network, feeding the feature map generated by the encoder directly into the decoder, and building a deep learning network model on this basis; adding multiple serial skip connections to the network model so that each layer's feature map is passed to all subsequent layers; feeding the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network with an Adam optimizer; and quantizing the high-bit images of the test set into low-bit images, feeding the zero-padded high-bit images into the network loaded with the trained model parameters to generate residual images, and adding the residual images and the low-bit images pixel-wise to obtain the reconstructed high-bit images.

Description

Image bit enhancement method based on multi-layer characteristics of series neural network
Technical Field
The invention relates to the field of deep neural networks, in particular to an image bit enhancement method based on multi-layer characteristics of a series neural network.
Background
With the development and spread of the visual information industry, people's requirements on the visual quality provided by displays keep rising. High-definition and HDR (High Dynamic Range) displays can greatly expand the displayed brightness range, show more bright and dark details, and bring richer colors and more vivid, natural detail to pictures, making them closer to what the human eye perceives. High-definition and HDR displays are therefore becoming mainstream devices in the market.
However, due to the limitations of current capture devices, each color channel of each pixel in most images and videos is stored with 8 bits, so each color channel can show at most 256 levels. Some webcams even use only 5, 6, and 5 bits to represent the red, green, and blue channels, respectively. In addition, high-bit images are often reduced to low bit depth when images and videos are compressed at high ratios.
When a low-bit image is simply converted and displayed on a high-bit display, obvious false-contour artifacts appear, and color distortion occurs in regions of high brightness [1]. Research on image bit-depth enhancement is therefore of great value.
Image Bit-depth Enhancement is a technique for improving image quality by overcoming the inherent limitations of imaging hardware such as image sensors, i.e., reconstructing a high-bit image from a low-bit image by means of an algorithm. Bit enhancement algorithms fall into 3 research directions: methods based on simple calculation, on interpolation, and on deep learning. Methods based on simple calculation can effectively raise the bit depth of an image but do not solve the false-contour problem well. Interpolation-based algorithms aim to reconstruct the bit information lost by the degraded image and can largely eliminate false contours, but they generally blur image details and light-colored contours and cannot reconstruct images with complex structures.
In recent years, convolutional neural networks have become a research hotspot in computer vision by virtue of their strong feature-learning and modeling capability. In tasks such as semantic segmentation [2,3], image super-resolution [4], object recognition and tracking [5,6], and style transfer [7], they obtain better results than traditional algorithms. Image bit enhancement based on a simple convolutional neural network has shown that a deep learning network can learn more features and can effectively blur false contours and reconstruct a high-bit image; however, in larger color-transition regions the false contours cannot be completely eliminated, leading to low visual quality.
Disclosure of Invention
The invention provides an image bit enhancement method based on the multi-layer features of a series neural network, which indirectly restores the high bit depth image by restoring a residual image structurally similar to the low bit depth image and adding the reconstructed residual image to the low bit depth image. In addition, the features of each layer are concatenated and fed into the subsequent convolutional layers, so that bottom-layer features are passed directly into the higher convolutional layers and a high-quality high-bit image is generated quickly and accurately, as detailed in the following description:
An image bit enhancement method based on the multi-layer features of a series neural network, the bit enhancement method comprising:
constructing a training set, quantizing high-bit images of the training set into low-bit images, obtaining a residual image by pixel difference between the high-bit images and the low-bit images, and obtaining zero-filling high-bit images by zero-filling the low-bit images;
removing random variables in the VAE network, directly inputting the characteristic diagram generated by the encoder into a decoder, and establishing a deep learning network model on the basis of the characteristic diagram;
adding a plurality of series jump connections in a network model, and transmitting each layer characteristic diagram to all the following layers;
inputting the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network by gradient descent on a perceptual loss function with an Adam optimizer;
and quantizing the high-bit images of the test set into low-bit images, inputting the zero-padding high-bit images into the network loaded with the parameters of the training model to generate residual images, and adding the residual images and the low-bit images in pixels to obtain reconstructed high-bit images.
The constructing of the training set further comprises:
the training set consisted of 1000 pictures randomly selected from the Sintel database.
The deep learning network model is specifically as follows:
The convolutional neural network takes a VAE network as the backbone and consists of 8 convolutional layers and 8 transposed convolutional layers, each layer followed by a batch normalization layer and an activation-function ReLU layer.
Wherein the series multi-layer feature is specifically:
the features of the layers are concatenated in the depth direction, i.e.

$$X_{i+1} = [X_{i-1}, X_i]$$

wherein $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.
The loss function is specifically as follows:

$$L_{perceptual} = \sum_{i=1}^{N_c} \frac{1}{W_{i,j} H_{i,j}} \left\| \phi_{i,j}(I_{Res}) - \phi_{i,j}(\hat{I}_{Res}) \right\|_2^2$$

wherein $\phi_{i,j}$ denotes the feature map output by the $j$th convolutional layer of the $i$th convolutional block of the pre-trained network, $W_{i,j}$ and $H_{i,j}$ denote the width and height of that feature map, $I_{Res}$ denotes the true residual image, and $\hat{I}_{Res}$ denotes the residual image reconstructed by the convolutional neural network; $N_c = 5$, and $\phi$ uses layer 2 of the 1st convolutional block, layer 2 of the 2nd convolutional block, layer 3 of the 3rd convolutional block, layer 4 of the 4th convolutional block, and layer 4 of the 5th convolutional block of the pre-trained VGG-19 network.
Wherein, the test set is specifically as follows:
The test set consists of 50 images randomly drawn from the Sintel database outside the training set and all images in the UST-HK and KODAK databases.
The technical scheme provided by the invention has the beneficial effects that:
1. The invention uses a VAE (Variational Auto-Encoder) network [8] as the backbone to generate the residual image between the high-bit image and the low-bit image, preserving more structural information while reducing network computation.
2. The invention transmits the characteristic graph of each layer to all the following layers through the serial jump connection, the serial jump connection can stabilize the image gradient and directly provide independent detail characteristics of the bottom layer for a high-level neural network, and the obtained high-bit image has higher visual quality and better objective evaluation result.
Drawings
FIG. 1 is a block diagram of an image bit enhancement method based on a series neural network multi-layer feature;
FIG. 2 is a convolutional layer of a convolutional neural network;
FIG. 3 is a transposed convolutional layer of a convolutional neural network.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
The embodiment of the invention provides a multi-feature-fusion convolutional neural network based on a variational auto-encoder for image bit enhancement; the network model is optimized by gradient descent on a perceptual loss function. The method comprises the following steps:
101: sintel for high bit lossless picture quality [9] 、UST-HK [10] 、KODAK [11] Preprocessing the images in the database, firstly quantizing the high-bit images into low-bit images, and then performing pixel difference between the high-bit images and the low-bit images to obtain a residual image.
The Sintel database comes from an animated short film with lossless image quality, while the UST-HK and KODAK databases are real photographs. 1000 pictures randomly selected from the Sintel database form the training set; all pictures in the UST-HK and KODAK databases, plus 50 Sintel pictures outside the training set, form the test set.
102: the present invention takes an improved VAE network as a backbone network. The improved VAE network consists of a Convolutional Layer and a Transposed Convolutional Layer (Transposed Convolutional Layer) [12] Two parts are formed. On the premise of ensuring the maximum information transmission between layers in the network, all the layers are directly connected in series. In order to ensure the feedforward characteristic, each layer splices the input of all the previous layers, and transmits the output characteristic diagram to all the subsequent layers, thereby ensuring the structural integrity of the generated residual diagram to the maximum extent.
103: in the training stage, the zero-padding high-bit image obtained by inverse quantization of the low-bit image of the training set is used as network input, andtaking gradient descent perception Loss (Perceptial Loss) between a residual image generated by a network and a real residual image as a Loss function, and passing through an Adam optimizer [13] The gradient descent loss function trains the network model parameters.
104: in the testing stage, the low-bit images of the testing set are inversely quantized to obtain zero-padding high-bit images, the high-bit images are used for generating a residual error map through an improved VAE network loaded with training model parameters, and the low-bit images and the generated residual error map are added according to pixels to obtain the high-bit images. The validity of the method is verified by calculating the similarity between the generated high-bit image and the true high-bit image using the relevant objective evaluation criterion.
In summary, through steps 101 to 104 the embodiment of the present invention designs a bit depth enhancement network based on multi-feature concatenation within a VAE network. The low-bit image is inversely quantized into a zero-padded high-bit image, which is fed to the network as the input image to generate the residual image between the high-bit and low-bit images; adding the residual image to the low-bit image yields a higher-quality high-bit image. The invention designs the network from the angle of optimizing the residual image, adds serial skip connections to strengthen the network's learning ability, and trains the network parameters by gradient descent on a perceptual loss function, so that the reconstructed high-bit image has high subjective visual quality.
Example 2
The scheme of Example 1 is described in further detail below:
201: because the Sintel database formed by the animation images is completely generated by computer software, and the images have no noise interference, the images in the Sintel database have smoother color gradient structures, and the edges and textures in the images are clearer. Such near-ideal structural features can help the neural network learn the features of smooth regions and edge structures, help the model to reconstruct color gradient structures in the image and keep the contours relatively sharp, so the deep neural network proposed herein is trained with Sintel animated images. The UST-HK, KODAK database and part of Sintel consisting of the real shot pictures are used as a test set to verify the effect of the invention.
Considering the image structure features at the same position in the low bit depth image and the corresponding high bit depth image, the key task of image bit depth enhancement is to remove the false-contour structures and restore smooth color transitions while preserving detail texture. When a neural network reconstructs the high bit depth image directly, its output pixel values span a very large range, and reconstructing the color-gradient trend requires handling the large false-contour structures around it, so direct reconstruction over such a large pixel range performs poorly. Bit enhancement can also be achieved by reconstructing, from the low bit depth image, the residual image between the high bit depth image and its linearly amplified version, and indirectly restoring the high bit depth image by adding the residual image to the linearly amplified image; this indirect restoration is easier. Here the linear amplification can be implemented by a Zero Padding algorithm, so the linearly amplified high bit depth image is represented by a zero-padded image. When a 16-bit image is reconstructed directly, the network must output pixel values in the range 0 to 65,535; when the high bit depth image is restored indirectly by reconstructing the residual image, the network outputs values in the range 0 to 4,095 (for example, enhancing a 4-bit image to 16 bits loses 12 bits, bounding the residual by 2^12 - 1 = 4,095), so reconstructing the residual image is easier for the neural network.
In addition, the structural features of false contours and real edges differ greatly in the residual image. Real edges and textures lose little information during quantization, so false contours appear faint in the residual image with low pixel values, whereas real edges retain sharp structures that differ greatly from false contours. This structural difference helps the neural network learn to distinguish false contours from real edges and reconstruct the corresponding structures as the task expects.
Based on the above analysis, the invention proposes to indirectly restore the high bit depth image by reconstructing the residual image between the high bit depth image and the zero-padded image, and adding the restored residual image to the zero-padded image pixel-wise. This accomplishes the same task as restoring the high bit depth image end to end, but is simpler than direct restoration and reconstructs the high bit depth image better. The preprocessing of the database images therefore comprises two steps, as sketched below: 1) converting the low-bit images into zero-padded high-bit images by the zero-padding algorithm; 2) subtracting the zero-padded high-bit images from the high-bit database images pixel-wise to obtain the residual images. The zero-padded high-bit image serves as the input to the improved VAE network, and the residual image is compared against the residual image generated by the network to compute the loss that trains the model.
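A minimal sketch of this preprocessing, assuming integer NumPy images and illustrative bit depths (16-bit originals quantized to 4 bits; the function name is ours):

```python
import numpy as np

def make_training_triple(high_bit_img, high_bits=16, low_bits=4):
    """Preprocessing for one database image: quantize to a low-bit image,
    zero-pad it back to the high bit depth, and take the pixel-wise
    difference as the residual target."""
    shift = high_bits - low_bits
    low_bit = high_bit_img >> shift        # 1) quantization: drop the low bits
    zero_padded = low_bit << shift         # 2) zero-padding inverse quantization
    residual = high_bit_img - zero_padded  # 3) residual in [0, 2**shift - 1]
    return zero_padded, residual
```

For these assumed depths the residual target falls in 0 to 4,095, matching the range discussed above.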
202: in a conventional VAE network, the potential distribution of encoders is randomly sampled and input to a decoder, thereby generating an image that is completely different from the input image. This runs contrary to the task of reconstructing a high bit image from a low bit image, which is mainly used to generate a completely new image different from the input image, and the image bit enhancement algorithm requires the generation of a high bit image from a low bit image with similar structural information and smooth color transition regions. Therefore, the invention adopts an improved VAE network as a base network, and the characteristic diagram generated by the encoder is directly input to the decoder to recover the high-bit image. This ensures that the reconstructed high bit image and low bit image have similar structural and content characteristics.
As shown in FIG. 1, the VAE convolutional network adopts a convolutional-layer and transposed-convolutional-layer structure. The convolutional layers extract the local structural features of the input image, retaining the main image content; the transposed convolutional layers take those local structural features as input to compensate for detail information. In addition, multiple skip connections are added between the two parts to splice features of different semantic levels, which helps the model generate a residual map with complete structural information.
Transposed convolution, also called deconvolution, is often used in CNNs to upsample feature maps; it can restore the size and structural information of an image and thus yields a higher-quality residual map. Let F be the image or feature map input to the transposed convolution, with M channels f_1, f_2, ..., f_M. Each channel f_m is the linear sum of N latent feature maps k_n convolved with convolution kernels g_{m,n}. Formulated as:

$$f_m = \sum_{n=1}^{N} g_{m,n} * k_n$$

where $*$ denotes the two-dimensional convolution operation.
Conventional VAE networks use strided convolution and deconvolution layers to downsample and upsample images, respectively. The sampling in strided convolution loses image detail, reducing the visual quality of the reconstructed image. Therefore, strided convolution is not used in the deep network of the present invention, so as to preserve as many structural features as possible. For the same reason, the widely used pooling and unpooling layers are not employed in the network.
In view of computational complexity, the network of the present invention consists of 8 convolutional layers and 8 transposed convolutional layers. As shown in FIGS. 2 and 3, each layer is followed by a BN layer (Batch Normalization) [14] and an activation-function ReLU layer [15] (Rectified Linear Unit). The BN layer greatly reduces computation cost and run time, and the ReLU layer alleviates the problems of gradient vanishing and overfitting; adding these two layers markedly improves the visual quality of the generated residual map. The 3x3 convolution kernel adopted in the invention exploits more of the feature map's local structure than a 1x1 kernel, while compared with a 9x9 kernel it both favors the generation of sharp edges and reduces model run time.
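A sketch of one such layer pair in PyTorch, under the constraints stated above (3x3 kernels, stride 1, no pooling, BN + ReLU after every layer); the channel arguments and helper names are our assumptions:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Encoder layer: 3x3 convolution (stride 1; no strided convolution
    or pooling, per the text), then batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def deconv_block(in_ch, out_ch):
    """Decoder layer: 3x3 transposed convolution with the same BN + ReLU."""
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```

With stride 1 and padding 1, both blocks preserve spatial size, consistent with the decision to avoid strided sampling.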
Furthermore, increasing network depth means that input and gradient information pass through many layers, which easily causes the gradient-vanishing problem and prevents the model from achieving satisfactory results. To solve this, multiple skip connections are added to the network: each layer concatenates the inputs of all preceding layers and passes its output feature map to all subsequent layers, so each layer can directly use the gradient of the loss function and the initial input information. The features of the layers are concatenated along the depth direction, formulated as:

$$X_{i+1} = [X_{i-1}, X_i]$$

where $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.

This connection pattern makes feature and gradient propagation more effective: the network suffers less from gradient vanishing and is easier to train. In addition, because each layer's output feature map is the input to all subsequent layers, shallow features from the convolutional layers can be fed directly into the deconvolutional layers; the network need not relearn redundant feature maps, has fewer parameters than a conventional convolutional neural network, and obtains the residual map faster.
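A minimal sketch of this serial concatenation pattern, reusing the `conv_block` helper from the previous sketch; the number of layers and channel counts are illustrative, not taken from the patent:

```python
import torch
import torch.nn as nn

class SeriesSkipStack(nn.Module):
    """Serial skip connections: every layer receives the depth-wise
    concatenation of the initial input and all earlier outputs, and its
    own output is forwarded to all later layers."""

    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        features = [x]
        for block in self.blocks:
            out = block(torch.cat(features, dim=1))  # concatenate along depth
            features.append(out)                     # pass to all later layers
        return torch.cat(features, dim=1)

# Example: four conv layers whose input channels grow as features accumulate.
stack = SeriesSkipStack([conv_block(1 + 32 * i, 32) for i in range(4)])
```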
203: the model is trained by an Adam Optimizer (Adaptive motion Estimation Optimizer) gradient descent perceptual loss function, and the optimization formula is shown as follows
Figure BDA0002253439330000071
Figure BDA0002253439330000072
Wherein, g t The gradient (vector, containing the corresponding partial derivatives of the respective parameters,
Figure BDA0002253439330000073
representing the partial derivative at time t of the ith parameter), ->
Figure BDA0002253439330000074
Represents the gradient squared at time step t. The Adam optimizer adds a denominator when calculating envelope step length: the square root of the gradient squared cumulative sum. This can be accumulated for various parameters>
Figure BDA0002253439330000075
The square of the historical gradient is calculated, the accumulated denominator item is gradually larger when the gradient is updated frequently, the updating step length is relatively smaller, and the sparse gradient is ledSo that the corresponding value in the accumulated denominator term is small, the step size of the update is relatively large. Therefore, parameters in the training process are stable, and the method is beneficial to keeping the structural information of the residual error map. />
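A hypothetical training step using PyTorch's built-in Adam implementation; `net`, `train_loader`, `perceptual_loss`, and the learning rate are stand-in assumptions, not values taken from the patent:

```python
import torch

optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)  # lr is an assumption

for zero_padded, residual_gt in train_loader:
    optimizer.zero_grad()
    residual_pred = net(zero_padded)           # network reconstructs the residual map
    loss = perceptual_loss(residual_pred, residual_gt)
    loss.backward()                            # gradients g_t of the perceptual loss
    optimizer.step()                           # Adam moment updates as in the formulas above
```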
Most classical image-to-image neural networks take MSE (Mean Squared Error) as the loss function. The mean squared error measures pixel-level similarity between the reconstructed image and the target image. Let the recovered high bit depth image be $\hat{I}_{HBD}$, the true high bit depth image be $I_{HBD}$, and the width and height of the image be W and H; the mean squared error loss function is defined as:

$$L_{MSE} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left( I_{HBD}(x, y) - \hat{I}_{HBD}(x, y) \right)^2$$
however, the mean square error simply calculates the similarity of corresponding pixels between two images, and does not consider the local and global structural similarity of the images, so that the detail part in the high bit depth image recovered by the model trained by the mean square error loss function is fuzzy, and the false contour is difficult to be completely eliminated. Therefore, in order to better measure the similarity between the output and the real image and guide the training of the deep neural network to the direction of higher structural similarity, the invention adopts perceptual Loss (Perception Loss) as a Loss function. The perception loss extracts the local features and high-level semantic features of the image through a pre-trained neural network, and the structural similarity of the input image is calculated according to the similarity of the extracted features. The perceptual loss function is defined as follows:
Figure BDA0002253439330000078
wherein the content of the first and second substances,
Figure BDA0002253439330000079
convolutional layer j of i convolutional block representing pre-training networkOutput characteristic graph, W i,j And H i,j Respectively representing the width and height of the characteristic diagram output by the ith convolutional layer of the pre-training network. I is Res Represents the true residual image, is>
Figure BDA00022534393300000710
Representing the residual image generated by the convolution algorithm.
In the present invention, N c =5,
Figure BDA00022534393300000711
Using pre-trained VGG-19 [16] Layer 2 of the 1 st volume block, layer 2 of the 2 nd volume block, layer 3 of the 3 rd volume block, layer 4 of the 4 th convolution and layer 4 of the 5 th volume block of the network.
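A sketch of this perceptual loss with torchvision's pre-trained VGG-19; the mapping from the five named layers to feature indices is our assumption about the torchvision layout, and `F.mse_loss` also averages over channels, not only over W x H as in the formula:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# ReLU outputs in torchvision's vgg19().features assumed to correspond to
# conv1_2, conv2_2, conv3_3, conv4_4, conv5_4 named above.
LAYER_IDS = (3, 8, 15, 26, 35)

class PerceptualLoss(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # older torchvision API; newer versions use weights=... instead
        self.vgg = vgg19(pretrained=True).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False  # fixed feature extractor

    def forward(self, pred, target):
        # pred/target: 3-channel residual maps scaled to VGG's expected range.
        loss, x, y = 0.0, pred, target
        for idx, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if idx in LAYER_IDS:
                loss = loss + F.mse_loss(x, y)  # squared feature difference
            if idx == LAYER_IDS[-1]:
                break
        return loss
```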
Example 3
The schemes of Examples 1 and 2 are evaluated below for efficacy in combination with specific experimental data:
301: data composition
The training set consists of 1000 pictures randomly drawn from the Sintel database.
The test set consists of 50 images randomly drawn from the Sintel database outside the training set, plus all images in the UST-HK and KODAK databases.
302: evaluation criterion
The invention mainly adopts two evaluation indexes to evaluate the quality of the reconstructed high-bit image:
PSNR (Peak Signal-to-Noise Ratio) is the most widely used objective criterion for evaluating image quality. It is ten times the base-10 logarithm of the ratio of $(2^n - 1)^2$ (the square of the maximum signal value, where $n$ is the number of bits) to the mean squared error between the original image and the compared image. The larger the PSNR between 2 images, the more similar they are.
SSIM (Structural Similarity Index) [17] measures the structural similarity of two images. From the perspective of image composition, it defines structural information as attributes that reflect the structure of objects in a scene independent of brightness and contrast, and it therefore compares image distortion at three levels: brightness (mean), contrast (variance), and structure, with structure as the dominant contributor. SSIM is a number between 0 and 1; a larger SSIM indicates a smaller difference between the original image and the compared image, i.e., better image quality.
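A sketch of the two evaluation criteria, assuming grayscale integer NumPy images; SSIM is delegated to scikit-image's `structural_similarity`:

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(original, reconstructed, n_bits=16):
    """PSNR in dB between two n-bit images, per the definition above."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((2 ** n_bits - 1) ** 2 / mse)

def ssim(original, reconstructed, n_bits=16):
    """SSIM via scikit-image; data_range tells it the n-bit value range."""
    return structural_similarity(original, reconstructed,
                                 data_range=2 ** n_bits - 1)
```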
303: comparison algorithm
The present invention was compared in experiments with eight methods: 7 traditional methods and 1 deep learning method.
The 7 traditional methods are: 1) ZP (Zero Padding); 2) MIG (Multiplication by an Ideal Gain); 3) BR (Bit Replication) [18]; 4) MRC (Minimum Risk Based Classification) [19]; 5) CRR (Contour Region Reconstruction) [20]; 6) CA (Content Adaptive image bit-depth expansion) [1]; 7) ACDC (Maximum a Posteriori Estimation of AC Signal) [21].
The deep learning method is BE-CNN (Bit-depth Enhancement via Convolutional Neural Network), which reconstructs the high-bit image directly with a simple end-to-end convolutional neural network and can suppress the generation of image false contours to some extent, but it does not restore wide smooth-transition regions well.
Tables 1-3 show the evaluation results of the proposed method and the other methods for reconstructed high-bit image quality on the Sintel, UST-HK, and KODAK databases, respectively (the best results are shown in bold). The results in Table 1 are based on 50 images randomly selected from the Sintel database, those in Table 2 on all 40 images of the UST-HK database, and those in Table 3 on all 24 pictures of the KODAK database. As the 3 tables show, the evaluation results of the deep learning method BE-CNN and of the proposed method are clearly higher than those of the traditional methods ZP, MIG, BR, MRC, CRR, CA, and ACDC. Compared with BE-CNN, the proposed method achieves higher results under multiple bit-enhancement settings on the 3 databases, which objectively demonstrates its effectiveness.
TABLE 1 (PSNR/SSIM results on 50 Sintel images; table reproduced as an image in the original document)
TABLE 2 (PSNR/SSIM results on the 40 UST-HK images; table reproduced as an image in the original document)
TABLE 3 (PSNR/SSIM results on the 24 KODAK images; table reproduced as an image in the original document)
References
[1] WAN P, AU O C, TANG K, et al. From 2D Extrapolation to 1D Interpolation: Content Adaptive Image Bit-Depth Expansion[J]. 2012: 170-175.
[2] TSOGKAS S, KOKKINOS I, PAPANDREOU G, et al. Deep Learning for Semantic Part Segmentation with High-Level Guidance[J]. Computer Science, 2015: 530-538.
[3] RADMAN A, ZAINAL N, SUANDI S A. Automated segmentation of iris images acquired in an unconstrained environment using HOG-SVM and GrowCut[J]. Digital Signal Processing, 2017, 64: 60-70.
[4] TSENG C W, SU H R, LAI S H, et al. Depth image super-resolution via multi-frame registration and deep learning[C]. Proceedings of the Signal and Information Processing Association Summit and Conference, 2017.
[5] MA C, HUANG J B, YANG X, et al. Hierarchical Convolutional Features for Visual Tracking[C]. Proceedings of the IEEE International Conference on Computer Vision, 2015.
[6] WANG N, LI S, GUPTA A, et al. Transferring Rich Feature Hierarchies for Robust Visual Tracking[J]. Computer Science, 2015.
[7] JOHNSON J, ALAHI A, LI F F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution[J]. 2016: 694-711.
[8] KINGMA D P, WELLING M. Auto-Encoding Variational Bayes[C]. International Conference on Learning Representations, 2014.
[9] Xiph.Org Foundation. https://www.xiph.org/, 2016.
[10] WAN P, CHEUNG G, FLORENCIO D, ZHANG C, AU O C. Image Bit-depth Enhancement via Maximum a Posteriori Estimation of AC Signal[J]. 2016: 2896-2909.
[11] KODAK. Kodak Lossless True Color Image Suite. http://r0k.us/graphics/kodak/
[12] ZEILER M D, KRISHNAN D, TAYLOR G W, et al. Deconvolutional networks[C]. Proceedings of the Computer Vision and Pattern Recognition, 2010.
[13] KINGMA D P, BA J. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980, 2014.
[14] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[J]. 2015: 448-456.
[15] NAIR V, HINTON G E. Rectified linear units improve restricted boltzmann machines[C]. Proceedings of the International Conference on Machine Learning, 2010.
[16] SIMONYAN K, ZISSERMAN A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. Computer Science, 2014.
[17] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[18] ULICHNEY R A, CHEUNG S. Bit-depth increase by bit replication[M]. CiteSeer, 2000: 232-241.
[19] MITTAL G, JAKHETIYA V, JAISWAL S P, et al. Bit-depth expansion using Minimum Risk Based Classification[C]. Proceedings of the Visual Communications and Image Processing, 2013.
[20] CHENG C H, AU O C, LIU C H, et al. Bit-depth expansion by contour region reconstruction[C]. Proceedings of the IEEE International Symposium on Circuits and Systems, 2009.
[21] WAN P, CHEUNG G, FLORENCIO D, et al. Image bit-depth enhancement via maximum-a-posteriori estimation of graph AC component[C]. Proceedings of the IEEE International Conference on Image Processing, 2015.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-mentioned serial numbers of the embodiments of the present invention are only for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. An image bit enhancement method based on a series neural network multi-layer feature is characterized in that the bit enhancement method comprises the following steps:
constructing a training set, quantizing high-bit images of the training set into low-bit images, obtaining a residual image by pixel difference between the high-bit images and the low-bit images, and obtaining zero-filling high-bit images by zero-filling the low-bit images;
removing random variables in the VAE network, directly inputting the characteristic diagram generated by the encoder into a decoder, and establishing a deep learning network model on the basis of the characteristic diagram;
adding a plurality of serial jump connections in a network model, and transmitting each layer of feature map to all the following layers;
inputting the zero-padded high-bit images into the deep learning network model to generate residual images, and training the network by gradient descent on a perceptual loss function with an Adam optimizer;
quantizing the high-bit images of the test set into low-bit images, inputting the zero-padding high-bit images into the network loaded with the parameters of the training model to generate residual images, and adding the residual images and the low-bit images in pixels to obtain reconstructed high-bit images;
wherein the series jump connection is specifically:
the features of the layers are concatenated in the depth direction, i.e.

$$X_{i+1} = [X_{i-1}, X_i]$$

wherein $X_{i-1}$ is the output feature map of the previous layer, $X_i$ is the output feature map of the current layer, and $X_{i+1}$ is the input feature map of the next layer.
2. The method of claim 1, wherein the constructing the training set further comprises:
the training set consisted of 1000 pictures randomly selected from the Sintel database.
3. The method according to claim 1, wherein the deep learning network model specifically comprises:
the convolutional neural network takes a VAE network as a main network and consists of 8 convolutional layers and 8 transposition convolutional layers, and each layer is followed by a batch normalization layer and an activation function ReLU layer.
4. The method as claimed in claim 1, wherein the loss function is specifically as follows:

$$L_{perceptual} = \sum_{i=1}^{N_c} \frac{1}{W_{i,j} H_{i,j}} \left\| \phi_{i,j}(I_{Res}) - \phi_{i,j}(\hat{I}_{Res}) \right\|_2^2$$

wherein $\phi_{i,j}$ denotes the feature map output by the $j$th convolutional layer of the $i$th convolutional block, $W_{i,j}$ and $H_{i,j}$ denote the width and height of the feature map output by the $j$th convolutional layer of the $i$th block of the pre-trained network, $I_{Res}$ denotes the true residual image, and $\hat{I}_{Res}$ denotes the residual image reconstructed by the convolutional neural network; $N_c = 5$, and $\phi$ uses layer 2 of the 1st convolutional block, layer 2 of the 2nd convolutional block, layer 3 of the 3rd convolutional block, layer 4 of the 4th convolutional block, and layer 4 of the 5th convolutional block of the pre-trained VGG-19 network.
5. The method according to claim 1, wherein the test set is specifically as follows:
the test set consisted of Sintel randomly extracting 50 images and all images in the UST-HK, KODAK database except the training set.
CN201911043280.3A 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network Active CN110796622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911043280.3A CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911043280.3A CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Publications (2)

Publication Number Publication Date
CN110796622A CN110796622A (en) 2020-02-14
CN110796622B true CN110796622B (en) 2023-04-18

Family

ID=69441995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911043280.3A Active CN110796622B (en) 2019-10-30 2019-10-30 Image bit enhancement method based on multi-layer characteristics of series neural network

Country Status (1)

Country Link
CN (1) CN110796622B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111681192B (en) * 2020-06-09 2022-08-02 天津大学 Bit depth enhancement method for generating countermeasure network based on residual image condition
CN111681293B (en) * 2020-06-09 2022-08-23 西南交通大学 SAR image compression method based on convolutional neural network
CN113066022B (en) * 2021-03-17 2022-08-16 天津大学 Video bit enhancement method based on efficient space-time information fusion
CN114022442B (en) * 2021-11-03 2022-11-29 武汉智目智能技术合伙企业(有限合伙) Unsupervised learning-based fabric defect detection algorithm
CN114663315B (en) * 2022-03-30 2022-11-22 天津大学 Image bit enhancement method and device for generating countermeasure network based on semantic fusion
CN114708180B (en) * 2022-04-15 2023-05-30 电子科技大学 Bit depth quantization and enhancement method for predistortion image with dynamic range preservation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886768A (en) * 2017-03-02 2017-06-23 杭州当虹科技有限公司 A kind of video fingerprinting algorithms based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018053340A1 (en) * 2016-09-15 2018-03-22 Twitter, Inc. Super resolution using a generative adversarial network

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886768A (en) * 2017-03-02 2017-06-23 杭州当虹科技有限公司 A kind of video fingerprinting algorithms based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Yisong, SUN Yugeng, HU Huadong, YU Jiexiao. Implementation of the shortest path for over-limit vehicles in MAPGIS[J]. Computer Engineering and Design, 2005, 26(09): 2335-2337. *

Also Published As

Publication number Publication date
CN110796622A (en) 2020-02-14

Similar Documents

Publication Publication Date Title
CN110796622B (en) Image bit enhancement method based on multi-layer characteristics of series neural network
CN106952228B (en) Super-resolution reconstruction method of single image based on image non-local self-similarity
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN111667424B (en) Unsupervised real image denoising method
CN111091503B (en) Image defocusing and blurring method based on deep learning
CN111028163A (en) Convolution neural network-based combined image denoising and weak light enhancement method
CN112288632B (en) Single image super-resolution method and system based on simplified ESRGAN
CN112164011B (en) Motion image deblurring method based on self-adaptive residual error and recursive cross attention
CN111127325B (en) Satellite video super-resolution reconstruction method and system based on cyclic neural network
CN111861894A (en) Image motion blur removing method based on generating type countermeasure network
CN112270654A (en) Image denoising method based on multi-channel GAN
CN112669214B (en) Fuzzy image super-resolution reconstruction method based on alternating direction multiplier algorithm
CN106169174B (en) Image amplification method
CN116051428B (en) Deep learning-based combined denoising and superdivision low-illumination image enhancement method
CN115170410A (en) Image enhancement method and device integrating wavelet transformation and attention mechanism
Chen et al. Image denoising via deep network based on edge enhancement
Yang et al. A survey of super-resolution based on deep learning
CN111681192B (en) Bit depth enhancement method for generating countermeasure network based on residual image condition
CN113421186A (en) Apparatus and method for unsupervised video super-resolution using a generation countermeasure network
CN117274059A (en) Low-resolution image reconstruction method and system based on image coding-decoding
CN117011357A (en) Human body depth estimation method and system based on 3D motion flow and normal map constraint
CN116188265A (en) Space variable kernel perception blind super-division reconstruction method based on real degradation
Cai et al. Real-time super-resolution for real-world images on mobile devices
Fan et al. Single image super resolution method based on edge preservation
CN112348745B (en) Video super-resolution reconstruction method based on residual convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant