CN110458802A - Stereo image quality evaluation method based on projection weight normalization - Google Patents

Stereo image quality evaluation method based on projection weight normalization

Info

Publication number
CN110458802A
Authority
CN
China
Prior art keywords
image
normalization
convolutional neural network
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910580586.6A
Other languages
Chinese (zh)
Inventor
李素梅
王明毅
赵平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201910580586.6A priority Critical patent/CN110458802A/en
Publication of CN110458802A publication Critical patent/CN110458802A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30168 - Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of image processing and proposes a new image quality evaluation method that remains consistent with subjective human evaluation and solves the ill-conditioning problem in the network training process. It provides a research direction for deep-learning-based stereo image quality evaluation and, to a certain extent, promotes the development of stereoscopic imaging technology. To this end, the projection-weight-normalization-based stereo image quality evaluation method of the invention fuses the left and right viewpoint images of a stereo image to obtain a single fused image, which is then preprocessed by cutting into blocks and normalization. A deep convolutional neural network model is built; the preprocessed image blocks are used as the input of the network, the network structure is optimized with projection weight normalization and data batch normalization, and the quality evaluation result of the stereo image is obtained from the network output. The invention is mainly applied to image processing occasions.

Description

Stereo image quality evaluation method based on projection weight normalization
Technical Field
The invention belongs to the field of image processing, and relates to application and optimization of image fusion and deep learning in stereo image quality evaluation.
Background
Stereo imaging technology can bring a better visual experience to people. Research shows that quality degradation arises throughout the chain from acquisition to display of a stereo image [1-2], and the degraded image may affect the perception of the stereo content. How to evaluate the quality of stereo images reasonably and efficiently has therefore become one of the research hotspots in the field of stereo information. Stereo image quality evaluation methods mainly comprise subjective evaluation and objective evaluation. Subjective evaluation experiments are time-consuming, labor-intensive and costly, whereas objective evaluation is far more practical. Establishing a reasonable and efficient objective evaluation mechanism for stereo image quality is therefore of great practical significance.
To date, researchers have proposed various methods for evaluating the quality of stereo images, which can be roughly classified into conventional methods and methods based on artificial neural networks. Most conventional methods extract features from the left and right views separately and then weight the quality scores of the two views to obtain the final objective value [3-7]. However, the features extracted by conventional methods do not necessarily reflect the essential characteristics of the image. To better simulate the feature extraction mechanism of the human eye, researchers have applied artificial neural networks to stereo image quality assessment; for example, [8-10] apply shallow neural networks to objective quality evaluation of stereo images, but such networks have few layers and simple structures and cannot accurately simulate the hierarchical information processing of the human visual system. Compared with shallow neural networks, deep learning can simulate the way the human brain processes information and extract features layer by layer through a deep network. The Convolutional Neural Network (CNN) is a classic deep learning network applicable to computer vision, natural language processing and other fields. Zhang Wei et al. applied a convolutional neural network to stereo image quality evaluation, performing feature extraction with 2 convolutional layers and 2 pooling layers and introducing a Multi-Layer Perceptron (MLP) at the end of the network to map the learned features to quality scores [11]; Chen Hui et al. adopted a convolutional neural network model with 12 convolutional layers [12]; Ding et al. achieved high consistency between objective scores and subjective human scores with a convolutional neural network model of 5 convolutional layers [13]. The deep neural network structures currently adopted in stereo image quality evaluation still have limitations: on one hand, the convolution kernels are arranged in a simple, purely sequential manner and the extracted features are monotonous; on the other hand, the networks consist only of the most basic convolutional, pooling and fully-connected layers without normalization, so they cannot handle the problem of gradient dispersion.
In addition, practical research has found that when the human brain perceives a stereoscopic image, the left and right views are first fused and the fused image is then processed hierarchically [14]. Lin et al. evaluate quality on the fused stereo image with a conventional method, but fuse only a phase map and an amplitude map [15]. To better simulate this characteristic, deep learning work on stereo image quality evaluation has also begun to process fused images (see [16]), but the fusion method of that work does not take into account the thresholds at which gain enhancement and gain control occur [17].
To address these problems, the invention proposes a stereo image quality evaluation model based on a deep convolutional neural network, in which the preprocessed fused image is used as the network input so that the learning process of the network better matches the visual characteristics of human eyes. A Batch Normalization (BN) layer is introduced into the model to keep the distributions of the network output data and input data consistent and avoid vanishing gradients; a Projection Based Weight Normalization (PBWN) layer is introduced to normalize parameters of different magnitudes and alleviate the ill-conditioning of the Hessian matrix, thereby improving the learning ability of the network. The first stage of the model is a module with convolution kernels connected in parallel; the second stage connects convolution kernels sequentially and introduces residual units to avoid network degradation; finally, a fully-connected layer completes the quality evaluation of the stereo image.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a stereo image quality evaluation method based on projection weight normalization built on a deep convolutional neural network. The method performs well and stays consistent with subjective human evaluation; by introducing data batch normalization and projection weight normalization, it solves the ill-conditioning problem in the network training process. The method provides a research direction for deep-learning-based stereo image quality evaluation and, to a certain extent, promotes the development of stereoscopic imaging technology. Accordingly, the technical scheme adopted by the invention is a stereo image quality evaluation method based on projection weight normalization: the left and right viewpoint images of the stereo image are fused to obtain a single fused image, which is then preprocessed by cutting into blocks and normalization; a deep convolutional neural network model is built, the preprocessed image blocks are used as the input of the deep convolutional neural network, the network structure is optimized with projection weight normalization and data batch normalization, and the quality evaluation result of the stereo image is obtained from the network output.
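The overall pipeline can be sketched in a few lines of Python; this is only an illustration under assumptions: fuse_views, cut_into_blocks and local_normalize are hypothetical stand-ins for the steps sketched in the corresponding sections below, model is any trained predictor, and averaging the per-block outputs is one simple way to aggregate them.

```python
import numpy as np

def evaluate_stereo_pair(left, right, model, block_size=32):
    """Sketch of the proposed pipeline: fuse the views, cut the fused image into
    blocks, normalize each block, run the deep CNN on every block and aggregate."""
    fused = fuse_views(left, right)                  # binocular fusion of the left/right views
    blocks = cut_into_blocks(fused, block_size)      # 32 x 32 patches to limit computation
    blocks = [local_normalize(b) for b in blocks]    # per-block normalization
    outputs = [model(b) for b in blocks]             # per-block prediction (model: any callable)
    return float(np.mean(outputs))                   # simple aggregation into one quality result
```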
Specific steps for obtaining the fused image
A Gabor filter bank with 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7} is used, and the Gabor-filtered left and right views are fused into one image according to formula (1).
Here I_l(x, y) and I_r(x, y) denote the pixel values at position (x, y) in the left and right views respectively, C(x, y) denotes the pixel value of the fused image, TCE denotes the enhancement component for the current viewpoint, and TCE* denotes the suppression component for the other viewpoint; their calculation is shown in formulas (2) and (3):
where t denotes the left or right viewpoint, gc is the gain-enhancement threshold, and ge is the gain-control threshold. Gabor filtering yields 48 images per viewpoint; the frequency information of the n-th image of viewpoint t is obtained after filtering by the contrast sensitivity function, each of the 48 images of viewpoint t is assigned a corresponding weight, and i, j index the 6 Gabor scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and the 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7} respectively;
image pre-processing
The normalization calculation is shown in formula (5): the normalized pixel value at (x, y) equals (I(x, y) - μ(x, y)) / (σ(x, y) + ε),
where I(x, y) is the pixel value at the (x, y) coordinate point, μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is a small positive constant that prevents division by zero.
Convolutional neural network model
Based on the Inception structure, which extracts features at multiple scales, and the residual network Block structure, a deep convolutional neural network model containing both convolution-kernel arrangement modes is built. The input of the model is a cut image block. The model comprises 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully-connected layer. Within the same layer of the Inception structure, convolution kernels of different sizes operate in parallel to extract features of the image at different scales, and 1 × 1 convolution kernels are introduced to reduce the number of network parameters and hence the computational complexity.
(1) Projection weight normalization
In the optimization problem in which the network seeks the optimal solution, the following constraint is added on the weight matrix W of each layer:
min l(y, f(x; W))
where W = {W_i | i = 1, 2, …, L} is the set of network weight matrices, the elements of the set being the weight matrices of layers 1 to L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; and ddiag(·) denotes the operator that keeps the main-diagonal elements of a matrix and sets all off-diagonal elements to 0.
The constraint restricts the weight matrix of each layer to a subspace of the manifold space, i.e. the weight matrix w of each layer satisfies
ddiag(wwᵀ) = E    (7)
Solving under this constraint with Riemannian optimization theory yields the Riemannian gradient in the manifold space:
where the unconstrained gradient is the ordinary gradient obtained without the constraint. When the weight vector ω of each neuron satisfies unit normalization, i.e. ωᵀω = 1, the Riemannian gradient is obtained based on formula (8):
Compared with the original gradient, the Riemannian gradient lacks one term; the norm of this missing term is analyzed:
The original gradient is therefore adopted in the calculation to reduce the computational cost, and the weights are updated according to formula (11):
(2) batch normalization of data
The data batch normalization method is shown in formula (12): during training, the mean μ_B and variance σ_B² of the data in each mini-batch are computed and each feature x_i is processed to obtain the batch-normalized activation y_i:
x̂_i = (x_i - μ_B) / √(σ_B² + ε),  y_i = γ x̂_i + β    (12)
During testing, E[x] is represented by the mean over all training mini-batches and Var[x] by the unbiased estimate of the variance over all training mini-batches, as shown in formulas (13) and (14), where m is the size of each mini-batch.
E[x] = E_B[μ_B]    (13)
Var[x] = m/(m-1) · E_B[σ_B²]    (14)
Therefore, in the testing stage the data batch normalization formula is formula (15); the parameters γ and β perform scaling and shifting, which restores the expressive capability of the model and improves the generalization performance of the network:
y = γ (x - E[x]) / √(Var[x] + ε) + β    (15)
the invention has the characteristics and beneficial effects that:
the invention provides a stereo image quality evaluation method introducing projection weight normalization based on a deep convolutional neural network, and the identification rate of stereo image quality evaluation is high. The CNN model extracts the characteristics of the preprocessed fused stereo image through two modules of direct motion and parallel motion of a convolution kernel, so that the network can learn the image more fully. Compared with the existing deep learning evaluation algorithm, the invention introduces BN and PBWN to carry out network optimization, solves the ill-conditioned problem in the network training process and effectively improves the network evaluation accuracy.
The stereo image quality evaluation method takes human vision mechanism into consideration, takes the preprocessed fusion image as the input of the network, introduces the optimization of the deep convolutional neural network structure, and effectively improves the performance of the network. Experiments show that the evaluation result of the invention has better consistency with subjective quality.
Description of the drawings:
FIG. 1 is a detailed flow diagram of the process.
Detailed Description
The technical scheme adopted by the method is as follows: the left and right viewpoint images of the stereo image are fused to obtain a single fused image, and the single image is then cut into blocks and normalized. A deep convolutional neural network model is built, the preprocessed image blocks are used as the input of the deep convolutional neural network, the network structure is optimized with projection weight normalization and data batch normalization, and the quality of the stereo image is obtained from the network output.
1. Fusing images
Inspired by the binocular rivalry phenomenon in the Human Visual System (HVS), the invention adopts an image fusion method and uses Gabor filters to filter the images. The Gabor filter bank has six scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and eight orientations θ ∈ {kπ/8 | k = 0, 1, …, 7}. After filtering, 48 feature maps are obtained for each channel of each viewpoint. According to the binocular rivalry mechanism, the gain-enhancement component of the current viewpoint and the suppression component of the other viewpoint are calculated and combined to obtain the final fused image.
2. Deep learning
The convolutional neural network, an algorithm that emerged early in deep learning and has matured considerably, is selected. Based on the Inception structure and the Block structure [18-19], the invention builds a deep convolutional neural network model that contains both of these convolution-kernel arrangement modes.
3. Optimization of network architecture
A Batch Normalization (BN) layer is introduced into the network to keep the distributions of the network output data and input data consistent and avoid vanishing gradients; a Projection Based Weight Normalization (PBWN) layer is introduced to normalize parameters of different magnitudes and alleviate the ill-conditioning of the Hessian matrix, thereby improving the learning ability of the network.
The purpose of projection weight normalization is to solve the ill-conditioning of network training caused by the scale symmetry of the weights in a deep nonlinear network [20]. This scale symmetry of the weights drives the Hessian matrix into an ill-conditioned state, so the network easily falls into a local optimum during training, which is unfavorable for finding the global optimum [21]. To alleviate this problem, the weights are unit-normalized, ensuring that the weights of each layer have the same magnitude.
Batch normalization of the data prevents the data distribution from gradually drifting and effectively addresses the mismatch between the distributions of the source space and the target space [22]. The output of each neuron activation is normalized before being passed to the next layer of neurons, which avoids gradient dispersion and gradient explosion. Learnable reconstruction parameters are also introduced, improving the learning ability and generalization ability of the network.
The invention performs experiments on the public stereo image databases LIVE 3D Phase I and LIVE 3D Phase II. LIVE 3D Phase I contains 20 pairs of original images and 365 symmetrically distorted image pairs covering 5 distortion types (Gblur, WN, JPEG, JP2K and FF); LIVE 3D Phase II contains 8 pairs of original images and 360 symmetrically and asymmetrically distorted image pairs covering the same 5 distortion types.
The method is described in detail below with reference to specific examples.
The invention provides a stereo image quality evaluation method that introduces projection weight normalization into a deep convolutional neural network, achieving a high recognition rate for stereo image quality evaluation. The CNN model extracts features from the preprocessed fused stereo image through two modules, one with sequentially connected convolution kernels and one with convolution kernels in parallel, so that the network can learn the image more fully. Compared with existing deep learning evaluation algorithms, the invention introduces BN and PBWN for network optimization, which solves the ill-conditioning problem in the network training process and effectively improves the evaluation accuracy of the network. The overall flow chart of the proposed method is shown in FIG. 1.
The method comprises the following specific steps:
1. Acquisition of the fused image
A Gabor filter bank with 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7} is used. The Gabor-filtered left and right views are fused into one image according to formula (16).
Here I_l(x, y) and I_r(x, y) denote the pixel values at position (x, y) in the left and right views respectively, and C(x, y) denotes the pixel value of the fused image. TCE denotes the enhancement component for the current viewpoint, and TCE* denotes the suppression component for the other viewpoint; their calculation is shown in formulas (17) and (18):
where t denotes the left or right viewpoint, gc is the gain-enhancement threshold, and ge is the gain-control threshold. Gabor filtering yields 48 images per viewpoint; the frequency information of the n-th image of viewpoint t is obtained after filtering by the contrast sensitivity function, each of the 48 images of viewpoint t is assigned a corresponding weight, and i, j index the 6 Gabor scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and the 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7} respectively.
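Formulas (16)-(18) are not reproduced in the text above, so the following is only a rough sketch of the fusion idea, assuming a simplified gain-enhancement/gain-control weighting built from the summed Gabor magnitude responses; the scales and orientations follow the filter bank listed above, while pixels_per_degree, gc and ge are assumed parameters rather than values from the patent.

```python
import numpy as np
from skimage.filters import gabor  # returns (real, imaginary) filter responses

SCALES = [1.5, 2.5, 3.5, 5, 7, 10]           # cycles/degree
THETAS = [k * np.pi / 8 for k in range(8)]    # 8 orientations

def gabor_energy(img, pixels_per_degree=32.0):
    """Summed Gabor magnitude response over the 6 x 8 filter bank (48 responses).
    pixels_per_degree is an assumed viewing-condition mapping to cycles/pixel."""
    energy = np.zeros_like(img, dtype=np.float64)
    for fs in SCALES:
        for theta in THETAS:
            real, imag = gabor(img, frequency=fs / pixels_per_degree, theta=theta)
            energy += np.hypot(real, imag)
    return energy

def fuse_views(left, right, gc=1.0, ge=1.0):
    """Very simplified binocular fusion: each view is enhanced by its own energy
    above the threshold gc and suppressed by the other view's energy above ge.
    The patented formulas (16)-(18) are not reproduced; this is only illustrative."""
    el, er = gabor_energy(left), gabor_energy(right)
    wl = (1.0 + np.maximum(el - gc, 0)) / (1.0 + np.maximum(er - ge, 0))
    wr = (1.0 + np.maximum(er - gc, 0)) / (1.0 + np.maximum(el - ge, 0))
    return (wl * left + wr * right) / (wl + wr)
```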
2. Image pre-processing
The single fused image is large, so the original image is cut into 32 × 32 image blocks to reduce the computational load of the network, and normalization is then performed. The normalization calculation is shown in formula (20): the normalized pixel value at (x, y) equals (I(x, y) - μ(x, y)) / (σ(x, y) + ε),
where I(x, y) is the pixel value at the (x, y) coordinate point, μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is a small positive constant that prevents division by zero.
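A minimal sketch of this preprocessing step follows, assuming the mean μ(x, y) and standard deviation σ(x, y) are computed over each 32 × 32 block (the exact local neighborhood is not specified above):

```python
import numpy as np

def cut_into_blocks(img, block=32):
    """Cut a fused image into non-overlapping block x block patches (edge remainders are dropped)."""
    h, w = img.shape[:2]
    return [img[r:r + block, c:c + block]
            for r in range(0, h - block + 1, block)
            for c in range(0, w - block + 1, block)]

def local_normalize(patch, eps=1e-6):
    """Normalize a patch as in formula (20): subtract the mean, divide by std + eps."""
    mu = patch.mean()
    sigma = patch.std()
    return (patch - mu) / (sigma + eps)
```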
3. Convolutional neural network model
Based on the Inception structure and the Block structure, the invention builds a deep convolutional neural network model containing both convolution-kernel arrangement modes; the input of the model is a cut image block. The model comprises 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully-connected layer, as shown in FIG. 1.
Within the same layer of the Inception structure, convolution kernels of different sizes operate in parallel, so features of the image at different scales can be extracted and the extraction process is more comprehensive and sufficient; 1 × 1 convolution kernels are introduced to reduce the number of network parameters and the computational complexity.
TABLE 1 network model parameter settings
The Block structure introduces the concept of the 'residual': a shortcut channel adds the input of the previous layer directly to the output, which alleviates the problem of network degradation.
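Table 1 with the exact layer parameters is not reproduced here, so the following PyTorch sketch only illustrates the stated topology (one Inception-style module with parallel kernels, one convolutional layer, three residual Blocks, one pooling layer and one fully-connected layer); the channel counts, kernel sizes and two-class output are assumptions, and the PBWN and BN layers described in the next section would be inserted after each convolution.

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 convolutions whose outputs are concatenated."""
    def __init__(self, in_ch, out_ch_each=16):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, out_ch_each, kernel_size=1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, out_ch_each, 1),   # 1x1 bottleneck to cut parameters
                                nn.Conv2d(out_ch_each, out_ch_each, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, out_ch_each, 1),
                                nn.Conv2d(out_ch_each, out_ch_each, 5, padding=2))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1))

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a shortcut connection (the 'Block' structure)."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class QualityCNN(nn.Module):
    """Assumed topology: Inception module -> conv -> 3 residual Blocks -> pooling -> FC."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.inception = InceptionModule(1, out_ch_each=16)        # 48 channels out
        self.conv = nn.Sequential(nn.Conv2d(48, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(*[ResidualBlock(64) for _ in range(3)])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):                                          # x: (N, 1, 32, 32) patches
        x = self.blocks(self.conv(self.inception(x)))
        return self.fc(self.pool(x).flatten(1))
```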
4. Optimization of network architecture
In the CNN adopted by the invention, a Projection Based Weight Normalization (PBWN) layer and a data Batch Normalization (BN) layer are introduced after each convolutional layer to normalize the weight parameters and the input data of each layer, respectively.
(1) Projection weight normalization
In the optimization problem in which the network seeks the optimal solution, the following constraint is added on the weight matrix W of each layer:
min l(y, f(x; W))
where W = {W_i | i = 1, 2, …, L} is the set of network weight matrices, the elements of the set being the weight matrices of layers 1 to L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; and ddiag(·) denotes the operator that keeps the main-diagonal elements of a matrix and sets all off-diagonal elements to 0.
The constraint restricts the weight matrix of each layer to a subspace of the manifold space, i.e. the weight matrix w of each layer satisfies
ddiag(wwᵀ) = E    (22)
Solving under this constraint with Riemannian optimization theory yields the Riemannian gradient in the manifold space:
where the unconstrained gradient is the ordinary gradient obtained without the constraint. When the weight vector ω of each neuron satisfies unit normalization, i.e. ωᵀω = 1, the Riemannian gradient simplifies accordingly:
Compared with the original gradient, the Riemannian gradient lacks one term; the norm of this missing term is analyzed:
This shows that this term is not the dominant term in formula (24), and experiments have shown that the Riemannian gradient is nearly as effective as the original gradient. Therefore, the method uses the original gradient in the calculation to reduce the computational cost.
The invention accordingly updates the weights according to formula (26).
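Formula (26) itself is not reproduced above, so the sketch below follows the general projection idea of Huang et al. [20] under assumptions: after an ordinary gradient step, each neuron's weight vector is divided by its ℓ2 norm so that ddiag(wwᵀ) = E holds again; the optimizer, the loss function and the decision to project every convolutional and linear layer are illustrative choices, not the patented settings.

```python
import torch

@torch.no_grad()
def pbwn_project(weight, eps=1e-8):
    """Project each neuron's weight vector back to unit norm (ddiag(w w^T) = E).
    For a conv layer the weight is reshaped so that each output channel is one row."""
    w = weight.view(weight.size(0), -1)
    w.div_(w.norm(dim=1, keepdim=True) + eps)
    return weight

def training_step(model, x, y, loss_fn, optimizer):
    """Ordinary gradient update followed by the projection (sketch of formula (26))."""
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    for m in model.modules():
        if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear)):
            pbwn_project(m.weight)          # unit-normalize weights after every update
    return loss.item()
```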
(2) Batch normalization of data
The data batch normalization method is shown in formula (27): during training, the mean μ_B and variance σ_B² of each mini-batch are computed and each feature x_i is processed to obtain the batch-normalized activation y_i:
x̂_i = (x_i - μ_B) / √(σ_B² + ε),  y_i = γ x̂_i + β    (27)
During testing, E[x] is represented by the mean over all training mini-batches and Var[x] by the unbiased estimate of the variance over all training mini-batches, as shown in formulas (28) and (29), where m is the size of each mini-batch.
E[x] = E_B[μ_B]    (28)
Var[x] = m/(m-1) · E_B[σ_B²]    (29)
In the testing stage, the data batch normalization formula is therefore formula (30); the parameters γ and β perform scaling and shifting, which restores the expressive capability of the model and improves the generalization performance of the network.
y = γ (x - E[x]) / √(Var[x] + ε) + β    (30)
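Deep learning frameworks already provide this layer (e.g. nn.BatchNorm2d in PyTorch); the minimal sketch below only illustrates formulas (27)-(30), with γ and β as learnable scaling and shifting parameters. Tracking E[x] and Var[x] with running averages is an assumption made for compactness; the text above averages over all training mini-batches.

```python
import torch

class SimpleBatchNorm(torch.nn.Module):
    """Minimal 2D batch normalization illustrating formulas (27)-(30)."""
    def __init__(self, channels, eps=1e-5, momentum=0.1):
        super().__init__()
        self.gamma = torch.nn.Parameter(torch.ones(channels))   # scaling
        self.beta = torch.nn.Parameter(torch.zeros(channels))   # shifting
        self.register_buffer("running_mean", torch.zeros(channels))
        self.register_buffer("running_var", torch.ones(channels))
        self.eps, self.momentum = eps, momentum

    def forward(self, x):                                        # x: (N, C, H, W)
        if self.training:
            mu = x.mean(dim=(0, 2, 3))                           # mini-batch mean
            var = x.var(dim=(0, 2, 3), unbiased=False)           # mini-batch variance
            with torch.no_grad():                                # accumulate E[x], Var[x] for test time
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mu)
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * var)
        else:
            mu, var = self.running_mean, self.running_var
        x_hat = (x - mu[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.eps)
        return self.gamma[None, :, None, None] * x_hat + self.beta[None, :, None, None]
```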
5. Stereo image quality evaluation results and analysis
The experiments of the invention were performed on two public stereo image databases, LIVE 3D Phase I and LIVE 3D Phase II. The selected evaluation indices are the Pearson Linear Correlation Coefficient (PLCC), the Spearman Rank-Order Correlation Coefficient (SROCC) and the Root Mean Square Error (RMSE). The larger the PLCC and SROCC values and the smaller the RMSE value, the stronger the consistency between the evaluation results of the model and the subjective results, and the better the performance.
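These three indices can be computed directly with numpy and scipy; a short sketch follows, where `objective` stands for the model's predicted quality values and `mos` for the subjective scores (both hypothetical arrays), and the logistic regression that is often fitted before computing PLCC/RMSE is omitted:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def iqa_metrics(objective, mos):
    """PLCC, SROCC and RMSE between objective predictions and subjective scores."""
    objective, mos = np.asarray(objective, float), np.asarray(mos, float)
    plcc = pearsonr(objective, mos)[0]
    srocc = spearmanr(objective, mos)[0]
    rmse = np.sqrt(np.mean((objective - mos) ** 2))
    return plcc, srocc, rmse
```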
Table 2 shows the performance of the algorithm of the present invention compared to other methods on the LIVE-I, LIVE-II database.
TABLE 2 Overall Performance comparison of the evaluation methods
Chen [12] does not give overall index values for the LIVE-II database and only gives index values per distortion type, so the comparison with [12] appears in Tables 3 and 4. Table 2 shows that the performance of the algorithm of the invention is significantly better than that of Heeseok [16], because the invention fully considers both the linear and the nonlinear conditions in the image fusion process: when the stimulation received by both eyes is very small, the left- and right-eye stimulations are weighted linearly, and when the stimulation reaches the thresholds at which gain enhancement and gain control occur, nonlinear weighting is adopted. Compared with Lin [15], the invention fuses the original images, whereas [15] fuses only low-level features of the image, so the indices obtained by the invention are better than those of [15]. Compared with other deep learning methods [11,13] and traditional methods [5-7] that do not fuse the images, the PLCC and SROCC obtained by the method are clearly improved. On LIVE-II, the PLCC obtained by the invention is the second best, 0.0122% lower than that of Ding [13]. Compared with other algorithms, the RMSE of the model on the LIVE-I and LIVE-II databases is smaller. Taking the three indices together, the algorithm performs well in quality evaluation of both symmetrically and asymmetrically distorted stereo images.
The evaluation effects of the algorithm of the invention on different distortion types are analyzed, as shown in tables 3 and 4.
TABLE 3 Performance comparison of the evaluation methods for quality evaluation of stereo images of different distortion types in the LIVE 3D Phase I database
TABLE 4 Performance comparison of the evaluation methods for quality evaluation of stereo images of different distortion types in the LIVE 3D Phase II database
When the network is tested, the PLCC and SROCC indices in Tables 3 and 4 are generally lower than those of existing algorithms, because the experiment of the invention is a two-class classification: even if only 1 image is judged incorrectly in the test, the PLCC is greatly affected. The experiments show that the algorithm has a good overall evaluation effect on the 5 distortion types; for the FF distortion type in the LIVE-I database and the FF and BLUR distortion types in the LIVE-II database, the recognition rate reaches 100%, so the PLCC and SROCC values reach 1 and the RMSE value is 0.
Table 5 shows the effect on model performance of adding a PBWN layer after each convolutional layer versus not adding it. The results show that the experimental results improve significantly after PBWN is added: the recognition rate of LIVE-I image quality evaluation increases by 2.833% to 98.113%, and the recognition rate of LIVE-II image quality evaluation increases by 5.88% to 96.47%.
TABLE 5 recognition rate of this algorithm for stereo image quality evaluation
TABLE 6 time (unit: second) required for this algorithm test
Table 6 compares the test time with and without PBWN. PBWN keeps the weight parameters of each layer at the same magnitude and unit-normalizes them, which effectively avoids the ill-conditioning of the Hessian matrix during training, improves the learning and generalization ability of the network, accelerates network convergence and shortens the time required for network testing.
References
[1]Zilly F,Kluger J,Kauff P.Production rules for stereo acquisition[J].Proceedings of the IEEE,2011,99(4):590-606.
[2]Urey H,Chellappan K V,Erden E,et al.State of the art in stereoscopic and autostereoscopic displays[J].Proceedings of the IEEE,2011,99(4):540-555.
[3] Objective evaluation model of stereoscopic image quality based on structural distortion analysis [ J ] computer aided design and graphics bulletin, 2012,24 (8): 1047-1056.
[4] Xushuning, prunus mume, stereoscopic image quality evaluation method based on visual saliency [ J ] information technology, 2016, 2016 (10): 91-93.
[5]Bensalma Rafik,Larabi Mohamed-Chaker.A perceptual metric for stereoscopic image quality assessment based on the binocular energy[J].Multidimensional Systems and Signal Processing,2013,24(2):281-316.
[6]Shao Feng,Jiang Gangyi,Yu Mei,et al.Binocular energy response based quality assessment of stereoscopic images[J].Digital Signal Processing,2014,29:45-53.
[7]Shao Feng,Lin Weisi,Wang Shanshan,et al.Learning Receptive Fields and Quality Lookups for Blind Quality Assessment of Stereoscopic Images[J].IEEE Transactions on Cybernetics,2016,46(3):730-743.
[8] Application of extreme learning machine in objective evaluation of stereoscopic image quality [ J ]. photoelectron, laser, 2014, 2014 (9): 1837-1842.
[9] Consider Shanbo, Shafeng, Jiangxian, etc. a support vector regression-based stereoscopic image objective quality evaluation model [ J ] electronic and informatics newspaper, 2012, 34 (2): 368-374.
[10] Wu-guang, lisuride, chengjincui objective evaluation of stereoscopic images based on genetic neural networks [ J ] information technology, 2013, 2013 (5): 148-153.
[11]Zhang Wei,Qu Chenfei,Lin Ma,et al.Learning structure of stereoscopic image for no-reference quality assessment with convolutional neural network[J].Pattern Recognition,2016,59(C):176-187.
[12] Chenhui, lie Chaofeng, stereoscopic color image quality evaluation of deep convolutional neural network [ J ] computer science and exploration, 2018, 12 (08): 1315-1322
[13]Ding Yong,Deng Ruizhe,Xie Xin,et al.No-reference stereoscopic image quality assessment using convolutional neural network for adaptive feature extraction[J].IEEE Access,2018,6:37595-37603.
[14]Hubel D.H.,Wiesel T.N.Receptive fields of single neurones in the cat's striate cortex[J].The Journal of Physiology,1959,148(3):574-591.
[15]Lin Yancong,Yang Jiachen,Lu Wen,et al.Quality index for stereoscopic images by jointly evaluating cyclopean amplitude and cyclopean phase[J].IEEE Journal of Selected Topics in Signal Processing,2017,11(11):89-101.
[16]Oh Heeseok,Ahn Sewoong,Kim Jongyoo,et al.Blind deep S3D image quality evaluation via local to global feature aggregation[J].IEEE Transactions on Image Processing,2017,26(10):4923-4936.
[17]Ding Jian,Klein S.A.,Levi D.M.Binocular combination of phase and contrast explained by a gain-control and gain-enhancement model[J].Journal of Vision,2013,13(2):13.
[18]Szegedy C,Liu W,Jia Y,et al.Going deeper with convolutions[J].2014:1-9.
[19]He K,Zhang X,Ren S,et al.Deep Residual Learning for Image Recognition[J].2015:770-778.
[20]L.Huang,X.Liu,B.Lang,and B.Li.Projection based weight normalization for deep neural networks.CoRR,abs/1710.02338,2017.
[21]Ian Goodfellow,Yoshua Bengio,and Aaron Courville.Deep Learning.MIT Press,2016.
[22]S.Ioffe and C.Szegedy.Batch normalization:Accelerating deep network training by reducing internal covariate shift.In Proceedings of the 32nd International Conference on Machine Learning,ICML 2015.

Claims (3)

1. A three-dimensional image quality evaluation method based on projection weight normalization is characterized in that left and right viewpoint images of a three-dimensional image are fused to obtain a single fused image, and then the single image is preprocessed: cutting and normalizing; and constructing a deep convolutional neural network model, taking the preprocessed image blocks as the input of the deep convolutional neural network, optimizing the structure of the deep convolutional neural network by adopting projection weight normalization and data batch normalization, and obtaining the quality evaluation result of the stereo image through the output of the deep convolutional neural network.
2. The method of claim 1, wherein the projection weight normalization-based stereo image quality evaluation method,
the specific steps of obtaining the fused image
Using a Gabor filter having 6 dimensions fsE {1.5,2.5,3.5,5,7,10} } and 8 directions theta e { k pi/8 | k ═ 0,1 … 7}, and fusing the Gabor-filtered left and right views into an image according to formula (1).
Wherein, Il(x, y) and Ir(x, y) denotes a pixel value at a position (x, y) in the left and right views, respectively, C (x, y) denotes a pixel value of the fused image, TCE denotes an enhancement component to the present viewpoint, TCE denotes*Representing the suppressed component for the other viewpoint, the calculation is as shown in equations (2) and (3):
wherein t represents a left viewpoint or a right viewpoint, gc represents an enhancement threshold, ge represents a control threshold, 48 images are obtained after Gabor filtering,frequency information of the nth image representing the t viewpoint filtered by the contrast sensitivity function,weights of the n-th image representing t-viewpoint, i, j representing 6 scales f of Gabor filtering, respectivelysE {1.5,2.5,3.5,5,7,10} (cycles/degree) and 8 directions θ e { k pi/8 | k ═ 0,1 … 7 };
image pre-processing
The normalization calculation is shown in formula (5): the normalized pixel value at (x, y) equals (I(x, y) - μ(x, y)) / (σ(x, y) + ε),
where I(x, y) is the pixel value at the (x, y) coordinate point, μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is a small positive constant that prevents division by zero;
convolutional neural network model
Based on the Inception structure, which extracts features at multiple scales, and the residual network Block structure, a deep convolutional neural network model containing both convolution-kernel arrangement modes is built; the input of the model is a cut image block; the model comprises 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully-connected layer; within the same layer of the Inception structure, convolution kernels of different sizes operate in parallel to extract features of the image at different scales, and 1 × 1 convolution kernels are introduced to reduce the number of network parameters and hence the computational complexity.
3. The method of claim 1, wherein the projection weight normalization-based stereo image quality evaluation method,
(1) projection weight normalization
In the optimization problem in which the network seeks the optimal solution, the following constraint is added on the weight matrix W of each layer:
min l(y, f(x; W))
where W = {W_i | i = 1, 2, …, L} is the set of network weight matrices, the elements of the set being the weight matrices of layers 1 to L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; and ddiag(·) denotes the operator that keeps the main-diagonal elements of a matrix and sets all off-diagonal elements to 0;
the constraint restricts the weight matrix of each layer to a subspace of the manifold space, i.e. the weight matrix w of each layer satisfies
ddiag(wwᵀ) = E    (7)
solving under this constraint with Riemannian optimization theory yields the Riemannian gradient in the manifold space,
where the unconstrained gradient is the ordinary gradient obtained without the constraint; when the weight vector ω of each neuron satisfies unit normalization, i.e. ωᵀω = 1, the Riemannian gradient is obtained based on formula (8):
compared with the original gradient, the Riemannian gradient lacks one term, and the norm of this missing term is analyzed:
the original gradient is adopted in the calculation to reduce the computational cost, and the weights are updated according to formula (11):
(2) batch normalization of data
The data batch normalization method is shown in formula (12): during training, the mean μ_B and variance σ_B² of the data in each mini-batch are computed and each feature x_i is processed to obtain the batch-normalized activation y_i:
x̂_i = (x_i - μ_B) / √(σ_B² + ε),  y_i = γ x̂_i + β    (12)
During testing, E[x] is represented by the mean over all training mini-batches and Var[x] by the unbiased estimate of the variance over all training mini-batches, as shown in formulas (13) and (14), where m is the size of each mini-batch.
E[x] = E_B[μ_B]    (13)
Var[x] = m/(m-1) · E_B[σ_B²]    (14)
Therefore, in the testing stage the data batch normalization formula is formula (15); the parameters γ and β perform scaling and shifting, which restores the expressive capability of the model and improves the generalization performance of the network:
y = γ (x - E[x]) / √(Var[x] + ε) + β    (15)
CN201910580586.6A 2019-06-28 2019-06-28 Based on the projection normalized stereo image quality evaluation method of weight Pending CN110458802A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910580586.6A CN110458802A (en) 2019-06-28 2019-06-28 Based on the projection normalized stereo image quality evaluation method of weight

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910580586.6A CN110458802A (en) 2019-06-28 2019-06-28 Based on the projection normalized stereo image quality evaluation method of weight

Publications (1)

Publication Number Publication Date
CN110458802A true CN110458802A (en) 2019-11-15

Family

ID=68481840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910580586.6A Pending CN110458802A (en) 2019-06-28 2019-06-28 Based on the projection normalized stereo image quality evaluation method of weight

Country Status (1)

Country Link
CN (1) CN110458802A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583377A (en) * 2020-06-10 2020-08-25 江苏科技大学 Volume rendering viewpoint evaluation and selection method for improving wind-driven optimization
CN111915589A (en) * 2020-07-31 2020-11-10 天津大学 Stereo image quality evaluation method based on hole convolution
CN112164056A (en) * 2020-09-30 2021-01-01 南京信息工程大学 No-reference stereo image quality evaluation method based on interactive convolution neural network
CN112257709A (en) * 2020-10-23 2021-01-22 北京云杉世界信息技术有限公司 Signboard photo auditing method and device, electronic equipment and readable storage medium
CN113205503A (en) * 2021-05-11 2021-08-03 宁波海上鲜信息技术股份有限公司 Satellite coastal zone image quality evaluation method
CN117269992A (en) * 2023-08-29 2023-12-22 中国民航科学技术研究院 Satellite navigation multipath signal detection method and system based on convolutional neural network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108389192A (en) * 2018-02-11 2018-08-10 天津大学 Stereo-picture Comfort Evaluation method based on convolutional neural networks
CN108537777A (en) * 2018-03-20 2018-09-14 西京学院 A kind of crop disease recognition methods based on neural network
CN108769671A (en) * 2018-06-13 2018-11-06 天津大学 Stereo image quality evaluation method based on adaptive blending image
CN109360178A (en) * 2018-10-17 2019-02-19 天津大学 Based on blending image without reference stereo image quality evaluation method
CN109671023A (en) * 2019-01-24 2019-04-23 江苏大学 A kind of secondary method for reconstructing of face image super-resolution
CN109714592A (en) * 2019-01-31 2019-05-03 天津大学 Stereo image quality evaluation method based on binocular fusion network
CN109902202A (en) * 2019-01-08 2019-06-18 国家计算机网络与信息安全管理中心 A kind of video classification methods and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108389192A (en) * 2018-02-11 2018-08-10 天津大学 Stereo-picture Comfort Evaluation method based on convolutional neural networks
CN108537777A (en) * 2018-03-20 2018-09-14 西京学院 A kind of crop disease recognition methods based on neural network
CN108769671A (en) * 2018-06-13 2018-11-06 天津大学 Stereo image quality evaluation method based on adaptive blending image
CN109360178A (en) * 2018-10-17 2019-02-19 天津大学 Based on blending image without reference stereo image quality evaluation method
CN109902202A (en) * 2019-01-08 2019-06-18 国家计算机网络与信息安全管理中心 A kind of video classification methods and device
CN109671023A (en) * 2019-01-24 2019-04-23 江苏大学 A kind of secondary method for reconstructing of face image super-resolution
CN109714592A (en) * 2019-01-31 2019-05-03 天津大学 Stereo image quality evaluation method based on binocular fusion network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LEI HUANG.ET AL: ""Projection Based Weight Normalization for Deep Neural Networks"", 《ARXIV》 *
SERGEY IOFFE.ET AL: ""Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"", 《PROCEEDINGS OF THE 32ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583377A (en) * 2020-06-10 2020-08-25 江苏科技大学 Volume rendering viewpoint evaluation and selection method for improving wind-driven optimization
CN111583377B (en) * 2020-06-10 2024-01-09 江苏科技大学 Improved wind-driven optimized volume rendering viewpoint evaluation and selection method
CN111915589A (en) * 2020-07-31 2020-11-10 天津大学 Stereo image quality evaluation method based on hole convolution
CN112164056A (en) * 2020-09-30 2021-01-01 南京信息工程大学 No-reference stereo image quality evaluation method based on interactive convolution neural network
CN112164056B (en) * 2020-09-30 2023-08-29 南京信息工程大学 No-reference stereoscopic image quality evaluation method based on interactive convolutional neural network
CN112257709A (en) * 2020-10-23 2021-01-22 北京云杉世界信息技术有限公司 Signboard photo auditing method and device, electronic equipment and readable storage medium
CN112257709B (en) * 2020-10-23 2024-05-07 北京云杉世界信息技术有限公司 Signboard photo auditing method and device, electronic equipment and readable storage medium
CN113205503A (en) * 2021-05-11 2021-08-03 宁波海上鲜信息技术股份有限公司 Satellite coastal zone image quality evaluation method
CN117269992A (en) * 2023-08-29 2023-12-22 中国民航科学技术研究院 Satellite navigation multipath signal detection method and system based on convolutional neural network
CN117269992B (en) * 2023-08-29 2024-04-19 中国民航科学技术研究院 Satellite navigation multipath signal detection method and system based on convolutional neural network

Similar Documents

Publication Publication Date Title
CN110458802A (en) Based on the projection normalized stereo image quality evaluation method of weight
CN107633513B (en) 3D image quality measuring method based on deep learning
CN107563422B (en) A kind of polarization SAR classification method based on semi-supervised convolutional neural networks
CN109360178B (en) Fusion image-based non-reference stereo image quality evaluation method
Zhou et al. Blind quality estimator for 3D images based on binocular combination and extreme learning machine
CN109376787B (en) Manifold learning network and computer vision image set classification method based on manifold learning network
CN108389192A (en) Stereo-picture Comfort Evaluation method based on convolutional neural networks
CN111429402B (en) Image quality evaluation method for fusion of advanced visual perception features and depth features
Liu et al. No-reference quality assessment for contrast-distorted images
Wang et al. GKFC-CNN: Modified Gaussian kernel fuzzy C-means and convolutional neural network for apple segmentation and recognition
CN108389189B (en) Three-dimensional image quality evaluation method based on dictionary learning
Jiang et al. Learning a referenceless stereopair quality engine with deep nonnegativity constrained sparse autoencoder
CN108875655A (en) A kind of real-time target video tracing method and system based on multiple features
Sun et al. Learning local quality-aware structures of salient regions for stereoscopic images via deep neural networks
CN109788275A (en) Naturality, structure and binocular asymmetry are without reference stereo image quality evaluation method
Niu et al. Siamese-network-based learning to rank for no-reference 2D and 3D image quality assessment
CN111882516B (en) Image quality evaluation method based on visual saliency and deep neural network
CN111915589A (en) Stereo image quality evaluation method based on hole convolution
Chang et al. Blind image quality assessment by visual neuron matrix
Liu et al. A multiscale approach to deep blind image quality assessment
Li et al. MCANet: Multi-channel attention network with multi-color space encoder for underwater image classification
CN108428226B (en) Distortion image quality evaluation method based on ICA sparse representation and SOM
CN113810683A (en) No-reference evaluation method for objectively evaluating underwater video quality
CN116664462B (en) Infrared and visible light image fusion method based on MS-DSC and I_CBAM
Guan et al. No-reference stereoscopic image quality assessment on both complex contourlet and spatial domain via Kernel ELM

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20191115