CN110458802A - Stereo image quality evaluation method based on projection weight normalization - Google Patents
Stereo image quality evaluation method based on projection weight normalization
- Publication number: CN110458802A
- Application number: CN201910580586.6A
- Authority: CN (China)
- Prior art keywords: image, weight, network, stereo, convolutional neural
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
- G06T5/50 — Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
- G06T7/0002 — Image analysis; inspection of images, e.g. flaw detection
- G06T2207/20221 — Image fusion; image merging
- G06T2207/30168 — Image quality inspection
Abstract
The invention belongs to the field of image processing and proposes a new image quality evaluation method that is consistent with subjective human evaluation and solves the ill-conditioning problem in the network training process. It provides a research direction for deep-learning methods of stereo image quality evaluation and, to a certain extent, promotes the development of stereoscopic imaging technology. To this end, the projection-weight-normalization-based stereo image quality evaluation method of the invention fuses the left and right viewpoint images of a stereo pair into a single fused image and then preprocesses it by cutting it into blocks and normalizing them; a deep convolutional neural network model is built, the preprocessed image blocks are used as its input, and the network structure is optimized with projection weight normalization and batch normalization; the quality evaluation result of the stereo image is obtained from the output of the network. The invention is mainly applied to image processing.
Description
Technical field
The invention belongs to the field of image processing and relates to the application and optimization of image fusion and deep learning in stereo image quality evaluation.
Background art
Stereoscopic imaging technology can provide a good visual experience, but degradation can be introduced at every stage from acquisition to display of a stereo image [1-2]. Degraded images affect people's perception of stereo content, so evaluating the quality of stereo images reasonably and efficiently has become one of the research hotspots in the field of stereoscopic information. Stereo image quality evaluation methods are broadly divided into subjective evaluation and objective evaluation. Subjective evaluation is time-consuming, labor-intensive and costly, whereas objective evaluation is far more practical to operate. Establishing a reasonable and efficient mechanism for objectively evaluating stereo image quality is therefore of great practical significance.
To date, researchers have proposed a variety of stereo image quality evaluation methods, broadly divided into conventional methods and artificial-neural-network methods. Most conventional methods extract features from the left and right views separately and then weight the quality scores of the two views to obtain the final objective score [3-7]. However, the features extracted by conventional methods do not necessarily reflect the essential characteristics of the image. To better simulate the mechanism by which the human eye extracts features, researchers have applied artificial neural networks to stereo image quality evaluation; for example, [8-10] apply shallow neural networks to objective stereo image quality evaluation. But such networks have few layers and relatively simple structures, and cannot accurately simulate the hierarchical way in which the human visual system processes information. Compared with shallow neural networks, deep learning better simulates how the human brain processes information and can extract features layer by layer through a deep network. The convolutional neural network (Convolutional Neural Network, CNN) is the classic network of deep learning and is well suited to fields such as computer vision and natural language processing. Zhang Wei et al. applied a CNN to stereo image quality evaluation, extracting features with 2 convolutional layers and 2 pooling layers and finally introducing a multi-layer perceptron (Multi-layer Perception, MLP) to fully connect the learned features into a quality score [11]; Chen et al. used a CNN model with 12 convolutional layers [12]; Ding et al. used a CNN model with 5 convolutional layers, and the objective scores obtained showed high consistency with subjective scores [13]. The deep-network structures currently used in stereo image quality evaluation have some limitations: on the one hand, the arrangement of the convolution kernels inside the network is relatively simple, connected purely in sequence, so the extracted features are rather monotonous; on the other hand, the networks consist only of the most basic convolutional, pooling and fully connected layers, with limited functionality and no normalization, so they cannot handle the gradient-dispersion problem.
In addition, practical studies have found that when perceiving a stereo image, the human brain first fuses the left and right views and then processes the fused image hierarchically [14]. Lin et al. evaluated the quality of fused stereo images with a conventional method, but fused only the phase map and the magnitude map [15]. To better simulate this characteristic, work that evaluates stereo image quality with deep learning (e.g. [16]) has also begun to operate on fused images, but the fusion method of that work does not consider the thresholds at which gain suppression and gain control occur [17].
In view of the above problems, the invention proposes a stereo image quality evaluation model based on a deep convolutional neural network that takes a preprocessed fused image as the network input, making the learning process of the network more consistent with the characteristics of human vision. The model introduces a batch normalization layer (Batch Normalization, BN) to keep the network's output data on the same distribution as its input data and avoid vanishing gradients, and a projection weight normalization layer (Projection Based Weight Normalization, PBWN) to standardize parameters of different magnitudes and alleviate the ill-conditioning of the Hessian matrix, thereby improving the learning ability of the network. The first stage of the model is a parallel convolution-kernel module and the second stage a sequential convolution-kernel module, into which residual units are introduced to avoid network degradation; a fully connected layer is finally introduced to complete the quality evaluation of the stereo image.
Summary of the invention
In order to overcome the deficiencies of the prior art, the invention proposes a projection-weight-normalization-based stereo image quality evaluation method built on a deep convolutional neural network. The method performs well and is consistent with subjective human evaluation, and it introduces batch normalization and projection weight normalization to solve the ill-conditioning problem in the network training process. The method provides a research direction for deep-learning methods of stereo image quality evaluation and, to a certain extent, promotes the development of stereoscopic imaging technology. To this end, the technical solution adopted by the invention is: a projection-weight-normalization-based stereo image quality evaluation method that fuses the left and right viewpoint images of a stereo pair to obtain a single fused image, then preprocesses the fused image by cutting it into blocks and normalizing them; a deep convolutional neural network model is built with the preprocessed image blocks as its input, the network structure is optimized with projection weight normalization and batch normalization, and the quality evaluation result of the stereo image is obtained from the output of the network.
Specific steps for obtaining the fused image
A Gabor filter bank is used, with 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, ..., 7}. The Gabor-filtered left and right views are fused into a single image according to formula (1), where I_l(x, y) and I_r(x, y) denote the pixel values at position (x, y) in the left and right views, C(x, y) denotes the pixel value of the fused image, TCE denotes the enhancement component for the current viewpoint, and TCE* denotes the inhibition component for the other viewpoint, computed as shown in formulas (2) and (3). In those formulas, t denotes the left or right viewpoint, gc denotes the enhancement threshold and ge the control threshold; 48 images are obtained after Gabor filtering, with the frequency information of the n-th image of viewpoint t filtered out by the contrast sensitivity function and the weight of the n-th image of viewpoint t entering the computation, and i, j index the 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, ..., 7} of the Gabor filter.
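Formulas (1)-(3) are not reproduced in this text. As an illustration only, the following numpy sketch shows a gain-control-style per-pixel weighting consistent with the description; the function name fuse_views and the exact weighting form are our assumptions, not the patent's formulas.

```python
import numpy as np

def fuse_views(I_l, I_r, E_l, E_r, eps=1e-8):
    """Fuse left/right views into one cyclopean image.

    E_l, E_r are per-pixel contrast-energy maps (e.g. summed Gabor
    responses) standing in for the enhancement/inhibition components;
    the normalized gain-control weighting here is an assumption.
    """
    w_l = E_l / (E_l + E_r + eps)   # weight of the left view
    w_r = E_r / (E_l + E_r + eps)   # weight of the right view
    return w_l * I_l + w_r * I_r

# With equal energies the fusion reduces to the plain average:
I_l = np.full((4, 4), 10.0)
I_r = np.full((4, 4), 20.0)
E = np.ones((4, 4))
C = fuse_views(I_l, I_r, E, E)     # every pixel close to 15.0
```

When one view carries much more contrast energy, its weight approaches 1, which mirrors the binocular-rivalry behavior the description invokes.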
Image preprocessing
The normalization calculation process is shown in formula (5), where I(x, y) denotes the pixel value at coordinate point (x, y), μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is an arbitrary positive number close to 0.
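Formula (5) itself is not reproduced in this text; from the definitions of μ(x, y), σ(x, y) and ε just given, the divisive normalization it describes can be reconstructed in the standard form (a reconstruction under that assumption, not the patent's verbatim equation):

```latex
\hat{I}(x,y) = \frac{I(x,y) - \mu(x,y)}{\sigma(x,y) + \varepsilon}
```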
Convolutional neural network model
Based on the multi-scale feature extraction Inception structure and the residual network Block structure, a deep convolutional neural network model containing both convolution-kernel arrangement modes is built. The input of the model is the small blocks obtained after cutting; the model comprises 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully connected layer. Within the same layer of the Inception structure, convolution kernels of different sizes operate in parallel, extracting features of the image at different scales, and convolution kernels of size 1 × 1 are introduced to reduce the number of network parameters and the computational complexity.
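The parameter saving from the 1 × 1 kernels can be illustrated with a short count; the channel numbers 256, 32 and 64 are illustrative assumptions, not the patent's actual configuration.

```python
def conv_params(in_ch, out_ch, k):
    # weight count of a k x k convolution layer, biases ignored
    return in_ch * out_ch * k * k

# Direct 5x5 convolution on 256 input channels to 64 outputs:
direct = conv_params(256, 64, 5)                               # 409600
# Same output via a 1x1 bottleneck down to 32 channels first:
bottleneck = conv_params(256, 32, 1) + conv_params(32, 64, 5)  # 59392

print(direct, bottleneck)
```

The bottleneck path needs roughly a seventh of the weights, which is exactly the parameter and complexity reduction the 1 × 1 kernels are introduced for.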
(1) projection weight normalization
In the optimization problem in which the network seeks the optimal solution, a constraint on each layer's weight matrix W is added:
min l(y, f(x; W))
where W = {w_i, i = 1, 2, ..., L} denotes the set of network weight matrices, whose elements are the weight matrices of layers 1 through L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; and ddiag(·) denotes the operation that retains the main diagonal elements of a matrix and sets all its off-diagonal elements to 0. The constraint restricts each layer's weight matrix to a subspace of the manifold space, i.e. each layer's weight matrix w satisfies
ddiag(ww^T) = E (7)
The constraint is solved with Riemannian optimization theory; the Riemannian gradient in the manifold space is obtained as shown in formula (8), in which the gradient obtained without the constraint appears. When the weight matrix ω of each neuron satisfies unit normalization, i.e. ωω^T = 1, its Riemannian gradient follows from formula (8) as formula (9). Compared with the original gradient, the Riemannian gradient subtracts one term; analysis of this subtracted norm shows that it is not the leading term, so the original gradient is used for computation to reduce the amount of calculation, and the weights are updated with formula (11):
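Since formula (11) is not reproduced here, the following numpy sketch assumes a plain gradient step followed by a projection back onto the constraint set ddiag(WW^T) = E, which is the core of the projection step; the function names are ours.

```python
import numpy as np

def project_rows(W, eps=1e-12):
    """Project each row (one neuron's weight vector) onto the unit
    sphere, restoring the constraint ddiag(W @ W.T) = E."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / (norms + eps)

def pbwn_step(W, grad, lr=0.1):
    # assumed plain SGD step, then the projection back to the manifold
    return project_rows(W - lr * grad)

rng = np.random.default_rng(0)
W = project_rows(rng.standard_normal((4, 8)))   # 4 neurons, 8 weights each
W = pbwn_step(W, rng.standard_normal((4, 8)))
diag = np.diag(W @ W.T)                         # all entries back to 1
```

Because every row is renormalized after the update, all layers keep weights of identical magnitude, which is what keeps the Hessian away from the ill-conditioned regime described above.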
(2) Batch normalization
The batch normalization method is shown in formula (12): during training, the mean μ and variance σ² are computed for each batch of data, each feature x_i is processed, and the activation after batch normalization is y_i.
During testing, E[x] is estimated by the mean over all training batches and var[x] by the unbiased estimate of the variance over all training batches, as shown in formulas (13) and (14), where m is the size of each batch:
E[x] = E_B[μ_B] (13)
In the test phase the batch normalization formula is therefore as shown in formula (15); the parameters γ and β perform scaling and translation, restoring the expressive power of the model and improving the generalization performance of the network:
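The training- and test-phase computations described by formulas (12)-(15) can be sketched in numpy; γ = 1 and β = 0 are assumed for illustration, and the m/(m-1) factor is the unbiased variance estimate of formula (14).

```python
import numpy as np

def batchnorm_train(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one batch per feature: batch mean and variance,
    then scale/shift with gamma, beta (formula (12))."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta, mu, var

def batchnorm_test(x, E_x, var_x, gamma=1.0, beta=0.0, eps=1e-5):
    """Test-phase normalization (formula (15)) using the estimates
    E[x] and var[x] accumulated over the training batches."""
    return gamma * (x - E_x) / np.sqrt(var_x + eps) + beta

rng = np.random.default_rng(1)
batch = rng.normal(5.0, 2.0, size=(64, 3))   # one batch, 3 features
y, mu, var = batchnorm_train(batch)
m = batch.shape[0]
var_unbiased = var * m / (m - 1)             # unbiased estimate, formula (14)
```

After the training-phase step each feature of y has zero mean and unit variance; batchnorm_test would reuse mu and var_unbiased at inference time.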
The features and beneficial effects of the invention are as follows:
Based on a deep convolutional neural network, the invention proposes a stereo image quality evaluation method that introduces projection weight normalization and achieves high accuracy in discriminating stereo image quality. The CNN model extracts features from the preprocessed fused stereo image through two modules, one with sequentially arranged convolution kernels and one with parallel convolution kernels, allowing the network to learn the image more fully. Compared with existing deep-learning evaluation algorithms, the invention introduces BN and PBWN to optimize the network, solving the ill-conditioning problem in the network training process and effectively improving the accuracy of the network's evaluation.
The stereo image quality evaluation method of the invention takes the mechanisms of human vision into account, uses the preprocessed fused image as the network input, and introduces optimizations of the deep convolutional neural network structure, effectively improving the performance of the network.
Experiments show that the evaluation results of the invention have good consistency with subjective quality.
Description of the drawings:
Fig. 1 is the specific flow chart of the method.
Specific embodiment
The technical scheme adopted by the invention proceeds as follows: first, the left and right viewpoint images of a stereo pair are fused to obtain a single fused image, which is then cut into blocks and normalized. A deep convolutional neural network model is built with the preprocessed image blocks as input; the network structure is optimized with projection weight normalization and batch normalization, and the quality of the stereo image is obtained from the output of the network.
1. Fused image
Inspired by binocular rivalry in the human visual system (Human Visual System, HVS), the invention adopts an image fusion method and filters the images with a Gabor filter bank having six scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and eight orientations θ ∈ {kπ/8 | k = 0, 1, ..., 7}. After filtering, 48 feature maps are obtained for each channel of each viewpoint. According to the binocular rivalry mechanism, the final fused image is calculated by combining the enhancement component of the current viewpoint with the inhibition component of the other viewpoint.
2. Deep learning
The convolutional neural network, an algorithm that emerged early in deep learning and has matured considerably, is selected. Based on the Inception structure and the Block structure [18-19], the invention builds a deep convolutional neural network model containing both of the above convolution-kernel arrangement modes.
3. Optimization of the network structure
A batch normalization layer (Batch Normalization, BN) is introduced into the network to keep the network's output data on the same distribution as its input data and to avoid vanishing gradients; a projection weight normalization layer (Projection Based Weight Normalization, PBWN) is introduced to standardize parameters of different magnitudes and alleviate the ill-conditioning of the Hessian matrix, thereby improving the learning ability of the network.
The purpose of projection weight normalization is to solve the network-training ill-conditioning problem caused by the scaling symmetry of the weight space in deep nonlinear networks [20]. The scaling symmetry of the weight space makes the Hessian matrix ill-conditioned, so the network easily falls into local optima during training, which is unfavorable for seeking the globally optimal solution [21]. To alleviate this problem, the weights are unit-normalized, ensuring that the magnitudes of all layers' weights are identical.
Batch normalization prevents the data distribution from gradually drifting and effectively solves the problem of the source space and target space being distributed differently [22]. The output of each neuron activation is normalized before being fed into the next layer of neurons for activation, avoiding gradient dispersion and gradient explosion; the learnable reconstruction parameters that are introduced improve the learning ability and generalization ability of the network.
The invention is tested on the public stereo image databases LIVE I and LIVE II. LIVE-3D I contains 20 pairs of original images and 365 symmetrically distorted images covering 5 distortion types (Gblur, WN, JPEG, JP2K and FF); LIVE-3D II contains 8 pairs of original images and 360 symmetrically and asymmetrically distorted images covering the same 5 distortion types (Gblur, WN, JPEG, JP2K and FF).
The process is described in detail below with a specific example.
Based on a deep convolutional neural network, the invention proposes a stereo image quality evaluation method that introduces projection weight normalization and achieves high accuracy in discriminating stereo image quality. The CNN model extracts features from the preprocessed fused stereo image through a sequential convolution-kernel module and a parallel convolution-kernel module, allowing the network to learn the image more fully. Compared with existing deep-learning evaluation algorithms, the invention introduces BN and PBWN to optimize the network, solving the ill-conditioning problem in the network training process and effectively improving evaluation accuracy. The specific flow chart of the proposed method is shown in Fig. 1.
The specific steps are as follows:
1. Obtaining the fused image
A Gabor filter bank is used, with 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, ..., 7}. The Gabor-filtered left and right views are fused into a single image according to formula (16), where I_l(x, y) and I_r(x, y) denote the pixel values at position (x, y) in the left and right views and C(x, y) denotes the pixel value of the fused image. TCE denotes the enhancement component for the current viewpoint and TCE* the inhibition component for the other viewpoint, computed as shown in formulas (17) and (18). In those formulas, t denotes the left or right viewpoint, gc denotes the enhancement threshold and ge the control threshold; 48 images are obtained after Gabor filtering, with the frequency information of the n-th image of viewpoint t filtered out by the contrast sensitivity function and the weight of the n-th image of viewpoint t entering the computation, and i, j index the 6 scales f_s ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, ..., 7} of the Gabor filter.
2. Image preprocessing
The fused single image is relatively large, so the original image is cut into 32 × 32 image blocks to reduce the computational load of the network, and normalization is then applied. The normalization calculation process is shown in formula (20), where I(x, y) denotes the pixel value at coordinate point (x, y), μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is an arbitrary positive number close to 0.
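The preprocessing step can be sketched in numpy as follows; for simplicity the sketch normalizes each block with its global mean and standard deviation, whereas the patent's μ(x, y) and σ(x, y) may be windowed local statistics (an assumption on our part).

```python
import numpy as np

def cut_blocks(img, size=32):
    """Cut a fused image into non-overlapping size x size blocks,
    discarding any remainder at the right/bottom edge."""
    h, w = img.shape[:2]
    return [img[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def normalize_block(block, eps=1e-6):
    # subtract the mean and divide by the standard deviation
    return (block - block.mean()) / (block.std() + eps)

fused = np.arange(96 * 64, dtype=float).reshape(96, 64)   # toy fused image
blocks = [normalize_block(b) for b in cut_blocks(fused)]  # 3 x 2 = 6 blocks
```

Each normalized block then has zero mean and roughly unit variance, which is the form the network expects as input.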
3. Convolutional neural network model
Based on the Inception structure and the Block structure, the invention builds a deep convolutional neural network model containing both convolution-kernel arrangement modes; the input of the model is the small blocks obtained after cutting. The model comprises 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully connected layer, as shown in Fig. 1.
Within the same layer of the Inception structure, convolution kernels of different sizes operate in parallel, extracting features of the image at different scales and making the extraction process more comprehensive and sufficient; convolution kernels of size 1 × 1 are introduced to reduce the number of network parameters and the computational complexity.
Table 1: Network model parameter settings
The Block structure introduces the idea of the residual: by adding a channel that connects the previous layer's input directly to the output, the network degradation problem is solved.
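The residual channel just described can be sketched minimally; residual_block and the zero transform below are illustrative, not the patent's exact Block configuration.

```python
import numpy as np

def residual_block(x, transform):
    """y = F(x) + x: the extra channel connects the input straight
    to the output, so even if F learns nothing (F(x) = 0) the block
    still passes x through unchanged, avoiding degradation."""
    return transform(x) + x

x = np.ones((8, 8))
zero_path = lambda v: np.zeros_like(v)   # a path that has learned nothing
y = residual_block(x, zero_path)         # output equals the input
```

Because the identity path always survives, stacking more Block structures can never make the representation worse than the input, which is why the residual channel counters network degradation.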
4. Optimization of the network structure
In the CNN used by the invention, a projection weight normalization layer (PBWN) and a batch normalization layer (BN) are introduced after each convolutional layer to normalize each layer's weight parameters and input data, respectively.
(1) projection weight normalization
In the optimization problem in which the network seeks the optimal solution, a constraint on each layer's weight matrix W is added:
min l(y, f(x; W))
where W = {w_i, i = 1, 2, ..., L} denotes the set of network weight matrices, whose elements are the weight matrices of layers 1 through L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; and ddiag(·) denotes the operation that retains the main diagonal elements of a matrix and sets all its off-diagonal elements to 0. The constraint restricts each layer's weight matrix to a subspace of the manifold space, i.e. each layer's weight matrix w satisfies
ddiag(ww^T) = E (22)
The constraint is solved with Riemannian optimization theory; the Riemannian gradient in the manifold space is obtained as shown in formula (23), in which the gradient obtained without the constraint appears. When the weight matrix ω of each neuron satisfies unit normalization, i.e. ωω^T = 1, its Riemannian gradient follows from formula (23) as formula (24). Compared with the original gradient, the Riemannian gradient subtracts one term. Analysis of this subtracted norm in formula (24) shows that it is not the leading term, and experiments confirm that the effects of the Riemannian gradient and the original gradient are almost identical. The invention therefore computes with the original gradient to reduce the amount of calculation and updates the weights with formula (26).
(2) batch data normalizes
The batch normalization method is shown in formula (27): during training, the mean μ and variance σ² are computed for each batch, each feature x_i is processed, and the activation after batch normalization is y_i.
During testing, E[x] is estimated by the mean over all training batches and var[x] by the unbiased estimate of the variance over all training batches, as shown in formulas (28) and (29), where m is the size of each batch:
E[x] = E_B[μ_B] (28)
In the test phase the batch normalization formula is therefore as shown in formula (30); the parameters γ and β perform scaling and translation, restoring the expressive power of the model and improving the generalization performance of the network.
5. Stereo image quality evaluation results and analysis
The experiments of the invention are carried out on two public stereo image databases, LIVE I and LIVE II. The evaluation indices selected are the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SROCC) and the root mean square error (RMSE). The larger the values of PLCC and SROCC and the smaller the value of RMSE, the stronger the consistency between the model's evaluation results and the subjective results, and the better the performance.
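The three indices can be computed with plain numpy; the function names are ours, and SROCC is taken as the Pearson correlation of the rank vectors, assuming no tied scores.

```python
import numpy as np

def plcc(x, y):
    # Pearson linear correlation coefficient
    return float(np.corrcoef(x, y)[0, 1])

def srocc(x, y):
    # Spearman rank-order correlation: PLCC of the ranks (no ties assumed)
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return plcc(rank(x), rank(y))

def rmse(x, y):
    # root mean square error between objective and subjective scores
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(y)) ** 2)))

objective = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy model scores
subjective = np.array([1.1, 1.9, 3.2, 3.9, 5.1])  # toy subjective scores
scores = (plcc(objective, subjective),
          srocc(objective, subjective),
          rmse(objective, subjective))
```

A perfectly monotone relation gives SROCC = 1 even when PLCC is below 1, which is why the two correlations are reported together alongside RMSE.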
Table 2 compares the performance of the algorithm of the invention with that of other methods on the LIVE-I and LIVE-II databases.
Table 2: Overall performance comparison of the evaluation methods
Chen [12] does not provide overall evaluation index values for the LIVE-II database, only index values for individual distortion types, so the comparison with Chen [12] is carried out in Tables 3 and 4. Table 2 shows that the performance of the algorithm of the invention is substantially better than that of Heeseok [16]. This is because the invention fully considers both the linear and the nonlinear cases when fusing images: when the stimulus received by the two eyes is very small, the stimuli received by the left and right eyes are weighted linearly; when the stimulus reaches the threshold at which gain suppression and gain control occur, nonlinear weighting is used. Compared with Lin [15], the invention fuses the original images, whereas [15] fuses only low-level image features, so the indices obtained by the invention are better than those of [15]. The PLCC and SROCC obtained by the invention are clearly improved over other deep-learning methods that do not fuse the images [11, 13] and over conventional methods [5-7]. On LIVE-II, the PLCC of the invention ranks second, 0.0122% lower than Ding [13]. Compared with other algorithms, the RMSE of the model of the invention on the LIVE-I and LIVE-II databases is smaller; considering all three indices together, the algorithm of the invention performs very well in evaluating the quality of both symmetrically and asymmetrically distorted stereo images.
The evaluation performance of the algorithm of the invention on different distortion types is analyzed in Tables 3 and 4.
Table 3: Performance comparison of the evaluation methods on the individual distortion types of the LIVE-3D I database
Table 4: Performance comparison of the evaluation methods on the individual distortion types of the LIVE-3D II database
In Tables 3 and 4 the per-type PLCC and SROCC of the network are generally lower than those of existing algorithms. This is because the experiment of the present invention is a two-class classification, so even a single misclassified image in the test can strongly affect the PLCC. The experiments show that the proposed algorithm evaluates all five distortion types well overall; for the FF distortion type in the LIVE-I database and the FF and BLUR distortion types in the LIVE-II database, the discrimination rate reaches 100%, so PLCC and SROCC reach 1 and RMSE is 0.
Table 5 shows the influence on model performance of adding a PBWN layer after each convolutional layer versus not adding PBWN. The results show that adding PBWN clearly improves the experimental results: the discrimination rate of the LIVE-I image quality evaluation improves by 2.833%, reaching 98.113%, and the discrimination rate of the LIVE-II image quality evaluation improves by 5.88%, reaching 96.47%.
Table 5: Discrimination rates of the algorithms for stereo image quality evaluation
Table 6: Time required for testing the algorithms (unit: seconds)
Table 6 shows the influence of the presence or absence of PBWN on the test time. PBWN keeps the magnitudes of the weight parameters of each layer comparable and unit-normalizes the weight parameters, which effectively avoids an ill-conditioned Hessian matrix during training, improves the learning and generalization ability of the network, accelerates network convergence, and shortens the time required for network testing.
Bibliography
[1] Zilly F, Kluger J, Kauff P. Production rules for stereo acquisition [J]. Proceedings of the IEEE, 2011, 99(4): 590-606.
[2] Urey H, Chellappan K V, Erden E, et al. State of the art in stereoscopic and autostereoscopic displays [J]. Proceedings of the IEEE, 2011, 99(4): 540-555.
[3] Mao Xiangying, Yu Mei, Jiang Gangyi, et al. Objective quality assessment model for stereoscopic images based on structural distortion analysis [J]. Journal of Computer-Aided Design & Computer Graphics, 2012, 24(8): 1047-1056.
[4] Xu Shuning, Li Sumei. Stereo image quality evaluation method based on visual saliency [J]. Information Technology, 2016, 2016(10): 91-93.
[5] Bensalma Rafik, Larabi Mohamed-Chaker. A perceptual metric for stereoscopic image quality assessment based on the binocular energy [J]. Multidimensional Systems and Signal Processing, 2013, 24(2): 281-316.
[6] Shao Feng, Jiang Gangyi, Yu Mei, et al. Binocular energy response based quality assessment of stereoscopic images [J]. Digital Signal Processing, 2014, 29: 45-53.
[7] Shao Feng, Lin Weisi, Wang Shanshan, et al. Learning receptive fields and quality lookups for blind quality assessment of stereoscopic images [J]. IEEE Transactions on Cybernetics, 2016, 46(3): 730-743.
[8] Wang Guanghua, Li Sumei, Zhu Dan, et al. Application of extreme learning machine in objective stereo image quality assessment [J]. Journal of Optoelectronics·Laser, 2014, 2014(9): 1837-1842.
[9] Gu Shanbo, Shao Feng, Jiang Gangyi, et al. Objective quality assessment model for stereoscopic images based on support vector regression [J]. Journal of Electronics & Information Technology, 2012, 34(2): 368-374.
[10] Wu Xianguang, Li Sumei, Cheng Jincui. Objective evaluation of stereo images based on genetic neural network [J]. Information Technology, 2013, 2013(5): 148-153.
[11] Zhang Wei, Qu Chenfei, Ma Lin, et al. Learning structure of stereoscopic image for no-reference quality assessment with convolutional neural network [J]. Pattern Recognition, 2016, 59(C): 176-187.
[12] Chen Hui, Li Chaofeng. Stereoscopic color image quality assessment with deep convolutional neural networks [J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(8): 1315-1322.
[13] Ding Yong, Deng Ruizhe, Xie Xin, et al. No-reference stereoscopic image quality assessment using convolutional neural network for adaptive feature extraction [J]. IEEE Access, 2018, 6: 37595-37603.
[14] Hubel D H, Wiesel T N. Receptive fields of single neurones in the cat's striate cortex [J]. The Journal of Physiology, 1959, 148(3): 574-591.
[15] Lin Yancong, Yang Jiachen, Lu Wen, et al. Quality index for stereoscopic images by jointly evaluating cyclopean amplitude and cyclopean phase [J]. IEEE Journal of Selected Topics in Signal Processing, 2017, 11(11): 89-101.
[16] Oh Heeseok, Ahn Sewoong, Kim Jongyoo, et al. Blind deep S3D image quality evaluation via local to global feature aggregation [J]. IEEE Transactions on Image Processing, 2017, 26(10): 4923-4936.
[17] Ding Jian, Klein S A, Levi D M. Binocular combination of phase and contrast explained by a gain-control and gain-enhancement model [J]. Journal of Vision, 2013, 13(2): 13.
[18] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions [J]. 2014: 1-9.
[19] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [J]. 2015: 770-778.
[20] Huang L, Liu X, Lang B, Li B. Projection based weight normalization for deep neural networks [J]. CoRR, abs/1710.02338, 2017.
[21] Goodfellow Ian, Bengio Yoshua, Courville Aaron. Deep Learning [M]. MIT Press, 2016.
[22] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift [C]. Proceedings of the 32nd International Conference on Machine Learning (ICML 2015).
Claims (3)
1. A stereo image quality evaluation method based on projection weight normalization, characterized in that the left and right viewpoint images of a stereo image are fused to obtain a single blended image, which is then preprocessed by cutting into blocks and normalizing; a deep convolutional neural network model is built, the preprocessed image blocks are used as the input of the deep convolutional neural network, and projection weight normalization together with batch normalization is used to optimize the deep convolutional neural network structure; the quality evaluation result of the stereo image is obtained from the output of the deep convolutional neural network.
2. The stereo image quality evaluation method based on projection weight normalization according to claim 1, characterized in that the blended image is obtained by the following specific steps.
A bank of Gabor filters is used, with 6 scales fs ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7}; the left and right views filtered by the Gabor filters are fused into a single image according to formula (1).
Here Il(x, y) and Ir(x, y) denote the pixel values at position (x, y) in the left and right views respectively, C(x, y) denotes the pixel value of the blended image, TCE denotes the enhancement component for the current viewpoint, and TCE* denotes the inhibition component for the other viewpoint; they are calculated as shown in formulas (2) and (3):
where t denotes the left or the right viewpoint, gc denotes the enhancement threshold, ge denotes the control threshold, and 48 images are obtained after Gabor filtering; formulas (2) and (3) involve the frequency information of the n-th image of viewpoint t filtered by the contrast sensitivity function and the weight of the n-th image of viewpoint t, and i, j index the 6 scales fs ∈ {1.5, 2.5, 3.5, 5, 7, 10} (cycles/degree) and the 8 orientations θ ∈ {kπ/8 | k = 0, 1, …, 7} of the Gabor filters;
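A minimal NumPy sketch of this fusion step, under stated assumptions: `gabor_kernel` is a generic Gabor filter, not the patent's exact 6-scale/8-orientation bank, and `fuse_views` switches between linear 50/50 weighting below the thresholds gc, ge and a simple gain-control weighting above them; the energy maps and threshold values are illustrative, not the exact quantities of formulas (2)-(3).

```python
import numpy as np

def gabor_kernel(f, theta, size=15, sigma=2.0):
    """Generic 2-D Gabor kernel: a Gaussian envelope times a cosine
    carrier at spatial frequency f and orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)           # rotate carrier axis
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * f * xr / size)

def fuse_views(i_l, i_r, energy_l, energy_r, gc=0.1, ge=0.1):
    """Blend the two views: linear averaging while both stimulus energies
    stay below the thresholds, otherwise an energy-ratio (gain-control)
    weight on the left view. Threshold values are illustrative."""
    weak = (energy_l < gc) & (energy_r < ge)
    w_l = energy_l / (energy_l + energy_r + 1e-8)        # nonlinear gain-control weight
    w_l = np.where(weak, 0.5, w_l)                       # linear weighting below threshold
    return w_l * i_l + (1.0 - w_l) * i_r
```

In the weak-stimulus regime the fused image is the plain average of the views, matching the linear branch described above.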
Image preprocessing
The normalization calculation process is shown in formula (5):
where I(x, y) denotes the pixel value at coordinate (x, y), μ(x, y) is the mean of the pixel values, σ(x, y) is the standard deviation of the pixel values, and ε is any positive number close to 0;
Convolutional neural network model
Based on the multi-scale feature extraction structure Inception and the residual network building block Block, a deep convolutional neural network model with both kinds of convolution kernel arrangements is built. The input of the model is the small blocks obtained after cutting, and the model contains 1 Inception structure, 1 convolutional layer, 3 Block structures, 1 pooling layer and 1 fully connected layer. Within one layer of the Inception structure of the network, convolution kernels of different sizes operate in parallel to extract image features at different scales, and 1×1 convolution kernels are introduced to reduce the network parameters and lower the computational complexity.
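The parallel multi-scale idea of the Inception structure can be illustrated with plain NumPy: several kernel sizes are run on the same input and the resulting feature maps are stacked. This is a conceptual sketch only, not the patent's network (no channels, no learned weights, and no 1×1 bottleneck for parameter reduction):

```python
import numpy as np

def conv2d_same(x, k):
    """'Same'-padded 2-D correlation of a single-channel image x with kernel k."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def inception_block(x, kernels):
    """Inception-style parallel branch: apply every kernel size to the same
    input and stack the feature maps, so one layer sees several scales."""
    return np.stack([conv2d_same(x, k) for k in kernels])
```

A 1×1 kernel branch passes the input through unchanged (scaled by its single weight), which is the shape-preserving property the real 1×1 bottleneck exploits.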
3. The stereo image quality evaluation method based on projection weight normalization according to claim 1, characterized in that:
(1) Projection weight normalization
In the optimization problem in which the network seeks the optimal solution, a constraint on each layer's weight matrix W is added:
min l(y, f(x; W))
where W = {wi, i = 1, 2, …, L} denotes the set of network weight matrices, whose elements are the weight matrices of layers 1 to L; l(y, f(x; W)) denotes the loss function, with y the desired output and f(x; W) the actual output; ddiag(·) denotes keeping the main diagonal elements of a matrix and setting all of its off-diagonal elements to 0.
The weight matrix of each layer is restricted to a subspace of the manifold space, that is, the weight matrix w of each layer satisfies the constraint
ddiag(wwT) = E (7)
The constraint is solved using Riemannian optimization theory, and the Riemannian gradient in the manifold space is obtained as in formula (8),
where the first term is the gradient obtained without the constraint. When the weight matrix ω of each neuron satisfies unit normalization, i.e. ωωT = 1, its Riemannian gradient follows from formula (8) as formula (9).
Compared with the original gradient, the Riemannian gradient is reduced by one term; this reduced norm is analyzed in formula (10).
To reduce the computational cost, the original gradient is used in the calculation, and the weights are updated using formula (11):
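A one-vector sketch of this update, under stated assumptions: per formulas (8)-(9), the Riemannian gradient on the unit sphere is the Euclidean gradient with its radial component removed, and projecting back onto the sphere after the step keeps ωωᵀ = 1; the learning rate is illustrative.

```python
import numpy as np

def pbwn_update(w, grad, lr=0.1):
    """One projection-based weight-normalization step (sketch of [20]):
    remove the radial component of the gradient (the Riemannian gradient
    of Eq. (8)), take a step, then project the weights back onto the unit
    sphere so the constraint w w^T = 1 still holds after the update."""
    riem_grad = grad - np.dot(grad, w) * w   # tangent-space projection
    w_new = w - lr * riem_grad
    return w_new / np.linalg.norm(w_new)     # projection step: unit-normalize
```

The tangent-space gradient is orthogonal to w, which is exactly the "reduced" component discussed around formula (10).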
(2) Batch normalization
The batch normalization method is shown in formula (12): during training, the mean μ and the variance σ² of each mini-batch of data are calculated and each feature xi is processed, giving the activation yi after batch normalization.
At test time, E[x] is estimated by the mean over all training batches, and Var[x] by the unbiased estimate of the variances of all training batches, as shown in formulas (13) and (14), where m is the size of each batch:
E[x] = EB[μB] (13)
Therefore, at the test stage, the batch normalization formula is as shown in formula (15); the parameters γ and β perform scaling and shifting, restoring the representational capacity of the model and improving the generalization performance of the network:
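A NumPy sketch of formulas (12) and (15): training-time normalization with the mini-batch statistics, and inference-time normalization with stored estimates E[x] and Var[x]; γ and β are the scale and shift parameters. For brevity this sketch reuses the biased batch variance at inference, whereas formula (14) uses the unbiased estimate over all training batches.

```python
import numpy as np

def batch_norm_train(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Eq. (12): standardize each feature over the mini-batch (rows are
    samples, columns are features), then scale by gamma and shift by beta.
    Also returns the batch statistics for use at inference."""
    mu = x.mean(axis=0)            # per-feature batch mean
    var = x.var(axis=0)            # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta, mu, var

def batch_norm_test(x, gamma, beta, e_x, var_x, eps=1e-5):
    """Eq. (15): at test time, normalize with the stored estimates E[x]
    and Var[x] instead of the current batch's statistics."""
    return gamma * (x - e_x) / np.sqrt(var_x + eps) + beta
```

Feeding the stored training statistics back into `batch_norm_test` on the same batch reproduces the training-time output, which is the consistency formulas (12)-(15) guarantee.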
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910580586.6A CN110458802A (en) | 2019-06-28 | 2019-06-28 | Based on the projection normalized stereo image quality evaluation method of weight |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110458802A true CN110458802A (en) | 2019-11-15 |
Family
ID=68481840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910580586.6A Pending CN110458802A (en) | 2019-06-28 | 2019-06-28 | Based on the projection normalized stereo image quality evaluation method of weight |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108389192A (en) * | 2018-02-11 | 2018-08-10 | 天津大学 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
CN108537777A (en) * | 2018-03-20 | 2018-09-14 | 西京学院 | A kind of crop disease recognition methods based on neural network |
CN108769671A (en) * | 2018-06-13 | 2018-11-06 | 天津大学 | Stereo image quality evaluation method based on adaptive blending image |
CN109360178A (en) * | 2018-10-17 | 2019-02-19 | 天津大学 | Based on blending image without reference stereo image quality evaluation method |
CN109671023A (en) * | 2019-01-24 | 2019-04-23 | 江苏大学 | A kind of secondary method for reconstructing of face image super-resolution |
CN109714592A (en) * | 2019-01-31 | 2019-05-03 | 天津大学 | Stereo image quality evaluation method based on binocular fusion network |
CN109902202A (en) * | 2019-01-08 | 2019-06-18 | 国家计算机网络与信息安全管理中心 | A kind of video classification methods and device |
Non-Patent Citations (2)
Title |
---|
LEI HUANG.ET AL: ""Projection Based Weight Normalization for Deep Neural Networks"", 《ARXIV》 * |
SERGEY IOFFE.ET AL: ""Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"", 《PROCEEDINGS OF THE 32ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111583377A (en) * | 2020-06-10 | 2020-08-25 | 江苏科技大学 | Volume rendering viewpoint evaluation and selection method for improving wind-driven optimization |
CN111583377B (en) * | 2020-06-10 | 2024-01-09 | 江苏科技大学 | Improved wind-driven optimized volume rendering viewpoint evaluation and selection method |
CN111915589A (en) * | 2020-07-31 | 2020-11-10 | 天津大学 | Stereo image quality evaluation method based on hole convolution |
CN112164056A (en) * | 2020-09-30 | 2021-01-01 | 南京信息工程大学 | No-reference stereo image quality evaluation method based on interactive convolution neural network |
CN112164056B (en) * | 2020-09-30 | 2023-08-29 | 南京信息工程大学 | No-reference stereoscopic image quality evaluation method based on interactive convolutional neural network |
CN112257709A (en) * | 2020-10-23 | 2021-01-22 | 北京云杉世界信息技术有限公司 | Signboard photo auditing method and device, electronic equipment and readable storage medium |
CN112257709B (en) * | 2020-10-23 | 2024-05-07 | 北京云杉世界信息技术有限公司 | Signboard photo auditing method and device, electronic equipment and readable storage medium |
CN113205503A (en) * | 2021-05-11 | 2021-08-03 | 宁波海上鲜信息技术股份有限公司 | Satellite coastal zone image quality evaluation method |
CN117269992A (en) * | 2023-08-29 | 2023-12-22 | 中国民航科学技术研究院 | Satellite navigation multipath signal detection method and system based on convolutional neural network |
CN117269992B (en) * | 2023-08-29 | 2024-04-19 | 中国民航科学技术研究院 | Satellite navigation multipath signal detection method and system based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191115 |