CN108389192A - Stereo-picture Comfort Evaluation method based on convolutional neural networks - Google Patents
- Publication number: CN108389192A (application CN201810143049.0A)
- Authority: CN (China)
- Legal status: Pending
Classifications
- G06T7/0002 — Inspection of images, e.g. flaw detection (G06T7/00 Image analysis)
- G06N3/045 — Combinations of networks (G06N3/04 Architecture, e.g. interconnection topology)
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent (G06N3/08 Learning methods)
- G06T2207/10012 — Stereo images (G06T2207/10 Image acquisition modality)
- G06T2207/20021 — Dividing image into blocks, subimages or windows
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30168 — Image quality inspection
Abstract
The invention belongs to the field of image processing and proposes a stereoscopic image comfort evaluation method based on convolutional neural networks that obtains an evaluation score for an image, better fits human subjective assessment of stereoscopic images, and thus predicts stereoscopic image comfort well. The technical solution adopted by the invention is: the information of the left view, right view and disparity map of a stereoscopic image is combined into a single image used as the network input; a convolutional neural network simulating human perception processes the resulting image and is trained to obtain weight coefficients; finally, human visual saliency properties are used to weight the per-block comfort values into an overall comfort value for the image. The invention is mainly applied to image processing.
Description
Technical field
The invention belongs to the field of image processing and relates to the objective evaluation, improvement and optimization of stereoscopic image comfort. Specifically, it relates to a stereoscopic image comfort evaluation method based on convolutional neural networks.
Background technology
During acquisition, compression, storage, transmission and display, stereoscopic images are subject to external interference that degrades them and seriously affects the viewing experience. Evaluating the comfort of stereoscopic images is therefore one of the main problems urgently to be solved in the current field of stereoscopic imaging.
Full-reference stereoscopic image quality evaluation occupies a very important position in objective stereoscopic image quality evaluation. Many excellent full-reference stereoscopic quality evaluation algorithms have been proposed; they can broadly be divided into two classes. The first class evaluates 3D image quality with state-of-the-art 2D quality metrics. Classical 2D image quality metrics mainly include the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), mean structural similarity (MSSIM), visual signal-to-noise ratio (VSNR) and visual information fidelity (VIF); these methods are applied to the stereoscopic image pair and the scores of the left and right images are averaged. The second class incorporates the disparity map or depth map of the stereoscopic image into the algorithm framework in various ways. For example, Benoit et al. [1] combined depth information with four 2D quality metrics and evaluated stereoscopic images by linearly combining left/right image quality with disparity-image quality, although the consistency of the results with subjective perception leaves room for improvement. You et al. [2] used three different methods to evaluate stereoscopic image quality through a nonlinear combination of left-image quality and disparity-image quality. Hewage et al. [3] designed a color-plus-depth stereoscopic video quality evaluation method by extracting edge information from the depth map together with the corresponding color information. Chen et al. [4] preprocessed the left and right images, synthesized them with the disparity map into a single image, and performed full-reference evaluation with the SSIM algorithm. Duan Fenfang et al. [5] extracted horizontal, vertical and viewpoint-direction gradient information and sensitive regions of the original and distorted images, constructed a three-dimensional structure tensor matrix for each pixel, and finally predicted quality from its eigenvalues and eigenvectors. These methods all build full-reference stereoscopic quality indices from a certain angle and have achieved progress, but the evaluation indices can still be improved further. Since 2011, deep learning networks, especially convolutional neural networks, have developed rapidly. A convolutional neural network (CNN) is an artificial neural network combining forward propagation and back-propagation that has matured in recent years; it is a typical deep learning method and has become a research hotspot in speech analysis and image recognition. The structure of a CNN is close to the human visual system: the visual system processes information hierarchically, combining low-level features into high-level features and gradually abstracting the representation, and a CNN likewise expresses features of the input data at different levels by stacking layers, with each internal output serving as the input of the next layer, so that the intrinsic relationships in the raw data are learned better. CNNs can select the needed features from images and achieve high accuracy in image classification, speech recognition and similar tasks. Reference [6] proposed a three-channel five-layer CNN that takes blocks of the left view, right view and difference map of a stereoscopic image as input, extracts stereoscopic features through convolution, and obtains the final comfort value by weighting in the fully connected layers. Stereoscopic image quality models based on convolutional neural networks can achieve high classification recognition rates. The development of stereoscopic image comfort evaluation algorithms is of great significance for the development of stereoscopic images.
Invention content
To overcome the deficiencies of the prior art, the present invention aims to propose a stereoscopic image comfort evaluation method based on convolutional neural networks that obtains an evaluation score for an image, better fits human subjective assessment of stereoscopic images, and can predict stereoscopic image comfort well. To this end, the technical solution adopted by the present invention is: in the stereoscopic image comfort evaluation method based on convolutional neural networks, the information of the left view, right view and disparity map of the stereoscopic image is combined into a single image used as the network input; a convolutional neural network simulating human perception processes the resulting image and is trained to obtain weight coefficients; finally, human visual saliency properties are used to weight the per-block comfort values into the comfort value of the whole image.
The convolutional neural network uses the ReLU activation function to solve the vanishing-gradient problem. The ReLU activation function is calculated as in formula (1):

f(x) = max(0, x)   (1)

where x is the input value of the function; ReLU simply retains the positive part of the result.
The convolutional neural network uses local response normalization (LRN, Local Response Normalization) to imitate the lateral inhibition of biological neurons, whereby an activated neuron suppresses its neighbors. The response-normalized activation of the local response normalization layer is given by formula (2):

b_{x,y}^i = a_{x,y}^i / ( k + α · Σ_{j = max(0, i−n/2)}^{min(N−1, i+n/2)} (a_{x,y}^j)^2 )^β   (2)

where the numerator a_{x,y}^i denotes the ReLU output of the i-th kernel at position (x, y); the denominator selects the n adjacent kernels and sums their squared convolution outputs at the same position; N is the total number of kernels; and k, n, α, β are constants whose values are set by experiment.
Using human visual saliency to weight the per-block comfort values into the comfort value of the whole image specifically uses an anisotropic Gaussian kernel function to simulate the center-bias (CB) factor, by which attention spreads from the center outward, as in formula (3):

CB(x, y) = exp( −( (x − x0)^2 / (2σh^2) + (y − y0)^2 / (2σv^2) ) )   (3)

CB(x, y) denotes the offset information of pixel (x, y) relative to the center point (x0, y0); (x0, y0) is the center coordinate of the distorted right view, (x, y) is the pixel coordinate, and σh and σv denote the standard deviations in the horizontal and vertical directions of the image, respectively.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) of the image, as in formula (4), where M and N are the height and width of the image and (x, y) is the pixel coordinate:

CBnormal(x, y) = CB(x, y) / Σ_{x=1}^{M} Σ_{y=1}^{N} CB(x, y)   (4)

The normalized weight matrix is divided into blocks in the same way as the original image, and summing within each block gives the block weight CBnormal(i). The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th block, and valueblock(i) is the comfort value of the i-th block predicted by the network. The comfort value of the whole image is then computed as in formula (5):

value = Σ_{i=1}^{T} CBnormal(i) · valueblock(i)   (5)
The features and advantageous effects of the present invention are:
The present invention proposes a no-reference stereoscopic image evaluation method that learns the network parameters from stereoscopic images through a convolutional neural network and uses human visual saliency to weight the per-block comfort values into the comfort value of the whole image. The information of the left view, right view and disparity map of the stereoscopic image is combined into a single image; the composite image is taken as the network input; human perception is simulated by processing the image with the convolutional neural network; and the imaging mechanism by which the human eye attends more to the central region of an image is exploited to obtain the final evaluation score of the image. The network of the invention is optimized with ReLU and LRN, and the results under different network parameters were tested in contrast experiments; the results show that excellent results are achieved with three fully connected layers. Compared with traditional image evaluation methods, the CNN-based stereoscopic image comfort evaluation method can extract the required features through the convolutional layers without hand-crafted feature values, although such a method cannot learn adequately when the training set is insufficient, which leads to poor results. Compared with traditional machine learning methods, the present invention combines the feature-extraction advantage of convolutional neural networks with the human visual attention mechanism; the experimental results show that the proposed method obtains better results and has great application value.
Description of the drawings:
Fig. 1 Basic structure of a convolutional neural network.
Fig. 2 Flow chart for obtaining the input image proposed by the present invention.
Fig. 3 Network structure of the present invention.
Specific implementation mode
Based on the hierarchical characteristics of human vision, the present invention proposes a stereoscopic image comfort evaluation method based on convolutional neural networks. The method automatically extracts image features: the information of the left view, right view and disparity map of the stereoscopic image is combined into a single image used as the network input; human perception is simulated by processing the resulting image with a convolutional neural network; and the network is trained to obtain the weight coefficients. Finally, human visual saliency is used to weight the per-block comfort values into the comfort value of the whole image. The proposed method obtains an evaluation score that better fits human subjective assessment of stereoscopic images. By using the ReLU (Rectified Linear Units) activation function and a local response normalization layer, the network converges faster. The experimental results show that the evaluation model established by the method of the invention can predict stereoscopic image comfort well.
A typical convolutional neural network is composed of a series of stages, as shown in Fig. 1. It usually consists of one or more groups of convolutional and pooling layers [7]. The convolutional layers are mainly used for feature extraction; their units are organized into feature maps, in which each unit is connected to a local patch of the previous layer's feature maps through a set of filter weights. This local weighted sum is then passed to a nonlinear activation function such as ReLU. Units in the same feature map share the same filter, while feature maps in different layers use different filters; different filters extract different features of the same image. The pooling layer aggregates the outputs of the convolutional layer over fixed-length windows, reducing the number of input nodes of the next layer; this lowers the computational complexity and increases the stability of the network. Max pooling, min pooling, average pooling and similar methods can be used. Finally, the fully connected layers integrate the pooling-layer outputs to obtain the final classification decision. This structure achieves rather good performance in image processing [8]. Compared with neural network structures such as deep neural networks, convolutional networks introduce three important concepts: local convolution, pooling and weight sharing [9].
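The convolution → activation → pooling stage described above can be sketched in a few lines of plain numpy. This is an illustrative toy, not the Caffe network actually used by the invention; the 6x6 input and the 2x2 vertical-edge filter are made-up values:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: retain only the positive part."""
    return np.maximum(0.0, x)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trims edges that do not fill a window."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

# One conv -> ReLU -> pool stage on a toy 6x6 "image block"
img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0], [1.0, -1.0]])  # simple vertical-edge filter
feat = max_pool(relu(conv2d(img, edge)))
print(feat.shape)  # (2, 2)
```

The same filter slides over the whole map (weight sharing), and pooling shrinks the 5x5 response to 2x2, reducing the input size of the next layer exactly as described above.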
The ReLU activation function is a nonlinear activation function [10]. Compared with conventional activation functions such as sigmoid or tanh, using ReLU not only solves the vanishing gradient but also accelerates network convergence. Sigmoid and tanh are saturating nonlinear functions: once the function value reaches a certain level, it changes very little, almost zero. Therefore, as traditional neural networks grow deeper, the error propagated back to the first layers through backpropagation becomes very small, producing the gradient vanishing problem. Using the ReLU activation function solves this problem.
The ReLU activation function is calculated as in formula (1):

f(x) = max(0, x)   (1)

where x is the input value of the function; ReLU simply retains the positive part of the result.
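The saturation argument can be checked numerically: the sigmoid derivative s(x)(1 − s(x)) shrinks toward zero as the input grows, while the ReLU derivative stays at 1 for any positive input. A small numeric illustration (the sample points are arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)); saturates for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

xs = np.array([0.0, 2.0, 5.0, 10.0])
print(sigmoid_grad(xs))  # shrinks toward 0 as x grows
print(relu_grad(xs))     # stays 1 for any positive input
```

Multiplying many near-zero sigmoid gradients layer by layer is what makes the backpropagated error vanish; the constant ReLU gradient avoids this.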
Local response normalization (Local Response Normalization, LRN) [11] imitates the lateral inhibition of biological neurons, whereby an activated neuron suppresses its neighbors. LRN borrows this idea of lateral inhibition to realize local suppression, creating a competition mechanism over the activity of local neurons: the local input region is normalized so that all variables have similar variance, which makes supervised learning algorithms faster and perform better, makes larger responses relatively larger still, and improves the generalization ability of the model. In particular, the method applies local response normalization to the data after max pooling and average pooling in the network to improve its performance; better results are obtained when LRN is used together with the ReLU activation function. The response-normalized activation of the LRN layer is given by formula (2):

b_{x,y}^i = a_{x,y}^i / ( k + α · Σ_{j = max(0, i−n/2)}^{min(N−1, i+n/2)} (a_{x,y}^j)^2 )^β   (2)

where the numerator a_{x,y}^i denotes the ReLU output of the i-th kernel at position (x, y); the denominator selects the n adjacent kernels and sums their squared convolution outputs at the same position; N is the total number of kernels; and k, n, α, β are constants set by experiment, here k = 2, n = 5, α = 10^−4 and β = 0.75.
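Formula (2) with the stated constants can be sketched directly. This is a one-position, across-channel illustration with made-up activations, not the Caffe LRN layer itself:

```python
import numpy as np

def lrn(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Local response normalization across channels at one spatial position.
    a: array of shape (N,) -- ReLU outputs of the N kernels at (x, y)."""
    N = a.shape[0]
    b = np.empty_like(a, dtype=float)
    for i in range(N):
        lo = max(0, i - n // 2)          # clamp the window at the first kernel
        hi = min(N - 1, i + n // 2)      # ...and at the last kernel
        b[i] = a[i] / (k + alpha * np.sum(a[lo:hi + 1] ** 2)) ** beta
    return b

acts = np.array([1.0, 4.0, 0.5, 3.0, 2.0, 0.0])  # made-up kernel outputs
print(lrn(acts))
```

With k = 2 and β = 0.75 the denominator is always greater than 1, so every positive activation is damped, and a kernel surrounded by strong neighbors is damped more, which is the competition mechanism described above.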
The human visual system exhibits a center-bias (Center Bias, CB) characteristic [12]: when observing an image, the eye always moves from the central region outward; the central region receives the highest attention, and attention gradually decreases toward the image border [13]. Therefore an anisotropic Gaussian kernel function [14] is used mathematically to simulate the center-bias factor by which attention spreads from the center outward, as in formula (3):

CB(x, y) = exp( −( (x − x0)^2 / (2σh^2) + (y − y0)^2 / (2σv^2) ) )   (3)

CB(x, y) denotes the offset information of pixel (x, y) relative to the center point (x0, y0); (x0, y0) is the coordinate of the image center, (x, y) is the pixel coordinate, σh is the standard deviation in the horizontal direction of the image and σv the standard deviation in the vertical direction.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) of the image, as in formula (4), where M and N are the height and width of the image and (x, y) is the pixel coordinate:

CBnormal(x, y) = CB(x, y) / Σ_{x=1}^{M} Σ_{y=1}^{N} CB(x, y)   (4)

The normalized weight matrix is divided into blocks in the same way as the original image, and summing within each block gives the block weight CBnormal(i). The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th block and valueblock(i) is the comfort value of the i-th block predicted by the network. The comfort value of the whole image is then computed as in formula (5):

value = Σ_{i=1}^{T} CBnormal(i) · valueblock(i)   (5)
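Formulas (3)-(5) can be sketched as follows. This is an illustrative implementation: the 128x128 size, the default σ values and the four block scores are made-up (the text sets σh and σv by experiment):

```python
import numpy as np

def center_bias_weights(M, N, sigma_h=None, sigma_v=None):
    """Anisotropic Gaussian center-bias map, normalized to sum to 1 (formulas 3-4).
    M, N: image height and width; default sigmas of 1/3 the size are assumptions."""
    sigma_h = sigma_h or N / 3.0
    sigma_v = sigma_v or M / 3.0
    y0, x0 = (M - 1) / 2.0, (N - 1) / 2.0   # image center
    y, x = np.mgrid[0:M, 0:N]
    cb = np.exp(-((x - x0) ** 2 / (2 * sigma_h ** 2)
                  + (y - y0) ** 2 / (2 * sigma_v ** 2)))
    return cb / cb.sum()

def weighted_comfort(cb_map, block_values, block=64):
    """Sum the normalized map over each block, then weight the per-block
    comfort scores into one overall value (formula 5)."""
    M, N = cb_map.shape
    weights = []
    for i in range(0, M, block):
        for j in range(0, N, block):
            weights.append(cb_map[i:i + block, j:j + block].sum())
    return float(np.dot(np.array(weights), block_values))

cb = center_bias_weights(128, 128)
scores = np.array([0.9, 0.8, 0.7, 0.6])  # made-up comfort of the 4 blocks
print(weighted_comfort(cb, scores, block=64))
```

Because the block weights sum to 1, the overall value is a convex combination of the per-block scores, with central blocks contributing most.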
The present invention is described in more detail below with reference to the accompanying drawings and examples.
One, Test method
The experiments use a server with an Intel Xeon E5-2637 v3 CPU at 3.5 GHz and 64 GB of RAM, with GPU parallel acceleration (Titan X, 12 GB of video memory); the network is trained with the Caffe deep learning framework.
The original stereoscopic images used for the experiments come from the MCL-3D database [15] (Media Communications Lab 3D, a public multimedia stereoscopic image library). The library contains 693 stereoscopic image pairs over nine scenes; the images have been degraded by different methods and to different degrees. Subjective assessment was then performed on the stereoscopic image library to obtain the quality score of each stereoscopic image.
To ensure the completeness of the training and test images, the images of the library were selected in the experiment: 648 stereoscopic pairs in total, of which 529 pairs are used as training images and 119 pairs as test images. As for comfort, humans perceive an image as either comfortable or uncomfortable, so the present invention performs binary classification of stereoscopic images into the two categories comfortable and uncomfortable. Fig. 2 gives samples of different stereoscopic images: the first row shows source image samples, i.e. undistorted images with the best stereoscopic effect; the second row shows distorted stereoscopic samples, with distortion types such as Gaussian white noise, Gaussian blur, JPEG compression, blur, etc.
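The split described above can be sketched as follows. The random seed and shuffling strategy are assumptions; the source only states the 529/119 counts:

```python
import random

# Hypothetical split mirroring the stated counts: 648 pairs -> 529 train / 119 test
pairs = list(range(648))          # indices standing in for the image pairs
random.seed(0)                    # assumed seed for reproducibility
random.shuffle(pairs)
train, test = pairs[:529], pairs[529:]
print(len(train), len(test))      # 529 119
```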
To verify the correctness and generality of the proposed CNN model, cross-database experiments were carried out in testing. The cross-test databases are LIVE 3D Phase-I and LIVE 3D Phase-II. LIVE 3D Phase-I applies the same distortion to the left and right viewpoint images; it covers 20 scenes and 5 distortion types, with 20 reference pairs and 365 distorted pairs. LIVE 3D Phase-II refines Phase-I: since in reality the distortion levels of the left and right viewpoint images are not necessarily identical, Phase-II applies distortions of different degrees to the left and right viewpoint images; it covers 8 scenes and 5 distortion types, with 8 reference pairs and 400 distorted pairs.
Two, Experimental procedure
(1) Process the original stereoscopic image: combine the information of the left viewpoint, right viewpoint and disparity map into the three channels of a color image, and divide the synthesized image into blocks;
(2) Input the image blocks and their corresponding labels into the network for training; the labels are fitted through the feature extraction of the convolutional layers and the weighting of the fully connected layers, finally yielding a network model with small loss;
(3) Take the test samples as the input of the network model and obtain the test comfort value of each block image; obtain the comfort value of the overall stereoscopic image by the saliency-weighting method; compare the overall comfort value with the original comfort label to obtain the performance of the trained network; and test network models with different parameters to analyze their performance.
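Step (1) can be sketched as follows. This is an illustrative sketch with random arrays standing in for the grayscale views; the 64x64 block size follows the network's input size:

```python
import numpy as np

def synthesize(left, right, disparity):
    """Stack the left view, right view and disparity map as the three
    channels of one composite image (step 1 of the procedure)."""
    return np.stack([left, right, disparity], axis=-1)

def to_blocks(img, size=64):
    """Cut the composite image into non-overlapping size x size blocks."""
    H, W, C = img.shape
    blocks = []
    for i in range(0, H - size + 1, size):
        for j in range(0, W - size + 1, size):
            blocks.append(img[i:i + size, j:j + size, :])
    return np.array(blocks)

H, W = 128, 192                       # made-up image size
left = np.random.rand(H, W)
right = np.random.rand(H, W)
disp = np.random.rand(H, W)
blocks = to_blocks(synthesize(left, right, disp), size=64)
print(blocks.shape)  # (6, 64, 64, 3)
```

Each 64x64x3 block, together with the comfort label of its source image, is what the network is trained on in step (2).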
Three, Experimental results and analysis
The first layer of the network of this method is the input layer, which receives stereoscopic image blocks; the second to fourth layers are convolutional layers that progressively extract the image features of the stereoscopic image; the fifth to seventh layers are fully connected layers that map the learned features to the label space of the samples through weights; the eighth layer is the network output, i.e. whether the picture is comfortable. The input image size of the network is 64x64; the parameters of each layer are listed in Table 1.
Table 2 shows the influence of different numbers of fully connected layers on performance. The experimental results show that more fully connected layers are not always better: the network parameters need to match the data set, and too many fully connected layers slow down training and degrade performance.
Table 2 Experimental results with different numbers of fully connected layers
Table 3 shows the experimental results of convolutional layers with different numbers of feature maps. The network performs best when the feature maps are chosen as Conv1-64, Conv2-64, Conv3-128, reaching a test accuracy of 97.48%. The results in Table 3 show that network performance is not proportional to the number of feature maps; the network parameters are related to the image set. In the experiment, 529 stereoscopic pairs are used to train the network, so the image features are limited: too many feature maps lead to over-learning of features and therefore to a decline in the results, and excessive network parameters multiply the computational complexity and the running time. Networks 8 and 9 in Table 3 exhibit this phenomenon: their numbers of feature maps increase, but the experimental results decline, which shows that for a small-sample data set, too many network parameters may degrade the results; moreover, for a small-sample data set, overly large network parameters make the network hard to converge and increase the difficulty of training.
Table 3 Experimental results with different numbers of feature maps
Table 4 Experimental results with different network depths
Table 4 shows the experimental results under different network depths, comparing the performance of networks of different lengths. The results show that a longer network is not necessarily better: the depth of the network needs to match the data set; a long network increases the time complexity and does not necessarily obtain good results on a small-sample data set.
Table 5 Experimental results with and without the local response normalization layer
Table 5 shows the influence of the presence or absence of the local response normalization layer on the experimental results. The results show that, under the same network structure, the LRN layer plays a positive role in improving test accuracy and promoting network convergence; adding the LRN layer improves the test accuracy by 1.22% in the present invention.
The Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SROCC) and the root mean square error (RMSE) are usually used as scales for image quality evaluation [18]. Larger SROCC and PLCC indicate better model performance; smaller RMSE indicates better performance.
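The three indices can be computed from scratch in numpy. This is a standard textbook implementation on made-up scores (SROCC is computed as the Pearson correlation of the ranks, which is valid here because the sample values are tie-free):

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    xm, ym = x - x.mean(), y - y.mean()
    return float(np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2)))

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks
    (exact only when there are no ties)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return plcc(rx, ry)

def rmse(x, y):
    """Root mean square error between predictions and subjective scores."""
    return float(np.sqrt(np.mean((x - y) ** 2)))

pred = np.array([0.2, 0.5, 0.6, 0.9])  # made-up predicted comfort values
subj = np.array([0.1, 0.4, 0.7, 0.8])  # made-up subjective scores
print(plcc(pred, subj), srocc(pred, subj), rmse(pred, subj))
```

Since the made-up predictions preserve the ordering of the subjective scores, SROCC is exactly 1 even though PLCC is below 1, illustrating why both correlations are reported.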
Table 6 compares the PLCC of the experimental results. The network of the invention gives excellent PLCC results on the MCL and LIVE 3D Phase-II databases, reaching 0.9433 and 0.9419 respectively; the LIVE 3D Phase-I result is not the best but is still competitive, reaching 0.9207.
Table 6 PLCC results on different databases
Table 7 SROCC results on different databases
Table 7 compares the SROCC of the experimental results. The proposed image preprocessing method and network obtain SROCC values with excellent effect: 0.9433 on MCL-3D, and 0.9577 and 0.9647 on LIVE 3D Phase-I and LIVE 3D Phase-II respectively.
Table 8 RMSE results on different databases
Table 8 compares the RMSE of the experimental results. The RMSE of the present invention is much smaller than the results in the literature, which is related to the preprocessing of the images. The present invention proposes a CNN-based stereoscopic image comfort evaluation method whose purpose is to simulate the human visual system and evaluate whether an image is comfortable; based on this purpose, the images of the experimental library are divided into the two classes comfortable and uncomfortable, and therefore the RMSE values of the present invention in testing are smaller than other experimental results in the literature. Moreover, the existing literature does not give the detailed image-processing procedures of its experiments, while different processing procedures have an important influence on the experimental results.
Table 9 SROCC of the evaluation methods on the MCL database
Table 9 gives the detailed experimental results for different distortion types on the MCL database.
Bibliography
[1] Alexandre Benoit, Patrick Le Callet, Patrizio Campisi, et al. Quality Assessment of Stereoscopic Images [J]. EURASIP Journal on Image and Video Processing, 2009, 2008(1): 1-13.
[2] You J, Xing L, Perkis A, et al. Perceptual Quality Assessment for Stereoscopic Images Based on 2D Image Quality Metrics and Disparity Analysis [C]//International Workshop on Video Processing and Quality Metrics for Consumer Electronics, 2010.
[3] Hewage C T E R, Martini M G. Reduced-reference quality metric for 3D depth map transmission [C]//3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video. IEEE, 2010: 1-4.
[4] Chen M J, Su C C, Kwon D K, et al. Full-reference quality assessment of stereopairs accounting for rivalry [J]. Signal Processing: Image Communication, 2013, 28(9): 1143-1155.
[5] Duan Fenfang, Shao Feng, Jiang Gangyi, Yu Mei, Li Fucui. Objective evaluation method of stereoscopic image quality based on three-dimensional structure tensor [J]. Journal of Optoelectronics·Laser, 2014, 25(01): 192-198.
[6] Qu Chen. Research and implementation of no-reference stereoscopic image quality assessment algorithms based on CNN [D]. Shandong University, 2016.
[7] Li Yandong, Hao Zongbo, Lei Hang. Survey of convolutional neural network research [J]. Journal of Computer Applications, 2016, 36(9): 2508-2515.
[8] Xu Ke. Research on the application of convolutional neural networks in image recognition [D]. Zhejiang University, 2012.
[9]Kim J,Zeng H,Ghadiyaram D,et al.Deep Convolutional Neural Models
for Picture-Quality Prediction:Challenges and Solutions to Data-Driven Image
Quality Assessment[J].IEEE Signal Processing Magazine,2017,34(6):130-141.
[10]Kang L,Li Y,Doermann D.Multitask Deep Learning for No-Reference
Image Quality Assessment[J].2015.
[11]Malki S,Spaanenburg L.A CNN-Specific Integrated Processor[J]
.Eurasip Journal on Advances in Signal Processing,2009,2009(1):1-14.
[12]Koch C,Ullman S.Shifts in Selective Visual Attention:Towards the
Underlying Neural Circuitry[M]//Matters of Intelligence.Springer Netherlands,
1987:219.
[13]Tseng P H,Carmi R,Cameron I G,et al.Quantifying center bias of
observers in free viewing of dynamic natural scenes.[J].Journal of Vision,
2009,9(7):4.
[14]Blackburn G G,Foody J M,Sprecher D L,et al.Cardiac rehabilitation
participation patterns in a large,tertiary care center:evidence for selection
bias.[J].Journal of Cardiopulmonary Rehabilitation,2000,20(3):189.
[15]Song R,Ko H,Kuo C C J.MCL-3D:a database for stereoscopic image
quality assessment using 2D-image-plus-depth source[J].Journal of Information
Science&Engineering,2015,31(5).
[16]Wang Z,Bovik A C,Sheikh H R,et al.Image quality assessment:from
error visibility to structural similarity[J].IEEE transactions on image
processing,2004,13(4):600-612
[17]Wang Z,Simoncelli E P,Bovik A C.Multiscale structural similarity
for image quality assessment[C]//Signals,Systems and Computers,
2004.Conference Record of the Thirty-Seventh Asilomar Conference on.IEEE,
2003,2:1398-1402.
[18]Larson E C,Chandler D M.Most apparent distortion:full-reference
image quality assessment and the role of strategy[J].Journal of Electronic
Imaging,2010,19(1):011006-011006-21.
[19]Chen M J,Su C C,Kwon D K,et al.Full-reference quality assessment
of stereopairs accounting for rivalry[J].Signal Processing:Image
Communication,2013,28(9):1143-1155.
[20]Shao F,Jiang G,Yu M,et al.Binocular energy response based quality
assessment of stereoscopic images[J].Digital Signal Processing,2014,29:45-53.
[21]Lin Y H,Wu J L.Quality assessment of stereoscopic 3D image
compression by binocular integration behaviors[J].IEEE transactions on Image
Processing,2014,23(4):1527-1542.
[22]Ma J,An P.Method to quality assessment of stereo images[C]//
Visual Communications and Image Processing(VCIP),2016.IEEE,2016:1-4.
Claims (4)
1. A stereo-image comfort evaluation method based on convolutional neural networks, characterized in that the information of the left view, right view and disparity map of a stereo image is combined into a single image that serves as the network input; a convolutional neural network simulates human perception to process the resulting image and is trained to obtain weight coefficients; finally, the human visual saliency property is used to weight the comfort values of the image blocks into the comfort value of the image as a whole.
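Claim 1 combines the left view, right view and disparity map into one network input. The claim does not fix the exact fusion; the sketch below assumes simple channel stacking of equally sized grayscale maps (the function name and layout are illustrative, not prescribed by the patent):

```python
import numpy as np

def fuse_views(left, right, disparity):
    """Stack grayscale left view, right view and disparity map into one
    3-channel image used as the network input (channel order is an
    assumption; the claim only requires combining the three maps)."""
    assert left.shape == right.shape == disparity.shape
    return np.stack([left, right, disparity], axis=-1)

h, w = 32, 48
fused = fuse_views(np.zeros((h, w)), np.ones((h, w)), np.full((h, w), 0.5))
print(fused.shape)  # one image carrying all three information channels
```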
2. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that the convolutional neural network uses the ReLU activation function to solve the vanishing-gradient problem; the ReLU activation function is calculated as in formula (1):

F(x)=max(0, x) (1)

where x is the input value of the function; the ReLU function simply retains the positive part of the result.
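Formula (1) can be expressed in one line; a minimal sketch of the ReLU activation of claim 2:

```python
import numpy as np

def relu(x):
    """ReLU activation of formula (1): F(x) = max(0, x).
    Negative inputs are zeroed; positive inputs pass through unchanged."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # negative entries become 0
```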
3. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that the convolutional neural network uses local response normalization (LRN, Local Response Normalization) to mimic the lateral inhibition effect of biological neurons; lateral inhibition means that an activated neuron suppresses its neighboring neurons. The LRN activation function $b_{x,y}^{i}$ is given by formula (2):

$$b_{x,y}^{i}=a_{x,y}^{i}\Big/\Big(k+\alpha\sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}\big(a_{x,y}^{j}\big)^{2}\Big)^{\beta}\quad(2)$$

where the numerator $a_{x,y}^{i}$ denotes the output of the ReLU activation function of the i-th kernel at position (x, y); the denominator selects the n neighboring convolution kernels and sums their squared responses at the same position; N is the total number of kernels; and k, n, α, β are constants whose values are set by experiment.
4. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that using the human visual saliency property to weight the image-block comfort values into the comfort value of the whole image specifically comprises: simulating, with an anisotropic Gaussian kernel function, the center-bias (CB) factor by which attention spreads outward from the image center, as in formula (3):

$$CB(x,y)=\exp\!\Big(-\Big(\frac{(x-x_{0})^{2}}{2\sigma_{h}^{2}}+\frac{(y-y_{0})^{2}}{2\sigma_{v}^{2}}\Big)\Big)\quad(3)$$

CB(x, y) denotes the offset information of pixel (x, y) with respect to the center point (x0, y0); (x0, y0) denotes the center-point coordinates of the distorted right view; (x, y) are pixel coordinates; σh and σv denote the standard deviations of the image in the horizontal and vertical directions, respectively.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) corresponding to the image, as in formula (4), where M and N are the length and width of the image and (x, y) is the pixel coordinate of the image:

$$CB_{normal}(x,y)=CB(x,y)\Big/\sum_{x=1}^{M}\sum_{y=1}^{N}CB(x,y)\quad(4)$$

The normalized weight matrix is partitioned into blocks in the same way as the original image, and summing within each block gives the block normalized weight CBnormal(i), as in formula (5):

$$CB_{normal}(i)=\sum_{(x,y)\in\text{block }i}CB_{normal}(x,y)\quad(5)$$

The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th image block and valueblock(i) is the comfort value of the i-th image block predicted by the network; the comfort value "value" of the whole image is then computed as in formula (6):

$$value=\sum_{i=1}^{T}CB_{normal}(i)\cdot value_{block}(i)\quad(6)$$
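The center-bias weighting of claim 4 can be sketched end to end: build the anisotropic Gaussian factor, normalize it into a weight matrix, sum it per block, and take the weighted sum of the per-block comfort predictions. Function names, image size and σ values below are illustrative assumptions:

```python
import numpy as np

def center_bias(h, w, sigma_h, sigma_v):
    """Anisotropic Gaussian center-bias factor CB(x, y) of formula (3),
    centred at the image midpoint."""
    y0, x0 = (h - 1) / 2.0, (w - 1) / 2.0
    y, x = np.mgrid[0:h, 0:w]
    return np.exp(-((x - x0) ** 2 / (2 * sigma_h ** 2)
                    + (y - y0) ** 2 / (2 * sigma_v ** 2)))

def weighted_comfort(block_values, cb, block_h, block_w):
    """Normalize CB into a weight matrix, sum it per image block, and
    return the weighted sum of the per-block comfort predictions
    (block_values is laid out in raster block order)."""
    cb = cb / cb.sum()                       # normalized weight matrix
    h, w = cb.shape
    weights = []
    for top in range(0, h, block_h):         # per-block weights
        for left in range(0, w, block_w):
            weights.append(cb[top:top + block_h, left:left + block_w].sum())
    weights = np.asarray(weights)
    return float(np.dot(weights, block_values))  # whole-image comfort value

cb = center_bias(8, 8, sigma_h=2.0, sigma_v=2.0)
score = weighted_comfort(np.full(4, 0.7), cb, 4, 4)
print(round(score, 4))  # uniform block scores give back 0.7
```

Because the block weights sum to one, central blocks dominate the score while the overall scale of the predictions is preserved.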
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810143049.0A CN108389192A (en) | 2018-02-11 | 2018-02-11 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108389192A true CN108389192A (en) | 2018-08-10 |
Family
ID=63068845
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810143049.0A Pending CN108389192A (en) | 2018-02-11 | 2018-02-11 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108389192A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160358321A1 (en) * | 2015-06-05 | 2016-12-08 | Sony Corporation | Full reference image quality assessment based on convolutional neural network |
CN105976351A (en) * | 2016-03-31 | 2016-09-28 | 天津大学 | Central offset based three-dimensional image quality evaluation method |
CN107633513A (en) * | 2017-09-18 | 2018-01-26 | 天津大学 | The measure of 3D rendering quality based on deep learning |
Non-Patent Citations (2)
Title |
---|
JIANG QIUPING et al.: "Objective evaluation method of stereoscopic image visual comfort based on visually important regions", Journal of Electronics & Information Technology * |
QU CHENFEI: "Research and implementation of a no-reference stereo image quality assessment algorithm based on CNN", China Masters' Theses Full-text Database, Information Science and Technology * |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109242831A (en) * | 2018-08-20 | 2019-01-18 | 百度在线网络技术(北京)有限公司 | Picture quality detection method, device, computer equipment and storage medium |
CN109360178A (en) * | 2018-10-17 | 2019-02-19 | 天津大学 | Based on blending image without reference stereo image quality evaluation method |
CN109360178B (en) * | 2018-10-17 | 2021-11-19 | 天津大学 | Fusion image-based non-reference stereo image quality evaluation method |
CN109523590A (en) * | 2018-10-22 | 2019-03-26 | 福州大学 | A kind of 3D rendering depth information visual comfort appraisal procedure based on sample |
CN109523590B (en) * | 2018-10-22 | 2021-05-18 | 福州大学 | 3D image depth information visual comfort evaluation method based on sample |
CN109831664A (en) * | 2019-01-15 | 2019-05-31 | 天津大学 | Fast Compression three-dimensional video quality evaluation method based on deep learning |
CN109977967A (en) * | 2019-03-06 | 2019-07-05 | 浙江科技学院 | The significant extracting method of stereo-picture vision based on parameter sharing deep learning network |
CN109977967B (en) * | 2019-03-06 | 2020-12-25 | 浙江科技学院 | Stereo image visual saliency extraction method based on parameter sharing deep learning network |
CN110070519A (en) * | 2019-03-13 | 2019-07-30 | 西安电子科技大学 | Stitching image measuring method, image mosaic system based on phase equalization |
CN110060236B (en) * | 2019-03-27 | 2023-08-11 | 天津大学 | Stereoscopic image quality evaluation method based on depth convolution neural network |
CN110060236A (en) * | 2019-03-27 | 2019-07-26 | 天津大学 | Stereo image quality evaluation method based on depth convolutional neural networks |
CN110458802A (en) * | 2019-06-28 | 2019-11-15 | 天津大学 | Based on the projection normalized stereo image quality evaluation method of weight |
CN111145150A (en) * | 2019-12-20 | 2020-05-12 | 中国科学院光电技术研究所 | Universal non-reference image quality evaluation method |
CN111145150B (en) * | 2019-12-20 | 2022-11-11 | 中国科学院光电技术研究所 | Universal non-reference image quality evaluation method |
CN111882516A (en) * | 2020-02-19 | 2020-11-03 | 南京信息工程大学 | Image quality evaluation method based on visual saliency and deep neural network |
CN111882516B (en) * | 2020-02-19 | 2023-07-07 | 南京信息工程大学 | Image quality evaluation method based on visual saliency and deep neural network |
CN111860691B (en) * | 2020-07-31 | 2022-06-14 | 福州大学 | Stereo video visual comfort degree classification method based on attention and recurrent neural network |
CN111860691A (en) * | 2020-07-31 | 2020-10-30 | 福州大学 | Professional stereoscopic video visual comfort degree classification method based on attention and recurrent neural network |
CN113205503A (en) * | 2021-05-11 | 2021-08-03 | 宁波海上鲜信息技术股份有限公司 | Satellite coastal zone image quality evaluation method |
CN117058132A (en) * | 2023-10-11 | 2023-11-14 | 天津大学 | Cultural relic illumination visual comfort quantitative evaluation method and system based on neural network |
CN117058132B (en) * | 2023-10-11 | 2024-01-23 | 天津大学 | Cultural relic illumination visual comfort quantitative evaluation method and system based on neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108389192A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
CN109815893B (en) | Color face image illumination domain normalization method based on cyclic generation countermeasure network | |
CN109559276B (en) | Image super-resolution reconstruction method based on quality evaluation and feature statistics | |
CN110555434B (en) | Method for detecting visual saliency of three-dimensional image through local contrast and global guidance | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN109360178B (en) | Fusion image-based non-reference stereo image quality evaluation method | |
CN107437092A (en) | The sorting algorithm of retina OCT image based on Three dimensional convolution neutral net | |
CN109711426A (en) | A kind of pathological picture sorter and method based on GAN and transfer learning | |
CN108235003B (en) | Three-dimensional video quality evaluation method based on 3D convolutional neural network | |
CN110516716A (en) | Non-reference picture quality appraisement method based on multiple-limb similarity network | |
Messai et al. | Adaboost neural network and cyclopean view for no-reference stereoscopic image quality assessment | |
Si et al. | A no-reference stereoscopic image quality assessment network based on binocular interaction and fusion mechanisms | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
Liu et al. | Learning hadamard-product-propagation for image dehazing and beyond | |
CN111882516B (en) | Image quality evaluation method based on visual saliency and deep neural network | |
CN115526891B (en) | Training method and related device for defect data set generation model | |
CN108259893B (en) | Virtual reality video quality evaluation method based on double-current convolutional neural network | |
CN113724354A (en) | Reference image color style-based gray level image coloring method | |
CN112052877A (en) | Image fine-grained classification method based on cascade enhanced network | |
CN112818849A (en) | Crowd density detection algorithm based on context attention convolutional neural network of counterstudy | |
CN114187261A (en) | Non-reference stereo image quality evaluation method based on multi-dimensional attention mechanism | |
CN107909565A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
CN108492275A (en) | Based on deep neural network without with reference to stereo image quality evaluation method | |
CN108377387A (en) | Virtual reality method for evaluating video quality based on 3D convolutional neural networks | |
CN110738645B (en) | 3D image quality detection method based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180810 |