CN108389192A - Stereo-picture Comfort Evaluation method based on convolutional neural networks - Google Patents
- Publication number: CN108389192A (application CN201810143049.0A)
- Authority: CN (China)
- Legal status: Pending
Classifications
- G06T7/0002 — Inspection of images, e.g. flaw detection (G06T7/00 Image analysis)
- G06N3/045 — Combinations of networks (G06N3/04 Architecture, e.g. interconnection topology)
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent (G06N3/08 Learning methods)
- G06T2207/10012 — Stereo images (G06T2207/10 Image acquisition modality)
- G06T2207/20021 — Dividing image into blocks, subimages or windows
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30168 — Image quality inspection
Abstract
The invention belongs to the field of image processing and proposes a stereoscopic image comfort evaluation method based on convolutional neural networks that obtains an evaluation score for an image, better fits human subjective assessment of stereoscopic images, and thus predicts stereoscopic image comfort well. The technical solution adopted by the invention is: the information of the left view, right view and disparity map of a stereoscopic image is combined into a single image used as the network input; a convolutional neural network simulating human perception processes the resulting image and is trained to obtain weight coefficients; finally, human visual saliency properties are used to weight the per-block comfort values into an overall comfort value for the image. The invention is mainly applied to image processing.
Description
Technical field
The invention belongs to the field of image processing and relates to the objective evaluation, improvement and optimization of stereoscopic image comfort. Specifically, it relates to a stereoscopic image comfort evaluation method based on convolutional neural networks.
Background technology
During acquisition, compression, storage, transmission and display, stereoscopic images are subject to external interference that degrades them and seriously affects the viewing experience. Evaluating the comfort of stereoscopic images is therefore one of the main problems urgently to be solved in the current field of stereoscopic imaging.
Full-reference stereoscopic image quality evaluation occupies a very important position in objective stereoscopic image quality evaluation. Many excellent full-reference stereoscopic quality evaluation algorithms have been proposed; they can broadly be divided into two classes. The first class evaluates 3D image quality with state-of-the-art 2D quality metrics. Classical 2D image quality metrics mainly include the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), mean structural similarity (MSSIM), visual signal-to-noise ratio (VSNR) and visual information fidelity (VIF); these methods are applied to the stereoscopic image pair and the scores of the left and right images are averaged. The second class incorporates the disparity map or depth map of the stereoscopic image into the algorithm framework in various ways. For example, Benoit et al. [1] combined depth information with four 2D quality metrics and evaluated stereoscopic images by linearly combining left/right image quality with disparity-image quality, although the consistency of the results with subjective perception leaves room for improvement. You et al. [2] used three different methods to evaluate stereoscopic image quality through a nonlinear combination of left-image quality and disparity-image quality. Hewage et al. [3] designed a color-plus-depth stereoscopic video quality evaluation method by extracting edge information from the depth map together with the corresponding color information. Chen et al. [4] preprocessed the left and right images, synthesized them with the disparity map into a single image, and performed full-reference evaluation with the SSIM algorithm. Duan Fenfang et al. [5] extracted horizontal, vertical and viewpoint-direction gradient information and sensitive regions of the original and distorted images, constructed a three-dimensional structure tensor matrix for each pixel, and finally predicted quality from its eigenvalues and eigenvectors. These methods all build full-reference stereoscopic quality indices from a certain angle and have achieved progress, but the evaluation indices can still be improved further. Since 2011, deep learning networks, especially convolutional neural networks, have developed rapidly. A convolutional neural network (CNN) is an artificial neural network combining forward propagation and back-propagation that has matured in recent years; it is a typical deep learning method and has become a research hotspot in speech analysis and image recognition. The structure of a CNN is close to the human visual system: the visual system processes information hierarchically, combining low-level features into high-level features and gradually abstracting the representation, and a CNN likewise expresses features of the input data at different levels by stacking layers, with each internal output serving as the input of the next layer, so that the intrinsic relationships in the raw data are learned better. CNNs can select the needed features from images and achieve high accuracy in image classification, speech recognition and similar tasks. Reference [6] proposed a three-channel five-layer CNN that takes blocks of the left view, right view and difference map of a stereoscopic image as input, extracts stereoscopic features through convolution, and obtains the final comfort value by weighting in the fully connected layers. Stereoscopic image quality models based on convolutional neural networks can achieve high classification recognition rates. The development of stereoscopic image comfort evaluation algorithms is of great significance for the development of stereoscopic images.
Invention content
To overcome the deficiencies of the prior art, the present invention aims to propose a stereoscopic image comfort evaluation method based on convolutional neural networks that obtains an evaluation score for an image, better fits human subjective assessment of stereoscopic images, and can predict stereoscopic image comfort well. To this end, the technical solution adopted by the present invention is: in the stereoscopic image comfort evaluation method based on convolutional neural networks, the information of the left view, right view and disparity map of the stereoscopic image is combined into a single image used as the network input; a convolutional neural network simulating human perception processes the resulting image and is trained to obtain weight coefficients; finally, human visual saliency properties are used to weight the per-block comfort values into the comfort value of the whole image.
The convolutional neural network uses the ReLU activation function to solve the vanishing-gradient problem. The ReLU activation function is calculated as in formula (1):

f(x) = max(0, x)   (1)

where x is the input value of the function; ReLU simply retains the positive part of the result.
The convolutional neural network uses local response normalization (LRN, Local Response Normalization) to imitate the lateral inhibition of biological neurons, whereby an activated neuron suppresses its neighbors. The response-normalized activation of the local response normalization layer is given by formula (2):

b_{x,y}^i = a_{x,y}^i / ( k + α · Σ_{j = max(0, i−n/2)}^{min(N−1, i+n/2)} (a_{x,y}^j)^2 )^β   (2)

where the numerator a_{x,y}^i denotes the ReLU output of the i-th kernel at position (x, y); the denominator selects the n adjacent kernels and sums their squared convolution outputs at the same position; N is the total number of kernels; and k, n, α, β are constants whose values are set by experiment.
Using human visual saliency to weight the per-block comfort values into the comfort value of the whole image specifically uses an anisotropic Gaussian kernel function to simulate the center-bias (CB) factor, by which attention spreads from the center outward, as in formula (3):

CB(x, y) = exp( −( (x − x0)^2 / (2σh^2) + (y − y0)^2 / (2σv^2) ) )   (3)

CB(x, y) denotes the offset information of pixel (x, y) relative to the center point (x0, y0); (x0, y0) is the center coordinate of the distorted right view, (x, y) is the pixel coordinate, and σh and σv denote the standard deviations in the horizontal and vertical directions of the image, respectively.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) of the image, as in formula (4), where M and N are the height and width of the image and (x, y) is the pixel coordinate:

CBnormal(x, y) = CB(x, y) / Σ_{x=1}^{M} Σ_{y=1}^{N} CB(x, y)   (4)

The normalized weight matrix is divided into blocks in the same way as the original image, and summing within each block gives the block weight CBnormal(i). The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th block, and valueblock(i) is the comfort value of the i-th block predicted by the network. The comfort value of the whole image is then computed as in formula (5):

value = Σ_{i=1}^{T} CBnormal(i) · valueblock(i)   (5)
The features and advantageous effects of the present invention are:
The present invention proposes a no-reference stereoscopic image evaluation method that learns the network parameters from stereoscopic images through a convolutional neural network and uses human visual saliency to weight the per-block comfort values into the comfort value of the whole image. The information of the left view, right view and disparity map of the stereoscopic image is combined into a single image; the composite image is taken as the network input; human perception is simulated by processing the image with the convolutional neural network; and the imaging mechanism by which the human eye attends more to the central region of an image is exploited to obtain the final evaluation score of the image. The network of the invention is optimized with ReLU and LRN, and the results under different network parameters were tested in contrast experiments; the results show that excellent results are achieved with three fully connected layers. Compared with traditional image evaluation methods, the CNN-based stereoscopic image comfort evaluation method can extract the required features through the convolutional layers without hand-crafted feature values, although such a method cannot learn adequately when the training set is insufficient, which leads to poor results. Compared with traditional machine learning methods, the present invention combines the feature-extraction advantage of convolutional neural networks with the human visual attention mechanism; the experimental results show that the proposed method obtains better results and has great application value.
Description of the drawings:
Fig. 1 Basic structure of a convolutional neural network.
Fig. 2 Flow chart for obtaining the input image proposed by the present invention.
Fig. 3 Network structure of the present invention.
Specific implementation mode
Based on the hierarchical characteristics of human vision, the present invention proposes a stereoscopic image comfort evaluation method based on convolutional neural networks. The method automatically extracts image features: the information of the left view, right view and disparity map of the stereoscopic image is combined into a single image used as the network input; human perception is simulated by processing the resulting image with a convolutional neural network; and the network is trained to obtain the weight coefficients. Finally, human visual saliency is used to weight the per-block comfort values into the comfort value of the whole image. The proposed method obtains an evaluation score that better fits human subjective assessment of stereoscopic images. By using the ReLU (Rectified Linear Units) activation function and a local response normalization layer, the network converges faster. The experimental results show that the evaluation model established by the method of the invention can predict stereoscopic image comfort well.
A typical convolutional neural network is composed of a series of stages, as shown in Fig. 1. It usually consists of one or more groups of convolutional and pooling layers [7]. The convolutional layers are mainly used for feature extraction; their units are organized into feature maps, in which each unit is connected to a local patch of the previous layer's feature maps through a set of filter weights. This local weighted sum is then passed to a nonlinear activation function such as ReLU. Units in the same feature map share the same filter, while feature maps in different layers use different filters; different filters extract different features of the same image. The pooling layer aggregates the outputs of the convolutional layer over fixed-length windows, reducing the number of input nodes of the next layer; this lowers the computational complexity and increases the stability of the network. Max pooling, min pooling, average pooling and similar methods can be used. Finally, the fully connected layers integrate the pooling-layer outputs to obtain the final classification decision. This structure achieves rather good performance in image processing [8]. Compared with neural network structures such as deep neural networks, convolutional networks introduce three important concepts: local convolution, pooling and weight sharing [9].
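The convolution → activation → pooling stage described above can be sketched in a few lines of plain numpy. This is an illustrative toy, not the Caffe network actually used by the invention; the 6x6 input and the 2x2 vertical-edge filter are made-up values:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: retain only the positive part."""
    return np.maximum(0.0, x)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trims edges that do not fill a window."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

# One conv -> ReLU -> pool stage on a toy 6x6 "image block"
img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0], [1.0, -1.0]])  # simple vertical-edge filter
feat = max_pool(relu(conv2d(img, edge)))
print(feat.shape)  # (2, 2)
```

The same filter slides over the whole map (weight sharing), and pooling shrinks the 5x5 response to 2x2, reducing the input size of the next layer exactly as described above.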
The ReLU activation function is a nonlinear activation function [10]. Compared with conventional activation functions such as sigmoid or tanh, using ReLU not only solves the vanishing gradient but also accelerates network convergence. Sigmoid and tanh are saturating nonlinear functions: once the function value reaches a certain level, it changes very little, almost zero. Therefore, as traditional neural networks grow deeper, the error propagated back to the first layers through backpropagation becomes very small, producing the gradient vanishing problem. Using the ReLU activation function solves this problem.
The ReLU activation function is calculated as in formula (1):

f(x) = max(0, x)   (1)

where x is the input value of the function; ReLU simply retains the positive part of the result.
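The saturation argument can be checked numerically: the sigmoid derivative s(x)(1 − s(x)) shrinks toward zero as the input grows, while the ReLU derivative stays at 1 for any positive input. A small numeric illustration (the sample points are arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)); saturates for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

xs = np.array([0.0, 2.0, 5.0, 10.0])
print(sigmoid_grad(xs))  # shrinks toward 0 as x grows
print(relu_grad(xs))     # stays 1 for any positive input
```

Multiplying many near-zero sigmoid gradients layer by layer is what makes the backpropagated error vanish; the constant ReLU gradient avoids this.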
Local response normalization (Local Response Normalization, LRN) [11] imitates the lateral inhibition of biological neurons, whereby an activated neuron suppresses its neighbors. LRN borrows this idea of lateral inhibition to realize local suppression, creating a competition mechanism over the activity of local neurons: the local input region is normalized so that all variables have similar variance, which makes supervised learning algorithms faster and perform better, makes larger responses relatively larger still, and improves the generalization ability of the model. In particular, the method applies local response normalization to the data after max pooling and average pooling in the network to improve its performance; better results are obtained when LRN is used together with the ReLU activation function. The response-normalized activation of the LRN layer is given by formula (2):

b_{x,y}^i = a_{x,y}^i / ( k + α · Σ_{j = max(0, i−n/2)}^{min(N−1, i+n/2)} (a_{x,y}^j)^2 )^β   (2)

where the numerator a_{x,y}^i denotes the ReLU output of the i-th kernel at position (x, y); the denominator selects the n adjacent kernels and sums their squared convolution outputs at the same position; N is the total number of kernels; and k, n, α, β are constants set by experiment, here k = 2, n = 5, α = 10^−4 and β = 0.75.
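Formula (2) with the stated constants can be sketched directly. This is a one-position, across-channel illustration with made-up activations, not the Caffe LRN layer itself:

```python
import numpy as np

def lrn(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Local response normalization across channels at one spatial position.
    a: array of shape (N,) -- ReLU outputs of the N kernels at (x, y)."""
    N = a.shape[0]
    b = np.empty_like(a, dtype=float)
    for i in range(N):
        lo = max(0, i - n // 2)          # clamp the window at the first kernel
        hi = min(N - 1, i + n // 2)      # ...and at the last kernel
        b[i] = a[i] / (k + alpha * np.sum(a[lo:hi + 1] ** 2)) ** beta
    return b

acts = np.array([1.0, 4.0, 0.5, 3.0, 2.0, 0.0])  # made-up kernel outputs
print(lrn(acts))
```

With k = 2 and β = 0.75 the denominator is always greater than 1, so every positive activation is damped, and a kernel surrounded by strong neighbors is damped more, which is the competition mechanism described above.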
The human visual system exhibits a center-bias (Center Bias, CB) characteristic [12]: when observing an image, the eye always moves from the central region outward; the central region receives the highest attention, and attention gradually decreases toward the image border [13]. Therefore an anisotropic Gaussian kernel function [14] is used mathematically to simulate the center-bias factor by which attention spreads from the center outward, as in formula (3):

CB(x, y) = exp( −( (x − x0)^2 / (2σh^2) + (y − y0)^2 / (2σv^2) ) )   (3)

CB(x, y) denotes the offset information of pixel (x, y) relative to the center point (x0, y0); (x0, y0) is the coordinate of the image center, (x, y) is the pixel coordinate, σh is the standard deviation in the horizontal direction of the image and σv the standard deviation in the vertical direction.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) of the image, as in formula (4), where M and N are the height and width of the image and (x, y) is the pixel coordinate:

CBnormal(x, y) = CB(x, y) / Σ_{x=1}^{M} Σ_{y=1}^{N} CB(x, y)   (4)

The normalized weight matrix is divided into blocks in the same way as the original image, and summing within each block gives the block weight CBnormal(i). The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th block and valueblock(i) is the comfort value of the i-th block predicted by the network. The comfort value of the whole image is then computed as in formula (5):

value = Σ_{i=1}^{T} CBnormal(i) · valueblock(i)   (5)
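Formulas (3)-(5) can be sketched as follows. This is an illustrative implementation: the 128x128 size, the default σ values and the four block scores are made-up (the text sets σh and σv by experiment):

```python
import numpy as np

def center_bias_weights(M, N, sigma_h=None, sigma_v=None):
    """Anisotropic Gaussian center-bias map, normalized to sum to 1 (formulas 3-4).
    M, N: image height and width; default sigmas of 1/3 the size are assumptions."""
    sigma_h = sigma_h or N / 3.0
    sigma_v = sigma_v or M / 3.0
    y0, x0 = (M - 1) / 2.0, (N - 1) / 2.0   # image center
    y, x = np.mgrid[0:M, 0:N]
    cb = np.exp(-((x - x0) ** 2 / (2 * sigma_h ** 2)
                  + (y - y0) ** 2 / (2 * sigma_v ** 2)))
    return cb / cb.sum()

def weighted_comfort(cb_map, block_values, block=64):
    """Sum the normalized map over each block, then weight the per-block
    comfort scores into one overall value (formula 5)."""
    M, N = cb_map.shape
    weights = []
    for i in range(0, M, block):
        for j in range(0, N, block):
            weights.append(cb_map[i:i + block, j:j + block].sum())
    return float(np.dot(np.array(weights), block_values))

cb = center_bias_weights(128, 128)
scores = np.array([0.9, 0.8, 0.7, 0.6])  # made-up comfort of the 4 blocks
print(weighted_comfort(cb, scores, block=64))
```

Because the block weights sum to 1, the overall value is a convex combination of the per-block scores, with central blocks contributing most.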
The present invention is described in more detail below with reference to the accompanying drawings and examples.
One, Test method
The experiments use a server with an Intel Xeon E5-2637 v3 CPU at 3.5 GHz and 64 GB of RAM, with GPU parallel acceleration (Titan X, 12 GB of video memory); the network is trained with the Caffe deep learning framework.
The original stereoscopic images used for the experiments come from the MCL-3D database [15] (Media Communications Lab 3D, a public multimedia stereoscopic image library). The library contains 693 stereoscopic image pairs over nine scenes; the images have been degraded by different methods and to different degrees. Subjective assessment was then performed on the stereoscopic image library to obtain the quality score of each stereoscopic image.
To ensure the completeness of the training and test images, the images of the library were selected in the experiment: 648 stereoscopic pairs in total, of which 529 pairs are used as training images and 119 pairs as test images. As for comfort, humans perceive an image as either comfortable or uncomfortable, so the present invention performs binary classification of stereoscopic images into the two categories comfortable and uncomfortable. Fig. 2 gives samples of different stereoscopic images: the first row shows source image samples, i.e. undistorted images with the best stereoscopic effect; the second row shows distorted stereoscopic samples, with distortion types such as Gaussian white noise, Gaussian blur, JPEG compression, blur, etc.
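The split described above can be sketched as follows. The random seed and shuffling strategy are assumptions; the source only states the 529/119 counts:

```python
import random

# Hypothetical split mirroring the stated counts: 648 pairs -> 529 train / 119 test
pairs = list(range(648))          # indices standing in for the image pairs
random.seed(0)                    # assumed seed for reproducibility
random.shuffle(pairs)
train, test = pairs[:529], pairs[529:]
print(len(train), len(test))      # 529 119
```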
To verify the correctness and generality of the proposed CNN model, cross-database experiments were carried out in testing. The cross-test databases are LIVE 3D Phase-I and LIVE 3D Phase-II. LIVE 3D Phase-I applies the same distortion to the left and right viewpoint images; it covers 20 scenes and 5 distortion types, with 20 reference pairs and 365 distorted pairs. LIVE 3D Phase-II refines Phase-I: since in reality the distortion levels of the left and right viewpoint images are not necessarily identical, Phase-II applies distortions of different degrees to the left and right viewpoint images; it covers 8 scenes and 5 distortion types, with 8 reference pairs and 400 distorted pairs.
Two, Experimental procedure
(1) Process the original stereoscopic image: combine the information of the left viewpoint, right viewpoint and disparity map into the three channels of a color image, and divide the synthesized image into blocks;
(2) Input the image blocks and their corresponding labels into the network for training; the labels are fitted through the feature extraction of the convolutional layers and the weighting of the fully connected layers, finally yielding a network model with small loss;
(3) Take the test samples as the input of the network model and obtain the test comfort value of each block image; obtain the comfort value of the overall stereoscopic image by the saliency-weighting method; compare the overall comfort value with the original comfort label to obtain the performance of the trained network; and test network models with different parameters to analyze their performance.
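Step (1) can be sketched as follows. This is an illustrative sketch with random arrays standing in for the grayscale views; the 64x64 block size follows the network's input size:

```python
import numpy as np

def synthesize(left, right, disparity):
    """Stack the left view, right view and disparity map as the three
    channels of one composite image (step 1 of the procedure)."""
    return np.stack([left, right, disparity], axis=-1)

def to_blocks(img, size=64):
    """Cut the composite image into non-overlapping size x size blocks."""
    H, W, C = img.shape
    blocks = []
    for i in range(0, H - size + 1, size):
        for j in range(0, W - size + 1, size):
            blocks.append(img[i:i + size, j:j + size, :])
    return np.array(blocks)

H, W = 128, 192                       # made-up image size
left = np.random.rand(H, W)
right = np.random.rand(H, W)
disp = np.random.rand(H, W)
blocks = to_blocks(synthesize(left, right, disp), size=64)
print(blocks.shape)  # (6, 64, 64, 3)
```

Each 64x64x3 block, together with the comfort label of its source image, is what the network is trained on in step (2).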
Three, Experimental results and analysis
The first layer of the network of this method is the input layer, which receives stereoscopic image blocks; the second to fourth layers are convolutional layers that progressively extract the image features of the stereoscopic image; the fifth to seventh layers are fully connected layers that map the learned features to the label space of the samples through weights; the eighth layer is the network output, i.e. whether the picture is comfortable. The input image size of the network is 64x64; the parameters of each layer are listed in Table 1.
Table 2 shows the influence of different numbers of fully connected layers on performance. The experimental results show that more fully connected layers are not always better: the network parameters need to match the data set, and too many fully connected layers slow down training and degrade performance.
Table 2 Experimental results with different numbers of fully connected layers
Table 3 shows the experimental results of convolutional layers with different numbers of feature maps. The network performs best when the feature maps are chosen as Conv1-64, Conv2-64, Conv3-128, reaching a test accuracy of 97.48%. The results in Table 3 show that network performance is not proportional to the number of feature maps; the network parameters are related to the image set. In the experiment, 529 stereoscopic pairs are used to train the network, so the image features are limited: too many feature maps lead to over-learning of features and therefore to a decline in the results, and excessive network parameters multiply the computational complexity and the running time. Networks 8 and 9 in Table 3 exhibit this phenomenon: their numbers of feature maps increase, but the experimental results decline, which shows that for a small-sample data set, too many network parameters may degrade the results; moreover, for a small-sample data set, overly large network parameters make the network hard to converge and increase the difficulty of training.
Table 3 Experimental results with different numbers of feature maps
Table 4 Experimental results with different network depths
Table 4 shows the experimental results under different network depths, comparing the performance of networks of different lengths. The results show that a longer network is not necessarily better: the depth of the network needs to match the data set; a long network increases the time complexity and does not necessarily obtain good results on a small-sample data set.
Table 5 Experimental results with and without the local response normalization layer
Table 5 shows the influence of the presence or absence of the local response normalization layer on the experimental results. The results show that, under the same network structure, the LRN layer plays a positive role in improving test accuracy and promoting network convergence; adding the LRN layer improves the test accuracy by 1.22% in the present invention.
The Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SROCC) and the root mean square error (RMSE) are usually used as scales for image quality evaluation [18]. Larger SROCC and PLCC indicate better model performance; smaller RMSE indicates better performance.
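The three indices can be computed from scratch in numpy. This is a standard textbook implementation on made-up scores (SROCC is computed as the Pearson correlation of the ranks, which is valid here because the sample values are tie-free):

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    xm, ym = x - x.mean(), y - y.mean()
    return float(np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2)))

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks
    (exact only when there are no ties)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return plcc(rx, ry)

def rmse(x, y):
    """Root mean square error between predictions and subjective scores."""
    return float(np.sqrt(np.mean((x - y) ** 2)))

pred = np.array([0.2, 0.5, 0.6, 0.9])  # made-up predicted comfort values
subj = np.array([0.1, 0.4, 0.7, 0.8])  # made-up subjective scores
print(plcc(pred, subj), srocc(pred, subj), rmse(pred, subj))
```

Since the made-up predictions preserve the ordering of the subjective scores, SROCC is exactly 1 even though PLCC is below 1, illustrating why both correlations are reported.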
Table 6 compares the PLCC of the experimental results. The network of the invention gives excellent PLCC results on the MCL and LIVE 3D Phase-II databases, reaching 0.9433 and 0.9419 respectively; the LIVE 3D Phase-I result is not the best but is still competitive, reaching 0.9207.
Table 6 PLCC results on different databases
Table 7 SROCC results on different databases
Table 7 compares the SROCC of the experimental results. The proposed image preprocessing method and network obtain SROCC values with excellent effect: 0.9433 on MCL-3D, and 0.9577 and 0.9647 on LIVE 3D Phase-I and LIVE 3D Phase-II respectively.
Table 8 RMSE results on different databases
Table 8 compares the RMSE of the experimental results. The RMSE of the present invention is much smaller than the results in the literature, which is related to the preprocessing of the images. The present invention proposes a CNN-based stereoscopic image comfort evaluation method whose purpose is to simulate the human visual system and evaluate whether an image is comfortable; based on this purpose, the images of the experimental library are divided into the two classes comfortable and uncomfortable, and therefore the RMSE values of the present invention in testing are smaller than other experimental results in the literature. Moreover, the existing literature does not give the detailed image-processing procedures of its experiments, while different processing procedures have an important influence on the experimental results.
Table 9 SROCC of the evaluation methods on the MCL database
Table 9 gives the detailed experimental results for different distortion types on the MCL database.
Bibliography
[1] Alexandre Benoit, Patrick Le Callet, Patrizio Campisi, et al. Quality Assessment of Stereoscopic Images [J]. EURASIP Journal on Image and Video Processing, 2009, 2008(1): 1-13.
[2] You J, Xing L, Perkis A, et al. Perceptual Quality Assessment for Stereoscopic Images Based on 2D Image Quality Metrics and Disparity Analysis [C]//International Workshop on Video Processing and Quality Metrics for Consumer Electronics, 2010.
[3] Hewage C T E R, Martini M G. Reduced-reference quality metric for 3D depth map transmission [C]//3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video. IEEE, 2010: 1-4.
[4] Chen M J, Su C C, Kwon D K, et al. Full-reference quality assessment of stereopairs accounting for rivalry [J]. Signal Processing: Image Communication, 2013, 28(9): 1143-1155.
[5] Duan Fenfang, Shao Feng, Jiang Gangyi, Yu Mei, Li Fucui. Objective evaluation method of stereoscopic image quality based on three-dimensional structure tensor [J]. Journal of Optoelectronics·Laser, 2014, 25(01): 192-198.
[6] Qu Chen. Research and implementation of no-reference stereoscopic image quality assessment algorithms based on CNN [D]. Shandong University, 2016.
[7] Li Yandong, Hao Zongbo, Lei Hang. Survey of convolutional neural network research [J]. Journal of Computer Applications, 2016, 36(9): 2508-2515.
[8] Xu Ke. Research on the application of convolutional neural networks in image recognition [D]. Zhejiang University, 2012.
[9]Kim J,Zeng H,Ghadiyaram D,et al.Deep Convolutional Neural Models
for Picture-Quality Prediction:Challenges and Solutions to Data-Driven Image
Quality Assessment[J].IEEE Signal Processing Magazine,2017,34(6):130-141.
[10]Kang L,Li Y,Doermann D.Multitask Deep Learning for No-Reference
Image Quality Assessment[J].2015.
[11]Malki S,Spaanenburg L.A CNN-Specific Integrated Processor[J]
.Eurasip Journal on Advances in Signal Processing,2009,2009(1):1-14.
[12]Koch C,Ullman S.Shifts in Selective Visual Attention:Towards the
Underlying Neural Circuitry[M]//Matters of Intelligence.Springer Netherlands,
1987:219.
[13]Tseng P H,Carmi R,Cameron I G,et al.Quantifying center bias of
observers in free viewing of dynamic natural scenes.[J].Journal of Vision,
2009,9(7):4.
[14]Blackburn G G,Foody J M,Sprecher D L,et al.Cardiac rehabilitation
participation patterns in a large,tertiary care center:evidence for selection
bias.[J].Journal of Cardiopulmonary Rehabilitation,2000,20(3):189.
[15]Song R,Ko H,Kuo C C J.MCL-3D:a database for stereoscopic image
quality assessment using 2D-image-plus-depth source[J].Journal of Information
Science&Engineering,2015,31(5).
[16]Wang Z,Bovik A C,Sheikh H R,et al.Image quality assessment:from
error visibility to structural similarity[J].IEEE transactions on image
processing,2004,13(4):600-612
[17]Wang Z,Simoncelli E P,Bovik A C.Multiscale structural similarity
for image quality assessment[C]//Signals,Systems and Computers,
2004.Conference Record of the Thirty-Seventh Asilomar Conference on.IEEE,
2003,2:1398-1402.
[18]Larson E C,Chandler D M.Most apparent distortion:full-reference
image quality assessment and the role of strategy[J].Journal of Electronic
Imaging,2010,19(1):011006-011006-21.
[19]Chen M J,Su C C,Kwon D K,et al.Full-reference quality assessment
of stereopairs accounting for rivalry[J].Signal Processing:Image
Communication,2013,28(9):1143-1155.
[20]Shao F,Jiang G,Yu M,et al.Binocular energy response based quality
assessment of stereoscopic images[J].Digital Signal Processing,2014,29:45-53.
[21]Lin Y H,Wu J L.Quality assessment of stereoscopic 3D image
compression by binocular integration behaviors[J].IEEE transactions on Image
Processing,2014,23(4):1527-1542.
[22]Ma J,An P.Method to quality assessment of stereo images[C]//
Visual Communications and Image Processing(VCIP),2016.IEEE,2016:1-4.
Claims (4)
1. A stereo-image comfort evaluation method based on convolutional neural networks, characterized in that the information of the left view, right view and disparity map of a stereo image is combined into a single image that serves as the network input; a convolutional neural network simulates human perception to process the resulting image and is trained to obtain weight coefficients; finally, the human visual saliency property is used to weight the comfort values of the image blocks into the comfort value of the image as a whole.
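Claim 1 combines the left view, right view and disparity map into one network input. The claim does not fix the exact fusion; the sketch below assumes simple channel stacking of equally sized grayscale maps (the function name and layout are illustrative, not prescribed by the patent):

```python
import numpy as np

def fuse_views(left, right, disparity):
    """Stack grayscale left view, right view and disparity map into one
    3-channel image used as the network input (channel order is an
    assumption; the claim only requires combining the three maps)."""
    assert left.shape == right.shape == disparity.shape
    return np.stack([left, right, disparity], axis=-1)

h, w = 32, 48
fused = fuse_views(np.zeros((h, w)), np.ones((h, w)), np.full((h, w), 0.5))
print(fused.shape)  # one image carrying all three information channels
```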
2. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that the convolutional neural network uses the ReLU activation function to solve the vanishing-gradient problem; the ReLU activation function is calculated as in formula (1):

F(x)=max(0, x) (1)

where x is the input value of the function; the ReLU function simply retains the positive part of the result.
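Formula (1) can be expressed in one line; a minimal sketch of the ReLU activation of claim 2:

```python
import numpy as np

def relu(x):
    """ReLU activation of formula (1): F(x) = max(0, x).
    Negative inputs are zeroed; positive inputs pass through unchanged."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # negative entries become 0
```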
3. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that the convolutional neural network uses local response normalization (LRN, Local Response Normalization) to mimic the lateral inhibition effect of biological neurons; lateral inhibition means that an activated neuron suppresses its neighboring neurons. The LRN activation function $b_{x,y}^{i}$ is given by formula (2):

$$b_{x,y}^{i}=a_{x,y}^{i}\Big/\Big(k+\alpha\sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}\big(a_{x,y}^{j}\big)^{2}\Big)^{\beta}\quad(2)$$

where the numerator $a_{x,y}^{i}$ denotes the output of the ReLU activation function of the i-th kernel at position (x, y); the denominator selects the n neighboring convolution kernels and sums their squared responses at the same position; N is the total number of kernels; and k, n, α, β are constants whose values are set by experiment.
4. The stereo-image comfort evaluation method based on convolutional neural networks according to claim 1, characterized in that using the human visual saliency property to weight the image-block comfort values into the comfort value of the whole image specifically comprises: simulating, with an anisotropic Gaussian kernel function, the center-bias (CB) factor by which attention spreads outward from the image center, as in formula (3):

$$CB(x,y)=\exp\!\Big(-\Big(\frac{(x-x_{0})^{2}}{2\sigma_{h}^{2}}+\frac{(y-y_{0})^{2}}{2\sigma_{v}^{2}}\Big)\Big)\quad(3)$$

CB(x, y) denotes the offset information of pixel (x, y) with respect to the center point (x0, y0); (x0, y0) denotes the center-point coordinates of the distorted right view; (x, y) are pixel coordinates; σh and σv denote the standard deviations of the image in the horizontal and vertical directions, respectively.

Normalizing CB(x, y) yields the weight matrix CBnormal(x, y) corresponding to the image, as in formula (4), where M and N are the length and width of the image and (x, y) is the pixel coordinate of the image:

$$CB_{normal}(x,y)=CB(x,y)\Big/\sum_{x=1}^{M}\sum_{y=1}^{N}CB(x,y)\quad(4)$$

The normalized weight matrix is partitioned into blocks in the same way as the original image, and summing within each block gives the block normalized weight CBnormal(i), as in formula (5):

$$CB_{normal}(i)=\sum_{(x,y)\in\text{block }i}CB_{normal}(x,y)\quad(5)$$

The whole image is divided into T image blocks; CBnormal(i) is the weight of the i-th image block and valueblock(i) is the comfort value of the i-th image block predicted by the network; the comfort value "value" of the whole image is then computed as in formula (6):

$$value=\sum_{i=1}^{T}CB_{normal}(i)\cdot value_{block}(i)\quad(6)$$
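The center-bias weighting of claim 4 can be sketched end to end: build the anisotropic Gaussian factor, normalize it into a weight matrix, sum it per block, and take the weighted sum of the per-block comfort predictions. Function names, image size and σ values below are illustrative assumptions:

```python
import numpy as np

def center_bias(h, w, sigma_h, sigma_v):
    """Anisotropic Gaussian center-bias factor CB(x, y) of formula (3),
    centred at the image midpoint."""
    y0, x0 = (h - 1) / 2.0, (w - 1) / 2.0
    y, x = np.mgrid[0:h, 0:w]
    return np.exp(-((x - x0) ** 2 / (2 * sigma_h ** 2)
                    + (y - y0) ** 2 / (2 * sigma_v ** 2)))

def weighted_comfort(block_values, cb, block_h, block_w):
    """Normalize CB into a weight matrix, sum it per image block, and
    return the weighted sum of the per-block comfort predictions
    (block_values is laid out in raster block order)."""
    cb = cb / cb.sum()                       # normalized weight matrix
    h, w = cb.shape
    weights = []
    for top in range(0, h, block_h):         # per-block weights
        for left in range(0, w, block_w):
            weights.append(cb[top:top + block_h, left:left + block_w].sum())
    weights = np.asarray(weights)
    return float(np.dot(weights, block_values))  # whole-image comfort value

cb = center_bias(8, 8, sigma_h=2.0, sigma_v=2.0)
score = weighted_comfort(np.full(4, 0.7), cb, 4, 4)
print(round(score, 4))  # uniform block scores give back 0.7
```

Because the block weights sum to one, central blocks dominate the score while the overall scale of the predictions is preserved.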
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810143049.0A CN108389192A (en) | 2018-02-11 | 2018-02-11 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108389192A true CN108389192A (en) | 2018-08-10 |
Family
ID=63068845
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810143049.0A Pending CN108389192A (en) | 2018-02-11 | 2018-02-11 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108389192A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160358321A1 (en) * | 2015-06-05 | 2016-12-08 | Sony Corporation | Full reference image quality assessment based on convolutional neural network |
CN105976351A (en) * | 2016-03-31 | 2016-09-28 | 天津大学 | Central offset based three-dimensional image quality evaluation method |
CN107633513A (en) * | 2017-09-18 | 2018-01-26 | 天津大学 | The measure of 3D rendering quality based on deep learning |
Non-Patent Citations (2)
Title |
---|
JIANG QIUPING et al.: "Objective evaluation method of stereoscopic image visual comfort based on visually important regions", Journal of Electronics & Information Technology * |
QU CHENFEI: "Research and implementation of a no-reference stereo image quality assessment algorithm based on CNN", China Masters' Theses Full-text Database, Information Science and Technology * |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109242831A (en) * | 2018-08-20 | 2019-01-18 | 百度在线网络技术(北京)有限公司 | Picture quality detection method, device, computer equipment and storage medium |
CN109360178A (en) * | 2018-10-17 | 2019-02-19 | 天津大学 | Based on blending image without reference stereo image quality evaluation method |
CN109360178B (en) * | 2018-10-17 | 2021-11-19 | 天津大学 | Fusion image-based non-reference stereo image quality evaluation method |
CN109523590A (en) * | 2018-10-22 | 2019-03-26 | 福州大学 | A kind of 3D rendering depth information visual comfort appraisal procedure based on sample |
CN109523590B (en) * | 2018-10-22 | 2021-05-18 | 福州大学 | 3D image depth information visual comfort evaluation method based on sample |
CN109831664A (en) * | 2019-01-15 | 2019-05-31 | 天津大学 | Fast Compression three-dimensional video quality evaluation method based on deep learning |
CN109977967A (en) * | 2019-03-06 | 2019-07-05 | 浙江科技学院 | The significant extracting method of stereo-picture vision based on parameter sharing deep learning network |
CN109977967B (en) * | 2019-03-06 | 2020-12-25 | 浙江科技学院 | Stereo image visual saliency extraction method based on parameter sharing deep learning network |
CN110070519A (en) * | 2019-03-13 | 2019-07-30 | 西安电子科技大学 | Stitching image measuring method, image mosaic system based on phase equalization |
CN110060236B (en) * | 2019-03-27 | 2023-08-11 | 天津大学 | Stereoscopic image quality evaluation method based on depth convolution neural network |
CN110060236A (en) * | 2019-03-27 | 2019-07-26 | 天津大学 | Stereo image quality evaluation method based on depth convolutional neural networks |
CN110458802A (en) * | 2019-06-28 | 2019-11-15 | 天津大学 | Based on the projection normalized stereo image quality evaluation method of weight |
CN111145150A (en) * | 2019-12-20 | 2020-05-12 | 中国科学院光电技术研究所 | Universal non-reference image quality evaluation method |
CN111145150B (en) * | 2019-12-20 | 2022-11-11 | 中国科学院光电技术研究所 | Universal non-reference image quality evaluation method |
CN111882516A (en) * | 2020-02-19 | 2020-11-03 | 南京信息工程大学 | Image quality evaluation method based on visual saliency and deep neural network |
CN111882516B (en) * | 2020-02-19 | 2023-07-07 | 南京信息工程大学 | Image quality evaluation method based on visual saliency and deep neural network |
CN111860691B (en) * | 2020-07-31 | 2022-06-14 | 福州大学 | Stereo video visual comfort degree classification method based on attention and recurrent neural network |
CN111860691A (en) * | 2020-07-31 | 2020-10-30 | 福州大学 | Professional stereoscopic video visual comfort degree classification method based on attention and recurrent neural network |
CN113205503A (en) * | 2021-05-11 | 2021-08-03 | 宁波海上鲜信息技术股份有限公司 | Satellite coastal zone image quality evaluation method |
CN117058132A (en) * | 2023-10-11 | 2023-11-14 | 天津大学 | Cultural relic illumination visual comfort quantitative evaluation method and system based on neural network |
CN117058132B (en) * | 2023-10-11 | 2024-01-23 | 天津大学 | Cultural relic illumination visual comfort quantitative evaluation method and system based on neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108389192A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
CN109815893B (en) | Color face image illumination domain normalization method based on cyclic generation countermeasure network | |
CN109559276B (en) | Image super-resolution reconstruction method based on quality evaluation and feature statistics | |
CN110555434B (en) | Method for detecting visual saliency of three-dimensional image through local contrast and global guidance | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN109360178B (en) | Fusion image-based non-reference stereo image quality evaluation method | |
CN107437092A (en) | The sorting algorithm of retina OCT image based on Three dimensional convolution neutral net | |
CN109711426A (en) | A kind of pathological picture sorter and method based on GAN and transfer learning | |
CN108235003B (en) | Three-dimensional video quality evaluation method based on 3D convolutional neural network | |
CN110516716A (en) | Non-reference picture quality appraisement method based on multiple-limb similarity network | |
Messai et al. | Adaboost neural network and cyclopean view for no-reference stereoscopic image quality assessment | |
Si et al. | A no-reference stereoscopic image quality assessment network based on binocular interaction and fusion mechanisms | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
Liu et al. | Learning hadamard-product-propagation for image dehazing and beyond | |
CN111882516B (en) | Image quality evaluation method based on visual saliency and deep neural network | |
CN115526891B (en) | Training method and related device for defect data set generation model | |
CN108259893B (en) | Virtual reality video quality evaluation method based on double-current convolutional neural network | |
CN113724354A (en) | Reference image color style-based gray level image coloring method | |
CN112052877A (en) | Image fine-grained classification method based on cascade enhanced network | |
CN112818849A (en) | Crowd density detection algorithm based on context attention convolutional neural network of counterstudy | |
CN114187261A (en) | Non-reference stereo image quality evaluation method based on multi-dimensional attention mechanism | |
CN107909565A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
CN108492275A (en) | Based on deep neural network without with reference to stereo image quality evaluation method | |
CN108377387A (en) | Virtual reality method for evaluating video quality based on 3D convolutional neural networks | |
CN110738645B (en) | 3D image quality detection method based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180810 |