CN107274378B - Image fuzzy type identification and parameter setting method based on fusion memory CNN - Google Patents

Image fuzzy type identification and parameter setting method based on fusion memory CNN

Info

Publication number
CN107274378B
CN107274378B (application CN201710609501.3A)
Authority
CN
China
Prior art keywords
convolution
image
network
memory
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710609501.3A
Other languages
Chinese (zh)
Other versions
CN107274378A (en)
Inventor
黄绿娥
鄢化彪
吴禄慎
陈华伟
袁小翠
朱根松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Buddhist Tzu Chi General Hospital
Original Assignee
Buddhist Tzu Chi General Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Buddhist Tzu Chi General Hospital filed Critical Buddhist Tzu Chi General Hospital
Priority to CN201710609501.3A priority Critical patent/CN107274378B/en
Publication of CN107274378A publication Critical patent/CN107274378A/en
Application granted granted Critical
Publication of CN107274378B publication Critical patent/CN107274378B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20056Discrete and fast Fourier transform, [DFT, FFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20088Trinocular vision calculations; trifocal tensor

Abstract

The invention relates to the field of blurred-image type identification and blur-parameter calculation, and in particular to an image blur type identification and parameter setting method based on a fused memory CNN (convolutional neural network). The method comprises the following steps: constructing the fused memory network architecture; setting the algorithms of each layer of the fused memory network; obtaining network parameters through network training; and identifying the blur type of an unknown image and setting its parameters. The invention overcomes the drawback that the networks used in existing blur recognition lack an independent memory function, and improves the efficiency of image blur type identification and parameter calculation.

Description

Image fuzzy type identification and parameter setting method based on fusion memory CNN
Technical Field
The invention relates to the field of blurred-image type identification and blur-parameter calculation, and in particular to an image blur type identification and parameter setting method based on a fused memory CNN (convolutional neural network).
Background
Today, with the rapid development of network and information technology, CCD and CMOS imagers have become mainstream sensors alongside pressure, current and other sensors, and are applied in many fields: aerial photography, unmanned vehicles, the now-popular face recognition and character recognition, industrial inspection cameras, and mobile phone cameras. However, owing to environmental and other factors, acquired images are often not sharp; in particular, images captured while the camera moves at high speed contain severe blur, which greatly complicates subsequent processing. Therefore, identifying the image blur type and solving for the blur parameters is of great significance for image processing.
Rudin et al. used a total-variation regularization method to identify the blur point spread function (Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms [J]. Physica D: Nonlinear Phenomena, 1992, 60(1-4): 259-268.); Schmidt et al. used a Bayesian model for blur-kernel estimation (Schmidt U, Schelten K, Roth S. Bayesian deblurring with integrated noise estimation [C]// Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011: 2625-2632.). Both approaches first establish a mathematical model of the problem and then solve for its parameters. Such model-based and parameter-based methods have been applied successfully in many settings, but most of the algorithms suit only a specific type of blur, cannot handle image degradation caused by combined blurs, and generalize poorly because practical image-processing conditions are largely not considered.
The emergence of AlphaGo renewed public awareness of artificial intelligence, and many researchers have since applied deep learning to image blur recognition. The reference (YAN R, SHAO L. Blind image blur estimation via deep learning [J]. IEEE Transactions on Image Processing, 2016.) estimates image blur with a deep network. The reference (XU L, REN J S J, LIU C, et al. Deep convolutional neural network for image deconvolution [C]. Advances in Neural Information Processing Systems, 2014: 1790-1798.) discloses a deep convolutional neural network for image deconvolution. The reference (SUN J, CAO W, XU Z, et al. Learning a convolutional neural network for non-uniform motion blur removal [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 769-777.) discloses a method using sparse priors: a CNN first predicts the motion-blur probability distribution of image patches, an MRF model then estimates the blur kernel of the whole image, and the image is finally restored; this approach cannot estimate a general non-uniform motion blur kernel. The reference (SCHULER C J, HIRSCH M, HARMELING S, et al. Learning to deblur [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1439-1451.) proposes a learned deblurring network. These deep-learning-based deblurring methods focus on using a neural network to solve for certain model parameters; the network structure itself is not improved, and existing networks lack an independent memory function.
Disclosure of Invention
The invention aims to provide an image blur type identification and parameter setting method fusing a memory CNN, which overcomes the drawback that the networks used in existing blur recognition lack an independent memory function and improves the efficiency of image blur type identification and parameter calculation.
The technical scheme of the invention is as follows: an image blur type identification and parameter setting method fusing a memory CNN comprises the following steps:
The first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects N1 convolution operators, each a convolution kernel of size s1 × t1, where s1 is the number of rows and t1 the number of columns of the kernel; the kernels consist of straight lines of different forms, discs of different sizes and rings of different forms, and each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects N2 convolution operators, each a convolution kernel of size s2 × t2, where s2 is the number of rows and t2 the number of columns of the kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects N3 convolution operators, each a convolution kernel of size s3 × t3, where s3 is the number of rows and t3 the number of columns of the kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects N4 convolution operators, each a convolution kernel of size s4 × t4, where s4 is the number of rows and t4 the number of columns of the kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects N5 convolution operators, each a convolution kernel of size s5 × t5, where s5 is the number of rows and t5 the number of columns of the kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of D units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
The second step is to set the algorithms of each layer of the fused memory network:
1) perform gray-scale processing on the input image: when the original input is a three-channel color image, convert it into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image, skip this step;
2) perform a two-dimensional Fourier transform on the image under test to convert it into a spectrogram, denoted image P0;
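As an illustration of steps 1) and 2), a minimal Python sketch follows. The patent does not give the exact vector-mapping weights or the spectrum scaling; standard luminance weights and a log-magnitude spectrum are assumed here:

```python
import numpy as np

def to_spectrogram(image):
    """Steps 1)-2): gray conversion (if needed) and 2-D Fourier spectrogram.

    The vector-mapping weights and spectrum scaling are assumptions:
    standard luminance weights and a log-magnitude spectrum are used.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim == 3:                                    # three-channel colour image
        img = img @ np.array([0.299, 0.587, 0.114])      # assumed gray mapping
    spectrum = np.fft.fftshift(np.fft.fft2(img))         # centre the low frequencies
    return np.log1p(np.abs(spectrum))                    # P0: log-magnitude spectrogram

p0 = to_spectrogram(np.random.rand(64, 64, 3))
```

A gray-scale input skips the mapping, as in step 1).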
3) convolve image P0 with each of the N1 convolution operators of the first convolutional layer; the computation is:

P1^(n1)[i, j] = Σ_{x=0..s1−1} Σ_{y=0..t1−1} w^(n1)[x, y] · P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y],   (1)

where P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y] is the gray value of image P0 at that pixel, w^(n1)[x, y] denotes the weight of the n1-th convolution operator at position [x, y], P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j], s1 is the number of rows and t1 the number of columns of the convolution operator, Δ1 is the convolution shift step, and n1 is the index of the convolution operator, with 1 ≤ n1 ≤ N1;
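The convolution of step 3) can be sketched as a plain strided valid convolution, a direct reading of the formula rather than an optimized implementation:

```python
import numpy as np

def conv_layer(p0, kernels, delta):
    """Strided valid convolution as in step 3).

    p0:      H×W spectrogram.
    kernels: list of s×t convolution operators w^(n).
    delta:   convolution shift step Δ.
    Returns one convolved map per operator (no padding, matching the
    [(i-1)Δ+1+x, (j-1)Δ+1+y] indexing of the patent formula).
    """
    s, t = kernels[0].shape
    H, W = p0.shape
    rows = (H - s) // delta + 1
    cols = (W - t) // delta + 1
    out = np.empty((len(kernels), rows, cols))
    for n, w in enumerate(kernels):
        for i in range(rows):
            for j in range(cols):
                patch = p0[i * delta:i * delta + s, j * delta:j * delta + t]
                out[n, i, j] = np.sum(w * patch)   # Σx Σy w[x,y]·P0[...]
    return out
```

Note that for a 1024 × 1024 input with 16 × 16 kernels and Δ1 = 2 this sketch yields 505 × 505 maps, while the embodiment reports 504 × 504, suggesting a slightly different boundary handling in the original.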
4) regularize the convolved image:

[Equation (2): regularization of the convolved output with attenuation coefficient ω — formula not reproduced here]

where the result is the regularized output and ω is the attenuation coefficient;
5) apply max-pooling to the regularized image:

[Equation (3): max-pooling of the regularized feature map — formula not reproduced here]
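Step 5) can be sketched as non-overlapping max-pooling. The pooled sizes reported in the embodiment (504 → 252) are consistent with a 2 × 2 window, which is assumed here:

```python
import numpy as np

def max_pool(a, k=2):
    """Non-overlapping k×k max-pooling (step 5).

    k = 2 is an assumption inferred from the embodiment's 504→252 sizes.
    Any ragged border that does not fill a full k×k block is dropped.
    """
    H, W = a.shape
    H, W = H - H % k, W - W % k            # drop any ragged border
    return a[:H, :W].reshape(H // k, k, W // k, k).max(axis=(1, 3))
```

Each output element is the maximum of one k × k block of the regularized map.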
6) compute the result of the second-layer convolution with the second-layer convolution kernels, following steps 3), 4) and 5);
7) compute the result of the third-layer convolution with the third-layer convolution kernels, following steps 3), 4) and 5);
8) perform similarity cluster analysis on the matrices obtained in step 7), retaining M1 third-level image features;
9) on the basis of step 8), compute the result of the fourth-layer convolution with the fourth-layer convolution kernels, following steps 3), 4) and 5);
10) perform similarity cluster analysis on the matrices obtained in step 9), retaining M2 fourth-level image features;
11) on the basis of step 10), compute the result of the fifth-layer convolution with the fifth-layer convolution kernels, following steps 3), 4) and 5);
12) perform similarity cluster analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, to obtain M3 distinct feature points;
13) input the feature values output in step 12) into the corresponding memory models, which generate the corresponding output information; the memory model is as follows:
the network structure has D independent memory cells; the network input x(t) is compared with the value stored in each memory cell, and the error at the cell k closest to the input is:
δk(t) = Min{|Ci(t) − x(t)|, i = 1, 2, …, D},   (4)
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized class-k information, and the memory coefficient βi(t) and memory information Ci(t) of each memory cell follow the selective memory update rule:
[Equations (5)-(6): update of βi(t) and Ci(t) on successful recognition — formulas not reproduced here]
when δk(t) is greater than the network recognition threshold ε, the input has not been recognized; the memory network then updates the most weakly memorized information according to the forgetting rule, i.e. the cell k with the lowest memory coefficient,
βk(t) = Min{βi(t), i = 1, 2, …, D},   (7)
is replaced by the current input information, the memory coefficient βi(t) and memory information Ci(t) of each cell following the selective memory update rule:
[Equations (8)-(9): forgetting-rule update of βi(t) and Ci(t) — formulas not reproduced here]
the network output is:
h(t+1) = Ck(t+1),   (10)
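A minimal sketch of this memory model follows. The nearest-cell match of equation (4), the threshold test against ε, the forgetting rule of equation (7) and the output of equation (10) follow the text; the exact update formulas (5)-(6) and (8)-(9) are not legible in the source, so the reinforcement and consolidation rules below are assumptions:

```python
class MemoryUnit:
    """D-cell selective-update memory model of step 13) (sketch).

    The β/C updates on recognition (reinforce coefficient, average the
    stored value toward the input) are assumed, not taken from the patent.
    """

    def __init__(self, D, eps):
        self.C = [0.0] * D        # memory information C_i(t)
        self.beta = [0.0] * D     # memory coefficients β_i(t)
        self.eps = eps            # recognition threshold ε

    def step(self, x):
        # Eq. (4): cell k with the smallest |C_i(t) - x(t)|.
        k = min(range(len(self.C)), key=lambda i: abs(self.C[i] - x))
        if abs(self.C[k] - x) <= self.eps:        # class k recognized
            self.beta[k] += 1.0                   # assumed reinforcement
            self.C[k] = 0.5 * (self.C[k] + x)     # assumed consolidation
        else:                                     # forgetting rule
            k = min(range(len(self.beta)), key=lambda i: self.beta[i])  # eq. (7)
            self.C[k] = x                         # overwrite weakest memory
            self.beta[k] = 1.0                    # assumed reset
        return self.C[k]                          # eq. (10): h(t+1) = C_k(t+1)
```

Repeated near-threshold inputs thus consolidate one cell, while novel inputs displace the least-remembered cell.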
14) take the elements output in step 13) as the input of a fully connected BP network; the node counts of the intermediate hidden layers decrease layer by layer, and the output layer has 5 nodes, whose meanings are: the 1st output node is the blur type, with defocus blur as 1, motion blur as 2 and Gaussian blur as 3; the 2nd output node is the defocus blur radius r, output as the computed radius value when the 1st node equals 1 and as 0 otherwise; the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) during parameter training the network provides a manual-judgment feedback function: when the user finds a recognition error, the error information is entered through a network correction interface, the network relearns automatically, and the system's weight matrices are updated;
The third step is to obtain the network parameters through network training:
after the network is constructed and its algorithms are set, learning and training are performed with 10,000 to 1,000,000 frames of known images and their known characteristics to obtain the network parameters;
The fourth step is unknown-image blur type identification and parameter setting:
image information acquired in actual production, or blurred image information to be identified, is input into the network; network computation outputs the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance, realizing image blur type identification and parameter setting.
Building on the memory CNN, the invention adopts a 5-layer variable-stride convolution for high-resolution images: when the resolution exceeds 200 × 200 pixels, stride control is used to accelerate convergence of the convolution; when the resolution is below 200 × 200 pixels, single-step convolution is used to preserve the number of features. To avoid an explosion of scale in the deeper layers of the convolutional network, cluster analysis reduces the number of output feature matrices per layer while preserving feature diversity and bounding the computational scale. The processing of the resulting detail features simulates the human memory process: features are memorized, consolidated and output through the deep memory network model. Finally, the blur type and parameter values are identified and computed by the BP network. The invention improves the efficiency of image blur type identification and parameter calculation.
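The variable-stride rule above can be sketched as follows; taking stride 2 for large images (matching the Δ1 = 2 of the embodiment) as the general choice is an assumption:

```python
def choose_stride(height, width, threshold=200):
    """Variable-stride rule: stride control above 200×200 pixels to
    accelerate convergence, single-step convolution below it.
    The concrete stride value 2 is assumed from the embodiment's Δ1 = 2.
    """
    return 2 if height > threshold and width > threshold else 1
```

For the 1024 × 1024 spectrogram of the embodiment this selects stride 2; a 128 × 128 input would be convolved with stride 1.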
Detailed Description
The invention provides an image blur type identification and parameter setting method fusing a memory CNN, which overcomes the drawback that the networks used in existing blur recognition lack a memory function.
The present invention will be described in detail with reference to an example (track surface defect detection).
The first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects 96 convolution operators, each a 16 × 16 convolution kernel comprising 72 straight lines of different forms, 8 discs of different sizes and 16 rings of different forms; each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects 256 convolution operators, each an 8 × 8 convolution kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects 256 convolution operators, each a 5 × 5 convolution kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of 10 units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
The second step is to set the algorithms of each layer of the fused memory network:
1) gray-scale processing is performed on the input image: when the original input is a three-channel color image it is converted into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image this step is skipped;
2) a two-dimensional Fourier transform is performed on the inspected image to convert it into a spectrogram, denoted image P0; the size of P0 is 1024 × 1024;
3) image P0 is convolved with each of the 96 convolution operators of the first convolutional layer of the network:

P1^(n1)[i, j] = Σ_{x=0..15} Σ_{y=0..15} w^(n1)[x, y] · P0[2i−1+x, 2j−1+y],

where P0[2i−1+x, 2j−1+y] is the gray value of image P0 at pixel [2i−1+x, 2j−1+y], w^(n1)[x, y] is the value of the convolution operator at position [x, y], and P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j]; after the first convolution layer, 96 convolved images of size 504 × 504 are output;
4) the convolved images are regularized:
[regularization formula — not reproduced here]
where the result is the regularized output;
5) max-pooling is performed on the regularized images:
[max-pooling formula — not reproduced here]
through max-pooling, 96 feature matrices of size 252 × 252 are output;
6) steps 3), 4) and 5) are repeated for the second-layer convolution with s2 = t2 = 8 and stride Δ2, yielding 24576 feature matrices of size 61 × 61;
7) steps 3), 4) and 5) are repeated for the third-layer convolution with s3 = t3 = 5 and Δ3 = 1, yielding 6.29 million feature matrices of size 28 × 28;
8) similarity cluster analysis is performed on the matrices obtained in step 7), retaining 100,000 image feature matrices;
9) steps 3), 4) and 5) are repeated for the fourth-layer convolution with s4 = t4 = 3 and Δ4 = 1, yielding 38.4 million feature matrices of size 12 × 12;
10) similarity cluster analysis is performed on the matrices obtained in step 9), retaining 10,000 image feature matrices;
11) steps 3), 4) and 5) are repeated for the fifth-layer convolution with s5 = t5 = 3 and Δ5 = 1, yielding 3.84 million feature matrices of size 4 × 4;
12) similarity cluster analysis is performed on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, yielding 1000 distinct feature points;
13) the feature values output in step 12) are input into the corresponding memory models; the network is provided with 1000 independent memory units with a memory depth of 10, and the corresponding output information is generated through the memory models; the memory model is as follows:
the network structure has 10 independent memory cells; the network input x(t) is compared with the value stored in each memory cell, and the error at the cell k closest to the input is:
δk(t) = Min{|Ci(t) − x(t)|, i = 1, 2, …, 10},   (13)
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized class-k information, and the memory coefficient βi(t) and memory information Ci(t) of each memory cell follow the selective memory update rule:
[Equation (14): update of βi(t) and Ci(t) on successful recognition — formula not reproduced here]
when δk(t) is greater than the network recognition threshold ε, the input has not been recognized; the memory network updates the most weakly memorized information according to the forgetting rule: the cell k with the lowest memory coefficient,
βk(t) = Min{βi(t), i = 1, 2, …, 10},   (15)
is replaced by the current input information, the memory coefficient βi(t) and memory information Ci(t) of each cell following the selective memory update rule:
[Equations (16)-(17): forgetting-rule update of βi(t) and Ci(t) — formulas not reproduced here]
the network output h(t+1) is:
h(t+1) = Ck(t+1),
14) the elements output in step 13) are taken as the input of a fully connected BP network, with 500 nodes in the first hidden layer, 50 nodes in the second hidden layer and 5 nodes in the output layer; the meanings of the output nodes are: the 1st output node is the blur type, with defocus blur as 1, motion blur as 2 and Gaussian blur as 3; the 2nd output node is the defocus blur radius r, output as the computed radius value when the 1st node equals 1 and as 0 otherwise; the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) during parameter training the network provides a manual-judgment feedback function: when the user finds a recognition error, the error information is entered through a network correction interface, the network relearns automatically, and the system's weight matrices are updated;
The third step is to obtain the network parameters through network training:
based on the Caltech 101 dataset and a database of acquired track-surface video images, 500,000 blurred images of different blur types and different parameters are constructed; with this blurred image set as input and the corresponding parameters as output, network training yields the network parameters;
The fourth step is unknown-image blur type identification and parameter setting:
1000 images from the Caltech 101 dataset are taken as test images; for each image the network computes the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance; comparing the network output with the known results gives an accuracy of 99.7%, realizing image blur type identification and parameter setting;
Meanwhile, the fused memory CNN is applied to the image deblurring stage of the track surface defect detection system: the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance of the inspected image are computed, and the resulting parameters are used for image deblurring; analysis of the results shows that the deblurring effect meets the requirements of the detection system.
The described embodiments are merely illustrative of the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments, or adopt alternatives, without departing from the spirit or scope of the invention as defined in the appended claims.

Claims (2)

1. An image blur type identification and parameter setting method fusing a memory CNN, characterized in that it comprises the following steps:
the first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects N1 convolution operators, each a convolution kernel of size s1 × t1, where s1 is the number of rows and t1 the number of columns of the kernel; the kernels consist of straight lines of different forms, discs of different sizes and rings of different forms, and each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects N2 convolution operators, each a convolution kernel of size s2 × t2, where s2 is the number of rows and t2 the number of columns of the kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects N3 convolution operators, each a convolution kernel of size s3 × t3, where s3 is the number of rows and t3 the number of columns of the kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects N4 convolution operators, each a convolution kernel of size s4 × t4, where s4 is the number of rows and t4 the number of columns of the kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects N5 convolution operators, each a convolution kernel of size s5 × t5, where s5 is the number of rows and t5 the number of columns of the kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of D units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
the second step is to set the algorithms of each layer of the fused memory network:
1) perform gray-scale processing on the input image: when the original input is a three-channel color image, convert it into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image, skip this step;
2) perform a two-dimensional Fourier transform on the image under test to convert it into a spectrogram, denoted image P0;
3) convolve image P0 with each of the N1 convolution operators of the first convolutional layer; the computation is:

P1^(n1)[i, j] = Σ_{x=0..s1−1} Σ_{y=0..t1−1} w^(n1)[x, y] · P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y],   (1)

where P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y] is the gray value of image P0 at that pixel, w^(n1)[x, y] denotes the weight of the n1-th convolution operator at position [x, y], P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j], s1 is the number of rows and t1 the number of columns of the convolution operator, Δ1 is the convolution shift step, and n1 is the index of the convolution operator, with 1 ≤ n1 ≤ N1;
4) regularize the convolved image:

[Equation (2): regularization of the convolved output with attenuation coefficient ω — formula not reproduced here]

where the result is the regularized output and ω is the attenuation coefficient;
5) apply max-pooling to the regularized image:

[Equation (3): max-pooling of the regularized feature map — formula not reproduced here]
6) compute the result of the second-layer convolution with the second-layer convolution kernels, following steps 3), 4) and 5);
7) compute the result of the third-layer convolution with the third-layer convolution kernels, following steps 3), 4) and 5);
8) perform similarity cluster analysis on the matrices obtained in step 7), retaining M1 third-level image features;
9) on the basis of step 8), compute the result of the fourth-layer convolution with the fourth-layer convolution kernels, following steps 3), 4) and 5);
10) perform similarity cluster analysis on the matrices obtained in step 9), retaining M2 fourth-level image features;
11) on the basis of step 10), compute the result of the fifth-layer convolution with the fifth-layer convolution kernels, following steps 3), 4) and 5);
12) perform similarity cluster analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, to obtain M3 distinct feature points;
13) inputting the characteristic values output in the step 12) into corresponding memory models, and generating corresponding output information through the memory models; the concrete memory model is as follows:
a network structure having D independent memory cells, wherein the network input x (t) is compared with the memory values of the memory cells, and the error at the cell k closest to the input is:
δk(t) = Min{ |Ci(t) − x(t)|, i = 1, 2, …, D },    (4),
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized the k-th class of information, and the selective memory update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are:
(update equations (5) and (6), shown only as images in the source)
when δk(t) is greater than the network recognition threshold ε, the input is not recognized by the network, and the memory network updates its worst-remembered information according to the forgetting rule: the unit k with the lowest memory coefficient is replaced by the current input; the update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are then:
βk(t)=Min{βi(t),i=1,2,…,D}, (7),
(update equations (8) and (9), shown only as images in the source)
the network output is:
h(t+1)=Ck(t+1), (10),
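The recognize-or-forget control flow of equations (4)–(10) can be sketched as below. Only the flow itself (match against the closest unit, reinforce on recognition, replace the lowest-β unit otherwise) is taken from the text; the concrete update rules (5), (6), (8) and (9) appear only as images in the source, so the running-average reinforcement and β bookkeeping used here, and the scalar memory values, are illustrative assumptions:

```python
import numpy as np

class MemoryBank:
    """One memory model with D independent units."""
    def __init__(self, D, eps):
        self.C = np.zeros(D)     # memory information C_i(t)
        self.beta = np.zeros(D)  # memory coefficients beta_i(t)
        self.eps = eps           # recognition threshold epsilon

    def step(self, x):
        errs = np.abs(self.C - x)
        k = int(np.argmin(errs))        # closest unit, as in eq. (4)
        if errs[k] <= self.eps:
            # recognized: reinforce unit k (the exact rules, eqs. (5)-(6),
            # are images in the source; a running average is assumed)
            self.beta[k] += 1.0
            self.C[k] += (x - self.C[k]) / self.beta[k]
        else:
            # not recognized: forget the weakest unit, as in eq. (7)
            k = int(np.argmin(self.beta))
            self.C[k] = x
            self.beta[k] = 1.0
        return self.C[k]                # output h(t+1) = C_k(t+1)

bank = MemoryBank(D=10, eps=0.1)
for x in (0.5, 0.52, 3.0):
    h = bank.step(x)
```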
14) taking the elements output in step 13) as the input of a fully connected BP network, with the number of nodes in the intermediate hidden layers decreasing layer by layer and 5 nodes in the output layer; the meaning of each output node is: the 1st output node is the blur type (defocus blur = 1, motion blur = 2, Gaussian blur = 3); the 2nd output node is the defocus blur radius r (when the 1st output node is 1, the output is the computed radius value, otherwise 0); the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) the network provides a manual-judgment feedback function during parameter training: when a user finds that the network has recognized incorrectly, the error information is entered through a network correction interface, the network automatically relearns, and the system weight matrices are updated;
thirdly, obtaining network parameters through network training:
after the network is constructed and the algorithms are set, learning and training are carried out with between 10,000 and 1,000,000 known images and their known features to obtain the network parameters;
fourthly, unknown image fuzzy type identification and parameter setting:
image information acquired in the actual production process, or blurred image information to be identified, is input into the network, and the network computes the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian blur noise variance, thereby realizing image blur type identification and parameter setting.
2. The image blur type identification and parameter setting method of the fusion memory CNN as claimed in claim 1, which is implemented as follows:
the first step is to construct a converged memory network architecture:
the CNN model is organized as a serial architecture of 5 convolution layers, 1 deep memory network and 1 BP network;
the first convolution layer selects 96 convolution operators, each a 16 × 16 convolution kernel; the kernels comprise 72 straight lines of different forms, 8 discs of different sizes and 16 circular rings of different forms, and each convolution kernel extracts a primary shape feature of the image sub-graph unit;
the second convolution layer selects 256 convolution operators, each an 8 × 8 convolution kernel; each convolution kernel extracts a second-level shape feature of the image sub-graph;
the third convolution layer selects 256 convolution operators, each a 5 × 5 convolution kernel; each convolution kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolution layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each convolution kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolution layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each convolution kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with a depth of 10 units and a selective updating rule;
the BP network adopts a 4-layer structure: an input layer, two hidden layers and an output layer;
secondly, setting algorithms of each layer of the fusion memory network:
1) performing grayscale processing on the input image: when the original input is a three-dimensional color image, it is converted into a corresponding grayscale image by a vector mapping method; if the input image is already grayscale, this step is skipped;
2) performing a two-dimensional Fourier transform on the detected image and converting it into a spectrogram, denoted image P0; P0 has size 1024 × 1024;
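Step 2) can be sketched with NumPy. The claim only specifies a two-dimensional Fourier transform, so the log-magnitude scaling and fftshift centering below are common spectrogram conventions assumed here:

```python
import numpy as np

def spectrogram(gray):
    """2-D FFT magnitude spectrum of a grayscale image (step 2).
    Centering and log scaling are assumptions; the claim only
    specifies a two-dimensional Fourier transform."""
    F = np.fft.fftshift(np.fft.fft2(gray))  # move DC term to the center
    return np.log1p(np.abs(F))              # compress the dynamic range

P0 = spectrogram(np.random.rand(1024, 1024))
print(P0.shape)  # (1024, 1024)
```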
3) convolving image P0 with each of the 96 convolution operators of the first convolution layer of the network; the calculation expression is:
P1(n1)[i, j] = Σ(x=1..s1) Σ(y=1..t1) P0[2i−1+x, 2j−1+y] · W(n1)[x, y],    (1),
where P0[2i−1+x, 2j−1+y] is the gray value of image P0 at pixel [2i−1+x, 2j−1+y], W(n1)[x, y] is the value of the convolution operator at position [x, y], and P1(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j]; after the first-layer convolution operation, 96 convolution images of size 504 × 504 are output;
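Assuming the convolution output size follows (input − kernel) / stride and each pooling step halves the map (flooring), the feature-map sizes quoted in the following steps (252, 61, 28, 12, 4) are reproduced exactly. The layer-2 stride of 2 is an inference, since its value is lost in this extraction:

```python
def conv_size(n, k, step):
    # output size of one convolution pass, matching 1024 -> 504
    return (n - k) // step

size = 1024
plan = [(16, 2), (8, 2), (5, 1), (3, 1), (3, 1)]  # (kernel, stride) per layer;
                                                  # the layer-2 stride is inferred
sizes = []
for k, step in plan:
    size = conv_size(size, k, step)  # convolution
    size = size // 2                 # 2x2 max pooling (assumed)
    sizes.append(size)
print(sizes)  # [252, 61, 28, 12, 4]
```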
4) regularizing the convolution image (the regularization formula appears only as an image in the source);
5) performing maximum-pooling calculation on the regularized output (the pooling formula appears only as an image in the source);
outputting 96 feature matrices of size 252 × 252 after maximum pooling;
6) repeating steps 3), 4) and 5) for the second-layer convolution operation, selecting parameters s2 = t2 = 8 and Δ2 = 2 (the stride value is inferred from the reported sizes), obtaining 24576 feature matrices of size 61 × 61;
7) repeating steps 3), 4) and 5) for the third-layer convolution operation, selecting parameters s3 = t3 = 5 and Δ3 = 1, obtaining 6.29 million feature matrices of size 28 × 28;
8) performing similarity clustering analysis on the matrices obtained in step 7), keeping 100,000 image feature matrices;
9) repeating steps 3), 4) and 5) for the fourth-layer convolution operation, selecting parameters s4 = t4 = 3 and Δ4 = 1, obtaining 38.4 million feature matrices of size 12 × 12;
10) performing similarity clustering analysis on the matrices obtained in step 9), keeping 10,000 image feature matrices;
11) repeating steps 3), 4) and 5) for the fifth-layer convolution operation, selecting parameters s5 = t5 = 3 and Δ5 = 1, obtaining 3.84 million feature matrices of size 4 × 4;
12) performing similarity clustering analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as a comprehensive feature, and obtaining 1000 different feature points;
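Step 12)'s comprehensive feature (the sum of all matrix elements) followed by a similarity merge can be sketched as below. The similarity-clustering criterion is not specified in the text, so the simple distance-based merge and its tol parameter are assumptions:

```python
import numpy as np

def comprehensive_features(mats, tol):
    """Reduce each matrix to the sum of its elements (step 12),
    then keep one representative per group of sums closer than tol
    (the actual clustering criterion is not given in the text)."""
    sums = sorted(float(m.sum()) for m in mats)
    reps = []
    for s in sums:
        if not reps or s - reps[-1] > tol:
            reps.append(s)  # start a new feature point
    return reps

# five 4x4 matrices; the near-duplicates collapse into one point each
mats = [np.full((4, 4), v) for v in (0.0, 0.001, 1.0, 1.0005, 5.0)]
print(comprehensive_features(mats, tol=0.1))  # [0.0, 16.0, 80.0]
```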
13) inputting the characteristic values output in the step 12) into corresponding memory models, wherein the network is provided with 1000 independent memory units with the memory depth of 10, and corresponding output information is generated through the memory models; the concrete memory model is as follows:
a network structure having 10 independent memory cells, wherein the network input x (t) is compared with the memory values of the memory cells, and the error at the cell k closest to the input is:
δk(t) = Min{ |Ci(t) − x(t)|, i = 1, 2, …, 10 },    (13),
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized the k-th class of information, and the selective memory update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are:
(update equations, shown only as images in the source)
when δk(t) is greater than the network recognition threshold ε, the input is not recognized by the network, and the memory network updates its worst-remembered information according to the forgetting rule: the unit k with the lowest memory coefficient βk(t) is replaced by the current input; the update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are then:
βk(t)=Min{βi(t),i=1,2,…,10}, (15),
(update equations, shown only as images in the source)
the network output h (t +1) is:
h(t+1)=Ck(t+1), (10),
14) taking the elements output in step 13) as the input of a fully connected BP network, with 500 nodes in the first hidden layer, 50 nodes in the second hidden layer, and 5 nodes in the output layer; the meaning of each output node is: the 1st output node is the blur type (defocus blur = 1, motion blur = 2, Gaussian blur = 3); the 2nd output node is the defocus blur radius r (when the 1st output node is 1, the output is the computed radius value, otherwise 0); the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
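The fully connected BP stage of step 14) (1000 memory outputs → 500 → 50 → 5 output nodes) can be sketched as a plain forward pass. Only the layer widths come from the text; the tanh hidden activations and the random initialization below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    # small random weights and zero biases (initialization assumed)
    return rng.standard_normal((n_in, n_out)) * 0.01, np.zeros(n_out)

# 1000 memory outputs -> 500 -> 50 -> 5 output nodes
W1, b1 = layer(1000, 500)
W2, b2 = layer(500, 50)
W3, b3 = layer(50, 5)

def forward(x):
    h1 = np.tanh(x @ W1 + b1)  # first hidden layer (tanh assumed)
    h2 = np.tanh(h1 @ W2 + b2) # second hidden layer
    # 5 outputs: [blur type, defocus radius r, motion length,
    #             motion direction angle, Gaussian noise variance]
    return h2 @ W3 + b3

y = forward(np.zeros(1000))
print(y.shape)  # (5,)
```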
15) the network provides a manual-judgment feedback function during parameter training: when a user finds that the network has recognized incorrectly, the error information is entered through a network correction interface, the network automatically relearns, and the system weight matrices are updated;
thirdly, obtaining network parameters through network training:
constructing 500,000 blurred images of different blur types and different parameters based on an image database, namely the Caltech 101 dataset and an acquired track-surface video image database; taking the blurred image set as input and the corresponding parameters as output, network training is performed to obtain the network parameters;
fourthly, unknown image fuzzy type identification and parameter setting:
taking 1000 images from the Caltech 101 dataset as test images, testing and computing the blur type, defocus blur radius, motion blur length, motion blur direction angle, and Gaussian blur noise variance output by the network for each image; comparing the network output with the known results gives an accuracy of 99.7%, realizing image blur type identification and parameter setting;
meanwhile, applying the fusion memory CNN to the image deblurring process of the track-surface defect detection system, computing the blur type, defocus blur radius, motion blur length, motion blur direction angle, and Gaussian blur noise variance of the detected image; applying these parameters to image deblurring, the deblurring effect obtained from the result analysis meets the requirements of the detection system.
CN201710609501.3A 2017-07-25 2017-07-25 Image fuzzy type identification and parameter setting method based on fusion memory CNN Active CN107274378B (en)

Publications (2)

Publication Number Publication Date
CN107274378A CN107274378A (en) 2017-10-20
CN107274378B true CN107274378B (en) 2020-04-03
