CN107274378B - Image fuzzy type identification and parameter setting method based on fusion memory CNN - Google Patents

Image fuzzy type identification and parameter setting method based on fusion memory CNN

Info

Publication number
CN107274378B
CN107274378B (application CN201710609501.3A)
Authority
CN
China
Prior art keywords
convolution
image
network
memory
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710609501.3A
Other languages
Chinese (zh)
Other versions
CN107274378A (en)
Inventor
黄绿娥
鄢化彪
吴禄慎
陈华伟
袁小翠
朱根松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Buddhist Tzu Chi General Hospital
Original Assignee
Buddhist Tzu Chi General Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Buddhist Tzu Chi General Hospital filed Critical Buddhist Tzu Chi General Hospital
Priority to CN201710609501.3A priority Critical patent/CN107274378B/en
Publication of CN107274378A publication Critical patent/CN107274378A/en
Application granted granted Critical
Publication of CN107274378B publication Critical patent/CN107274378B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20056Discrete and fast Fourier transform, [DFT, FFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20088Trinocular vision calculations; trifocal tensor

Abstract

The invention relates to the field of blurred-image type identification and blur-parameter calculation, and in particular to an image blur type identification and parameter setting method based on a fused memory CNN (convolutional neural network). The method comprises the following steps: constructing the fused memory network architecture; setting the algorithms of each layer of the fused memory network; obtaining network parameters through network training; and identifying the blur type of an unknown image and setting its parameters. The invention overcomes the drawback that the networks used in existing blur recognition lack an independent memory function, and improves the efficiency of image blur type identification and parameter calculation.

Description

Image fuzzy type identification and parameter setting method based on fusion memory CNN
Technical Field
The invention relates to the field of blurred-image type identification and blur-parameter calculation, and in particular to an image blur type identification and parameter setting method based on a fused memory CNN (convolutional neural network).
Background
Today, with the rapid development of network and information technology, CCD and CMOS imagers have become mainstream sensors alongside pressure, current and other sensors, and are applied in many fields: aerial photography, unmanned vehicles, the now-popular face recognition and character recognition, industrial inspection cameras, and mobile phone cameras. However, owing to environmental and other factors, acquired images are often not sharp; in particular, images captured while the camera moves at high speed contain severe blur, which greatly complicates subsequent processing. Therefore, identifying the image blur type and solving for the blur parameters is of great significance for image processing.
Rudin et al. used a total-variation regularization method to identify the blur point spread function (Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms [J]. Physica D: Nonlinear Phenomena, 1992, 60(1-4): 259-268.); Schmidt et al. used a Bayesian model for blur-kernel estimation (Schmidt U, Schelten K, Roth S. Bayesian deblurring with integrated noise estimation [C]// Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011: 2625-2632.). Both approaches first establish a mathematical model of the problem and then solve for its parameters. Such model-based and parameter-based methods have been applied successfully in many settings, but most of the algorithms suit only a specific type of blur, cannot handle image degradation caused by combined blurs, and generalize poorly because practical image-processing conditions are largely not considered.
The emergence of AlphaGo renewed public awareness of artificial intelligence, and many researchers have since applied deep learning to image blur recognition. The reference (YAN R, SHAO L. Blind image blur estimation via deep learning [J]. IEEE Transactions on Image Processing, 2016.) estimates image blur with a deep network. The reference (XU L, REN J S J, LIU C, et al. Deep convolutional neural network for image deconvolution [C]. Advances in Neural Information Processing Systems, 2014: 1790-1798.) discloses a deep convolutional neural network for image deconvolution. The reference (SUN J, CAO W, XU Z, et al. Learning a convolutional neural network for non-uniform motion blur removal [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 769-777.) discloses a method using sparse priors: a CNN first predicts the motion-blur probability distribution of image patches, an MRF model then estimates the blur kernel of the whole image, and the image is finally restored; this approach cannot estimate a general non-uniform motion blur kernel. The reference (SCHULER C J, HIRSCH M, HARMELING S, et al. Learning to deblur [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1439-1451.) proposes a learned deblurring network. These deep-learning-based deblurring methods focus on using a neural network to solve for certain model parameters; the network structure itself is not improved, and existing networks lack an independent memory function.
Disclosure of Invention
The invention aims to provide an image blur type identification and parameter setting method fusing a memory CNN, which overcomes the drawback that the networks used in existing blur recognition lack an independent memory function and improves the efficiency of image blur type identification and parameter calculation.
The technical scheme of the invention is as follows: an image blur type identification and parameter setting method fusing a memory CNN comprises the following steps:
The first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects N1 convolution operators, each a convolution kernel of size s1 × t1, where s1 is the number of rows and t1 the number of columns of the kernel; the kernels consist of straight lines of different forms, discs of different sizes and rings of different forms, and each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects N2 convolution operators, each a convolution kernel of size s2 × t2, where s2 is the number of rows and t2 the number of columns of the kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects N3 convolution operators, each a convolution kernel of size s3 × t3, where s3 is the number of rows and t3 the number of columns of the kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects N4 convolution operators, each a convolution kernel of size s4 × t4, where s4 is the number of rows and t4 the number of columns of the kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects N5 convolution operators, each a convolution kernel of size s5 × t5, where s5 is the number of rows and t5 the number of columns of the kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of D units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
The second step is to set the algorithms of each layer of the fused memory network:
1) perform gray-scale processing on the input image: when the original input is a three-channel color image, convert it into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image, skip this step;
2) perform a two-dimensional Fourier transform on the image under test to convert it into a spectrogram, denoted image P0;
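As an illustration of steps 1) and 2), a minimal Python sketch follows. The patent does not give the exact vector-mapping weights or the spectrum scaling; standard luminance weights and a log-magnitude spectrum are assumed here:

```python
import numpy as np

def to_spectrogram(image):
    """Steps 1)-2): gray conversion (if needed) and 2-D Fourier spectrogram.

    The vector-mapping weights and spectrum scaling are assumptions:
    standard luminance weights and a log-magnitude spectrum are used.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim == 3:                                    # three-channel colour image
        img = img @ np.array([0.299, 0.587, 0.114])      # assumed gray mapping
    spectrum = np.fft.fftshift(np.fft.fft2(img))         # centre the low frequencies
    return np.log1p(np.abs(spectrum))                    # P0: log-magnitude spectrogram

p0 = to_spectrogram(np.random.rand(64, 64, 3))
```

A gray-scale input skips the mapping, as in step 1).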
3) convolve image P0 with each of the N1 convolution operators of the first convolutional layer; the computation is:

P1^(n1)[i, j] = Σ_{x=0..s1−1} Σ_{y=0..t1−1} w^(n1)[x, y] · P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y],   (1)

where P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y] is the gray value of image P0 at that pixel, w^(n1)[x, y] denotes the weight of the n1-th convolution operator at position [x, y], P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j], s1 is the number of rows and t1 the number of columns of the convolution operator, Δ1 is the convolution shift step, and n1 is the index of the convolution operator, with 1 ≤ n1 ≤ N1;
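The convolution of step 3) can be sketched as a plain strided valid convolution, a direct reading of the formula rather than an optimized implementation:

```python
import numpy as np

def conv_layer(p0, kernels, delta):
    """Strided valid convolution as in step 3).

    p0:      H×W spectrogram.
    kernels: list of s×t convolution operators w^(n).
    delta:   convolution shift step Δ.
    Returns one convolved map per operator (no padding, matching the
    [(i-1)Δ+1+x, (j-1)Δ+1+y] indexing of the patent formula).
    """
    s, t = kernels[0].shape
    H, W = p0.shape
    rows = (H - s) // delta + 1
    cols = (W - t) // delta + 1
    out = np.empty((len(kernels), rows, cols))
    for n, w in enumerate(kernels):
        for i in range(rows):
            for j in range(cols):
                patch = p0[i * delta:i * delta + s, j * delta:j * delta + t]
                out[n, i, j] = np.sum(w * patch)   # Σx Σy w[x,y]·P0[...]
    return out
```

Note that for a 1024 × 1024 input with 16 × 16 kernels and Δ1 = 2 this sketch yields 505 × 505 maps, while the embodiment reports 504 × 504, suggesting a slightly different boundary handling in the original.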
4) regularize the convolved image:

[Equation (2): regularization of the convolved output with attenuation coefficient ω — formula not reproduced here]

where the result is the regularized output and ω is the attenuation coefficient;
5) apply max-pooling to the regularized image:

[Equation (3): max-pooling of the regularized feature map — formula not reproduced here]
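Step 5) can be sketched as non-overlapping max-pooling. The pooled sizes reported in the embodiment (504 → 252) are consistent with a 2 × 2 window, which is assumed here:

```python
import numpy as np

def max_pool(a, k=2):
    """Non-overlapping k×k max-pooling (step 5).

    k = 2 is an assumption inferred from the embodiment's 504→252 sizes.
    Any ragged border that does not fill a full k×k block is dropped.
    """
    H, W = a.shape
    H, W = H - H % k, W - W % k            # drop any ragged border
    return a[:H, :W].reshape(H // k, k, W // k, k).max(axis=(1, 3))
```

Each output element is the maximum of one k × k block of the regularized map.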
6) compute the result of the second-layer convolution with the second-layer convolution kernels, following steps 3), 4) and 5);
7) compute the result of the third-layer convolution with the third-layer convolution kernels, following steps 3), 4) and 5);
8) perform similarity cluster analysis on the matrices obtained in step 7), retaining M1 third-level image features;
9) on the basis of step 8), compute the result of the fourth-layer convolution with the fourth-layer convolution kernels, following steps 3), 4) and 5);
10) perform similarity cluster analysis on the matrices obtained in step 9), retaining M2 fourth-level image features;
11) on the basis of step 10), compute the result of the fifth-layer convolution with the fifth-layer convolution kernels, following steps 3), 4) and 5);
12) perform similarity cluster analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, to obtain M3 distinct feature points;
13) input the feature values output in step 12) into the corresponding memory models, which generate the corresponding output information; the memory model is as follows:
the network structure has D independent memory cells; the network input x(t) is compared with the value stored in each memory cell, and the error at the cell k closest to the input is:
δk(t) = Min{|Ci(t) − x(t)|, i = 1, 2, …, D},   (4)
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized class-k information, and the memory coefficient βi(t) and memory information Ci(t) of each memory cell follow the selective memory update rule:
[Equations (5)-(6): update of βi(t) and Ci(t) on successful recognition — formulas not reproduced here]
when δk(t) is greater than the network recognition threshold ε, the input has not been recognized; the memory network then updates the most weakly memorized information according to the forgetting rule, i.e. the cell k with the lowest memory coefficient,
βk(t) = Min{βi(t), i = 1, 2, …, D},   (7)
is replaced by the current input information, the memory coefficient βi(t) and memory information Ci(t) of each cell following the selective memory update rule:
[Equations (8)-(9): forgetting-rule update of βi(t) and Ci(t) — formulas not reproduced here]
the network output is:
h(t+1) = Ck(t+1),   (10)
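A minimal sketch of this memory model follows. The nearest-cell match of equation (4), the threshold test against ε, the forgetting rule of equation (7) and the output of equation (10) follow the text; the exact update formulas (5)-(6) and (8)-(9) are not legible in the source, so the reinforcement and consolidation rules below are assumptions:

```python
class MemoryUnit:
    """D-cell selective-update memory model of step 13) (sketch).

    The β/C updates on recognition (reinforce coefficient, average the
    stored value toward the input) are assumed, not taken from the patent.
    """

    def __init__(self, D, eps):
        self.C = [0.0] * D        # memory information C_i(t)
        self.beta = [0.0] * D     # memory coefficients β_i(t)
        self.eps = eps            # recognition threshold ε

    def step(self, x):
        # Eq. (4): cell k with the smallest |C_i(t) - x(t)|.
        k = min(range(len(self.C)), key=lambda i: abs(self.C[i] - x))
        if abs(self.C[k] - x) <= self.eps:        # class k recognized
            self.beta[k] += 1.0                   # assumed reinforcement
            self.C[k] = 0.5 * (self.C[k] + x)     # assumed consolidation
        else:                                     # forgetting rule
            k = min(range(len(self.beta)), key=lambda i: self.beta[i])  # eq. (7)
            self.C[k] = x                         # overwrite weakest memory
            self.beta[k] = 1.0                    # assumed reset
        return self.C[k]                          # eq. (10): h(t+1) = C_k(t+1)
```

Repeated near-threshold inputs thus consolidate one cell, while novel inputs displace the least-remembered cell.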
14) take the elements output in step 13) as the input of a fully connected BP network; the node counts of the intermediate hidden layers decrease layer by layer, and the output layer has 5 nodes, whose meanings are: the 1st output node is the blur type, with defocus blur as 1, motion blur as 2 and Gaussian blur as 3; the 2nd output node is the defocus blur radius r, output as the computed radius value when the 1st node equals 1 and as 0 otherwise; the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) during parameter training the network provides a manual-judgment feedback function: when the user finds a recognition error, the error information is entered through a network correction interface, the network relearns automatically, and the system's weight matrices are updated;
The third step is to obtain the network parameters through network training:
after the network is constructed and its algorithms are set, learning and training are performed with 10,000 to 1,000,000 frames of known images and their known characteristics to obtain the network parameters;
The fourth step is unknown-image blur type identification and parameter setting:
image information acquired in actual production, or blurred image information to be identified, is input into the network; network computation outputs the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance, realizing image blur type identification and parameter setting.
Building on the memory CNN, the invention adopts a 5-layer variable-stride convolution for high-resolution images: when the resolution exceeds 200 × 200 pixels, stride control is used to accelerate convergence of the convolution; when the resolution is below 200 × 200 pixels, single-step convolution is used to preserve the number of features. To avoid an explosion of scale in the deeper layers of the convolutional network, cluster analysis reduces the number of output feature matrices per layer while preserving feature diversity and bounding the computational scale. The processing of the resulting detail features simulates the human memory process: features are memorized, consolidated and output through the deep memory network model. Finally, the blur type and parameter values are identified and computed by the BP network. The invention improves the efficiency of image blur type identification and parameter calculation.
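The variable-stride rule above can be sketched as follows; taking stride 2 for large images (matching the Δ1 = 2 of the embodiment) as the general choice is an assumption:

```python
def choose_stride(height, width, threshold=200):
    """Variable-stride rule: stride control above 200×200 pixels to
    accelerate convergence, single-step convolution below it.
    The concrete stride value 2 is assumed from the embodiment's Δ1 = 2.
    """
    return 2 if height > threshold and width > threshold else 1
```

For the 1024 × 1024 spectrogram of the embodiment this selects stride 2; a 128 × 128 input would be convolved with stride 1.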
Detailed Description
The invention provides an image blur type identification and parameter setting method fusing a memory CNN, which overcomes the drawback that the networks used in existing blur recognition lack a memory function.
The present invention will be described in detail with reference to an example (track surface defect detection).
The first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects 96 convolution operators, each a 16 × 16 convolution kernel comprising 72 straight lines of different forms, 8 discs of different sizes and 16 rings of different forms; each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects 256 convolution operators, each an 8 × 8 convolution kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects 256 convolution operators, each a 5 × 5 convolution kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of 10 units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
The second step is to set the algorithms of each layer of the fused memory network:
1) gray-scale processing is performed on the input image: when the original input is a three-channel color image it is converted into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image this step is skipped;
2) a two-dimensional Fourier transform is performed on the inspected image to convert it into a spectrogram, denoted image P0; the size of P0 is 1024 × 1024;
3) image P0 is convolved with each of the 96 convolution operators of the first convolutional layer of the network:

P1^(n1)[i, j] = Σ_{x=0..15} Σ_{y=0..15} w^(n1)[x, y] · P0[2i−1+x, 2j−1+y],

where P0[2i−1+x, 2j−1+y] is the gray value of image P0 at pixel [2i−1+x, 2j−1+y], w^(n1)[x, y] is the value of the convolution operator at position [x, y], and P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j]; after the first convolution layer, 96 convolved images of size 504 × 504 are output;
4) the convolved images are regularized:
[regularization formula — not reproduced here]
where the result is the regularized output;
5) max-pooling is performed on the regularized images:
[max-pooling formula — not reproduced here]
through max-pooling, 96 feature matrices of size 252 × 252 are output;
6) steps 3), 4) and 5) are repeated for the second-layer convolution with s2 = t2 = 8 and stride Δ2, yielding 24576 feature matrices of size 61 × 61;
7) steps 3), 4) and 5) are repeated for the third-layer convolution with s3 = t3 = 5 and Δ3 = 1, yielding 6.29 million feature matrices of size 28 × 28;
8) similarity cluster analysis is performed on the matrices obtained in step 7), retaining 100,000 image feature matrices;
9) steps 3), 4) and 5) are repeated for the fourth-layer convolution with s4 = t4 = 3 and Δ4 = 1, yielding 38.4 million feature matrices of size 12 × 12;
10) similarity cluster analysis is performed on the matrices obtained in step 9), retaining 10,000 image feature matrices;
11) steps 3), 4) and 5) are repeated for the fifth-layer convolution with s5 = t5 = 3 and Δ5 = 1, yielding 3.84 million feature matrices of size 4 × 4;
12) similarity cluster analysis is performed on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, yielding 1000 distinct feature points;
13) the feature values output in step 12) are input into the corresponding memory models; the network is provided with 1000 independent memory units with a memory depth of 10, and the corresponding output information is generated through the memory models; the memory model is as follows:
the network structure has 10 independent memory cells; the network input x(t) is compared with the value stored in each memory cell, and the error at the cell k closest to the input is:
δk(t) = Min{|Ci(t) − x(t)|, i = 1, 2, …, 10},   (13)
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized class-k information, and the memory coefficient βi(t) and memory information Ci(t) of each memory cell follow the selective memory update rule:
[Equation (14): update of βi(t) and Ci(t) on successful recognition — formula not reproduced here]
when δk(t) is greater than the network recognition threshold ε, the input has not been recognized; the memory network updates the most weakly memorized information according to the forgetting rule: the cell k with the lowest memory coefficient,
βk(t) = Min{βi(t), i = 1, 2, …, 10},   (15)
is replaced by the current input information, the memory coefficient βi(t) and memory information Ci(t) of each cell following the selective memory update rule:
[Equations (16)-(17): forgetting-rule update of βi(t) and Ci(t) — formulas not reproduced here]
the network output h(t+1) is:
h(t+1) = Ck(t+1),
14) the elements output in step 13) are taken as the input of a fully connected BP network, with 500 nodes in the first hidden layer, 50 nodes in the second hidden layer and 5 nodes in the output layer; the meanings of the output nodes are: the 1st output node is the blur type, with defocus blur as 1, motion blur as 2 and Gaussian blur as 3; the 2nd output node is the defocus blur radius r, output as the computed radius value when the 1st node equals 1 and as 0 otherwise; the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) during parameter training the network provides a manual-judgment feedback function: when the user finds a recognition error, the error information is entered through a network correction interface, the network relearns automatically, and the system's weight matrices are updated;
The third step is to obtain the network parameters through network training:
based on the Caltech 101 dataset and a database of acquired track-surface video images, 500,000 blurred images of different blur types and different parameters are constructed; with this blurred image set as input and the corresponding parameters as output, network training yields the network parameters;
The fourth step is unknown-image blur type identification and parameter setting:
1000 images from the Caltech 101 dataset are taken as test images; for each image the network computes the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance; comparing the network output with the known results gives an accuracy of 99.7%, realizing image blur type identification and parameter setting;
Meanwhile, the fused memory CNN is applied to the image deblurring stage of the track surface defect detection system: the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian-blur noise variance of the inspected image are computed, and the resulting parameters are used for image deblurring; analysis of the results shows that the deblurring effect meets the requirements of the detection system.
The described embodiments are merely illustrative of the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments, or adopt alternatives, without departing from the spirit or scope of the invention as defined in the appended claims.

Claims (2)

1. An image blur type identification and parameter setting method fusing a memory CNN, characterized in that it comprises the following steps:
the first step is to construct the fused memory network architecture:
the CNN model is divided into a serial architecture of 5 convolutional layers, 1 deep memory network and 1 BP network;
the first convolutional layer selects N1 convolution operators, each a convolution kernel of size s1 × t1, where s1 is the number of rows and t1 the number of columns of the kernel; the kernels consist of straight lines of different forms, discs of different sizes and rings of different forms, and each kernel extracts a first-level shape feature of an image sub-graph unit;
the second convolutional layer selects N2 convolution operators, each a convolution kernel of size s2 × t2, where s2 is the number of rows and t2 the number of columns of the kernel; each kernel extracts a second-level shape feature of the image sub-graph;
the third convolutional layer selects N3 convolution operators, each a convolution kernel of size s3 × t3, where s3 is the number of rows and t3 the number of columns of the kernel; each kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolutional layer selects N4 convolution operators, each a convolution kernel of size s4 × t4, where s4 is the number of rows and t4 the number of columns of the kernel; each kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolutional layer selects N5 convolution operators, each a convolution kernel of size s5 × t5, where s5 is the number of rows and t5 the number of columns of the kernel; each kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with the depth of D units and adopts a selective updating rule;
the BP network adopts a 4-layer structure, an input layer, two hidden layers and an output layer;
the second step is to set the algorithms of each layer of the fused memory network:
1) perform gray-scale processing on the input image: when the original input is a three-channel color image, convert it into the corresponding gray-scale image by a vector mapping method; if the input is already a gray-scale image, skip this step;
2) perform a two-dimensional Fourier transform on the image under test to convert it into a spectrogram, denoted image P0;
3) convolve image P0 with each of the N1 convolution operators of the first convolutional layer; the computation is:

P1^(n1)[i, j] = Σ_{x=0..s1−1} Σ_{y=0..t1−1} w^(n1)[x, y] · P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y],   (1)

where P0[(i−1)Δ1+1+x, (j−1)Δ1+1+y] is the gray value of image P0 at that pixel, w^(n1)[x, y] denotes the weight of the n1-th convolution operator at position [x, y], P1^(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j], s1 is the number of rows and t1 the number of columns of the convolution operator, Δ1 is the convolution shift step, and n1 is the index of the convolution operator, with 1 ≤ n1 ≤ N1;
4) regularize the convolved image:

[Equation (2): regularization of the convolved output with attenuation coefficient ω — formula not reproduced here]

where the result is the regularized output and ω is the attenuation coefficient;
5) apply max-pooling to the regularized image:

[Equation (3): max-pooling of the regularized feature map — formula not reproduced here]
6) compute the result of the second-layer convolution with the second-layer convolution kernels, following steps 3), 4) and 5);
7) compute the result of the third-layer convolution with the third-layer convolution kernels, following steps 3), 4) and 5);
8) perform similarity cluster analysis on the matrices obtained in step 7), retaining M1 third-level image features;
9) on the basis of step 8), compute the result of the fourth-layer convolution with the fourth-layer convolution kernels, following steps 3), 4) and 5);
10) perform similarity cluster analysis on the matrices obtained in step 9), retaining M2 fourth-level image features;
11) on the basis of step 10), compute the result of the fifth-layer convolution with the fifth-layer convolution kernels, following steps 3), 4) and 5);
12) perform similarity cluster analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as its composite feature, to obtain M3 distinct feature points;
13) inputting the characteristic values output in the step 12) into corresponding memory models, and generating corresponding output information through the memory models; the concrete memory model is as follows:
a network structure having D independent memory cells, wherein the network input x (t) is compared with the memory values of the memory cells, and the error at the cell k closest to the input is:
δk(t) = Min{ |Ci(t) − x(t)|, i = 1, 2, …, D },    (4),
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized the k-th class of information, and the selective memory update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are:
(update equations (5) and (6), shown only as images in the source)
when δk(t) is greater than the network recognition threshold ε, the input is not recognized by the network, and the memory network updates its worst-remembered information according to the forgetting rule: the unit k with the lowest memory coefficient is replaced by the current input; the update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are then:
βk(t)=Min{βi(t),i=1,2,…,D}, (7),
(update equations (8) and (9), shown only as images in the source)
the network output is:
h(t+1)=Ck(t+1), (10),
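The recognize-or-forget control flow of equations (4)–(10) can be sketched as below. Only the flow itself (match against the closest unit, reinforce on recognition, replace the lowest-β unit otherwise) is taken from the text; the concrete update rules (5), (6), (8) and (9) appear only as images in the source, so the running-average reinforcement and β bookkeeping used here, and the scalar memory values, are illustrative assumptions:

```python
import numpy as np

class MemoryBank:
    """One memory model with D independent units."""
    def __init__(self, D, eps):
        self.C = np.zeros(D)     # memory information C_i(t)
        self.beta = np.zeros(D)  # memory coefficients beta_i(t)
        self.eps = eps           # recognition threshold epsilon

    def step(self, x):
        errs = np.abs(self.C - x)
        k = int(np.argmin(errs))        # closest unit, as in eq. (4)
        if errs[k] <= self.eps:
            # recognized: reinforce unit k (the exact rules, eqs. (5)-(6),
            # are images in the source; a running average is assumed)
            self.beta[k] += 1.0
            self.C[k] += (x - self.C[k]) / self.beta[k]
        else:
            # not recognized: forget the weakest unit, as in eq. (7)
            k = int(np.argmin(self.beta))
            self.C[k] = x
            self.beta[k] = 1.0
        return self.C[k]                # output h(t+1) = C_k(t+1)

bank = MemoryBank(D=10, eps=0.1)
for x in (0.5, 0.52, 3.0):
    h = bank.step(x)
```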
14) taking the elements output in step 13) as the input of a fully connected BP network, with the number of nodes in the intermediate hidden layers decreasing layer by layer and 5 nodes in the output layer; the meaning of each output node is: the 1st output node is the blur type (defocus blur = 1, motion blur = 2, Gaussian blur = 3); the 2nd output node is the defocus blur radius r (when the 1st output node is 1, the output is the computed radius value, otherwise 0); the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
15) the network provides a manual-judgment feedback function during parameter training: when a user finds that the network has recognized incorrectly, the error information is entered through a network correction interface, the network automatically relearns, and the system weight matrices are updated;
thirdly, obtaining network parameters through network training:
after the network is constructed and the algorithms are set, learning and training are carried out with between 10,000 and 1,000,000 known images and their known features to obtain the network parameters;
fourthly, unknown image fuzzy type identification and parameter setting:
image information acquired in the actual production process, or blurred image information to be identified, is input into the network, and the network computes the blur type, defocus blur radius, motion blur length, motion blur direction angle and Gaussian blur noise variance, thereby realizing image blur type identification and parameter setting.
2. The image blur type identification and parameter setting method of the fusion memory CNN as claimed in claim 1, which is implemented as follows:
the first step is to construct a converged memory network architecture:
the CNN model is organized as a serial architecture of 5 convolution layers, 1 deep memory network and 1 BP network;
the first convolution layer selects 96 convolution operators, each a 16 × 16 convolution kernel; the kernels comprise 72 straight lines of different forms, 8 discs of different sizes and 16 circular rings of different forms, and each convolution kernel extracts a primary shape feature of the image sub-graph unit;
the second convolution layer selects 256 convolution operators, each an 8 × 8 convolution kernel; each convolution kernel extracts a second-level shape feature of the image sub-graph;
the third convolution layer selects 256 convolution operators, each a 5 × 5 convolution kernel; each convolution kernel extracts a third-level shape feature of the image sub-graph;
the fourth convolution layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each convolution kernel extracts a fourth-level shape feature of the image sub-graph;
the fifth convolution layer selects 384 convolution operators, each a 3 × 3 convolution kernel; each convolution kernel extracts a fifth-level shape feature of the image sub-graph;
the deep memory network adopts a memory model with a depth of 10 units and a selective updating rule;
the BP network adopts a 4-layer structure: an input layer, two hidden layers and an output layer;
secondly, setting algorithms of each layer of the fusion memory network:
1) performing grayscale processing on the input image: when the original input is a three-dimensional color image, it is converted into a corresponding grayscale image by a vector mapping method; if the input image is already grayscale, this step is skipped;
2) performing a two-dimensional Fourier transform on the detected image and converting it into a spectrogram, denoted image P0; P0 has size 1024 × 1024;
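Step 2) can be sketched with NumPy. The claim only specifies a two-dimensional Fourier transform, so the log-magnitude scaling and fftshift centering below are common spectrogram conventions assumed here:

```python
import numpy as np

def spectrogram(gray):
    """2-D FFT magnitude spectrum of a grayscale image (step 2).
    Centering and log scaling are assumptions; the claim only
    specifies a two-dimensional Fourier transform."""
    F = np.fft.fftshift(np.fft.fft2(gray))  # move DC term to the center
    return np.log1p(np.abs(F))              # compress the dynamic range

P0 = spectrogram(np.random.rand(1024, 1024))
print(P0.shape)  # (1024, 1024)
```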
3) convolving image P0 with each of the 96 convolution operators of the first convolution layer of the network; the calculation expression is:
P1(n1)[i, j] = Σ(x=1..s1) Σ(y=1..t1) P0[2i−1+x, 2j−1+y] · W(n1)[x, y],    (1),
where P0[2i−1+x, 2j−1+y] is the gray value of image P0 at pixel [2i−1+x, 2j−1+y], W(n1)[x, y] is the value of the convolution operator at position [x, y], and P1(n1)[i, j] is the gray value of the convolved image P1 at pixel [i, j]; after the first-layer convolution operation, 96 convolution images of size 504 × 504 are output;
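Assuming the convolution output size follows (input − kernel) / stride and each pooling step halves the map (flooring), the feature-map sizes quoted in the following steps (252, 61, 28, 12, 4) are reproduced exactly. The layer-2 stride of 2 is an inference, since its value is lost in this extraction:

```python
def conv_size(n, k, step):
    # output size of one convolution pass, matching 1024 -> 504
    return (n - k) // step

size = 1024
plan = [(16, 2), (8, 2), (5, 1), (3, 1), (3, 1)]  # (kernel, stride) per layer;
                                                  # the layer-2 stride is inferred
sizes = []
for k, step in plan:
    size = conv_size(size, k, step)  # convolution
    size = size // 2                 # 2x2 max pooling (assumed)
    sizes.append(size)
print(sizes)  # [252, 61, 28, 12, 4]
```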
4) regularizing the convolution image (the regularization formula appears only as an image in the source);
5) performing maximum-pooling calculation on the regularized output (the pooling formula appears only as an image in the source);
outputting 96 feature matrices of size 252 × 252 after maximum pooling;
6) repeating steps 3), 4) and 5) for the second-layer convolution operation, selecting parameters s2 = t2 = 8 and Δ2 = 2 (the stride value is inferred from the reported sizes), obtaining 24576 feature matrices of size 61 × 61;
7) repeating steps 3), 4) and 5) for the third-layer convolution operation, selecting parameters s3 = t3 = 5 and Δ3 = 1, obtaining 6.29 million feature matrices of size 28 × 28;
8) performing similarity clustering analysis on the matrices obtained in step 7), keeping 100,000 image feature matrices;
9) repeating steps 3), 4) and 5) for the fourth-layer convolution operation, selecting parameters s4 = t4 = 3 and Δ4 = 1, obtaining 38.4 million feature matrices of size 12 × 12;
10) performing similarity clustering analysis on the matrices obtained in step 9), keeping 10,000 image feature matrices;
11) repeating steps 3), 4) and 5) for the fifth-layer convolution operation, selecting parameters s5 = t5 = 3 and Δ5 = 1, obtaining 3.84 million feature matrices of size 4 × 4;
12) performing similarity clustering analysis on the matrices obtained in step 11), taking the sum of all elements of each matrix as a comprehensive feature, and obtaining 1000 different feature points;
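Step 12)'s comprehensive feature (the sum of all matrix elements) followed by a similarity merge can be sketched as below. The similarity-clustering criterion is not specified in the text, so the simple distance-based merge and its tol parameter are assumptions:

```python
import numpy as np

def comprehensive_features(mats, tol):
    """Reduce each matrix to the sum of its elements (step 12),
    then keep one representative per group of sums closer than tol
    (the actual clustering criterion is not given in the text)."""
    sums = sorted(float(m.sum()) for m in mats)
    reps = []
    for s in sums:
        if not reps or s - reps[-1] > tol:
            reps.append(s)  # start a new feature point
    return reps

# five 4x4 matrices; the near-duplicates collapse into one point each
mats = [np.full((4, 4), v) for v in (0.0, 0.001, 1.0, 1.0005, 5.0)]
print(comprehensive_features(mats, tol=0.1))  # [0.0, 16.0, 80.0]
```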
13) inputting the characteristic values output in the step 12) into corresponding memory models, wherein the network is provided with 1000 independent memory units with the memory depth of 10, and corresponding output information is generated through the memory models; the concrete memory model is as follows:
a network structure having 10 independent memory cells, wherein the network input x (t) is compared with the memory values of the memory cells, and the error at the cell k closest to the input is:
δk(t) = Min{ |Ci(t) − x(t)|, i = 1, 2, …, 10 },    (13),
when δk(t) is less than or equal to the network recognition threshold ε, the network has successfully recognized the k-th class of information, and the selective memory update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are:
(update equations, shown only as images in the source)
when δk(t) is greater than the network recognition threshold ε, the input is not recognized by the network, and the memory network updates its worst-remembered information according to the forgetting rule: the unit k with the lowest memory coefficient βk(t) is replaced by the current input; the update rules for the memory coefficient βi(t) and memory information Ci(t) of each memory unit are then:
βk(t)=Min{βi(t),i=1,2,…,10}, (15),
(update equations, shown only as images in the source)
the network output h (t +1) is:
h(t+1)=Ck(t+1), (10),
14) taking the elements output in step 13) as the input of a fully connected BP network, with 500 nodes in the first hidden layer, 50 nodes in the second hidden layer, and 5 nodes in the output layer; the meaning of each output node is: the 1st output node is the blur type (defocus blur = 1, motion blur = 2, Gaussian blur = 3); the 2nd output node is the defocus blur radius r (when the 1st output node is 1, the output is the computed radius value, otherwise 0); the 3rd output node is the motion blur length; the 4th output node is the motion blur direction angle; the 5th output node is the noise variance of the Gaussian blur;
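The fully connected BP stage of step 14) (1000 memory outputs → 500 → 50 → 5 output nodes) can be sketched as a plain forward pass. Only the layer widths come from the text; the tanh hidden activations and the random initialization below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    # small random weights and zero biases (initialization assumed)
    return rng.standard_normal((n_in, n_out)) * 0.01, np.zeros(n_out)

# 1000 memory outputs -> 500 -> 50 -> 5 output nodes
W1, b1 = layer(1000, 500)
W2, b2 = layer(500, 50)
W3, b3 = layer(50, 5)

def forward(x):
    h1 = np.tanh(x @ W1 + b1)  # first hidden layer (tanh assumed)
    h2 = np.tanh(h1 @ W2 + b2) # second hidden layer
    # 5 outputs: [blur type, defocus radius r, motion length,
    #             motion direction angle, Gaussian noise variance]
    return h2 @ W3 + b3

y = forward(np.zeros(1000))
print(y.shape)  # (5,)
```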
15) the network provides a manual-judgment feedback function during parameter training: when a user finds that the network has recognized incorrectly, the error information is entered through a network correction interface, the network automatically relearns, and the system weight matrices are updated;
thirdly, obtaining network parameters through network training:
constructing 500,000 blurred images of different blur types and different parameters based on an image database, namely the Caltech 101 dataset and an acquired track-surface video image database; taking the blurred image set as input and the corresponding parameters as output, network training is performed to obtain the network parameters;
fourthly, unknown image fuzzy type identification and parameter setting:
taking 1000 images from the Caltech 101 dataset as test images, testing and computing the blur type, defocus blur radius, motion blur length, motion blur direction angle, and Gaussian blur noise variance output by the network for each image; comparing the network output with the known results gives an accuracy of 99.7%, realizing image blur type identification and parameter setting;
meanwhile, applying the fusion memory CNN to the image deblurring process of the track-surface defect detection system, computing the blur type, defocus blur radius, motion blur length, motion blur direction angle, and Gaussian blur noise variance of the detected image; applying these parameters to image deblurring, the deblurring effect obtained from the result analysis meets the requirements of the detection system.
CN201710609501.3A 2017-07-25 2017-07-25 Image fuzzy type identification and parameter setting method based on fusion memory CNN Active CN107274378B (en)

Publications (2)

Publication Number Publication Date
CN107274378A CN107274378A (en) 2017-10-20
CN107274378B true CN107274378B (en) 2020-04-03
