CN112115973A - Convolutional neural network based image identification method - Google Patents


Info

Publication number
CN112115973A
Authority
CN
China
Prior art keywords
layer
neural network
training
convolution
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010829114.2A
Other languages
Chinese (zh)
Other versions
CN112115973B (en)
Inventor
刘航
白仞祥
张玉红
菅秀凯
刘鸣泰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin Jianzhu University
Original Assignee
Jilin Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin Jianzhu University
Priority to CN202010829114.2A
Publication of CN112115973A
Application granted
Publication of CN112115973B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention belongs to the technology of deep learning and image recognition, and particularly relates to an image recognition method based on a convolutional neural network. The method comprises the following steps: performing model training on original pictures by adopting a convolutional neural network; and inputting the picture to be processed into the trained model, and identifying the picture. According to the method, training of the neural network is accelerated on a GPU, Dropout regularization is added to the training model to prevent overfitting during training, and the photos in the data set are augmented to expand the image set.

Description

Convolutional neural network based image identification method
Technical Field
The invention belongs to the technology of deep learning and image recognition, and particularly relates to an image recognition method based on a convolutional neural network.
Background
Since Rumelhart and others developed learning algorithms in 1985, research on neural networks has surged worldwide. Artificial neural networks have since spread into many research fields, and their use in image classification for pattern recognition has grown steadily: character recognition, license plate recognition, face recognition, recognition of various banknotes, seal recognition, and recognition of certain military targets have all been studied extensively at home and abroad. When an artificial neural network is used for image recognition, the following problems mainly arise:
(1) The number of parameters is too large. In CIFAR-10 (an image data set), each image is only 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully connected neuron in the first hidden layer of an ordinary neural network has 32 x 32 x 3 = 3072 weights. This number is still manageable, but the fully connected structure clearly does not scale to larger images. For example, an image of a more substantial size, such as 200x200x3, would give each neuron 200 x 200 x 3 = 120,000 weights. Moreover, a layer almost certainly contains many such neurons, so the number of parameters grows rapidly. Such full connectivity is wasteful, and the large number of parameters quickly leads to overfitting.
(2) Position information between pixels is not used. In image recognition tasks, each pixel is closely related to the pixels around it, while its relation to distant pixels may be weak. If a neuron is connected to all neurons in the previous layer, then all pixels of the image are treated equally with respect to a given pixel, which contradicts this assumption. After all connection weights have been learned, a large number of them may turn out to have very small values. Spending effort on learning a large number of unimportant weights makes such learning very inefficient.
(3) The number of network layers is limited. The more layers a network has, the stronger its expressive power, but training a deep artificial neural network by gradient descent is difficult because the gradient of a fully connected neural network is hard to propagate through more than 3 layers. A deep fully connected neural network therefore cannot be obtained in practice, which limits its capability.
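The weight counts quoted in problem (1) follow directly from the input dimensions. A minimal sketch of that arithmetic is given here; the function name `weights_per_neuron` is illustrative and does not appear in the original.

```python
def weights_per_neuron(width, height, channels):
    """Weights of one fully connected neuron that sees every input value."""
    return width * height * channels


cifar_weights = weights_per_neuron(32, 32, 3)    # CIFAR-10 sized image
large_weights = weights_per_neuron(200, 200, 3)  # 200x200 color image
print(cifar_weights, large_weights)
```

A hidden layer with many such neurons multiplies these counts by the number of neurons, which is why the fully connected structure does not scale to larger images.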
Disclosure of Invention
In order to solve the problems in the prior art, the technical problem to be solved by the invention is to provide an image identification method based on a convolutional neural network.
The present invention is achieved in such a way that,
a convolutional neural network based image recognition method, comprising:
step 1, performing model training on original pictures by adopting a convolutional neural network;
step 2, inputting the picture to be processed into the trained model, and identifying the picture.
Further: performing model training by adopting the convolutional neural network in step 1 comprises: preliminarily extracting image features through the convolution layer; extracting the main features through a down-sampling layer;
summarizing the features of all parts through a fully connected layer; and generating a classifier for prediction and identification;
the method specifically comprises the following steps:
step 11: initializing a weight value of the convolutional neural network;
step 12: carrying out forward propagation on input picture data through a convolution layer, a down-sampling layer and a full-connection layer to obtain an output value;
the characteristics of each layer output are as follows:
$$y^{(l)} = f\left(\sum_{i \in m} w_i^{(l)} * x_i^{(l)} + b^{(l)}\right)$$
wherein $y^{(l)}$ is the output of the convolutional layer, $f(x)$ is the nonlinear activation function, $m$ is the set of feature maps input to the layer, $w_i^{(l)}$ is the weight of the convolution kernel of the layer, $*$ is the convolution operation, $x_i^{(l)}$ is the feature vector input to the convolutional layer, and $b^{(l)}$ is the offset;
step 13: solving the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not accord with the expected value, performing a back propagation process; calculating the error between the result and the expected value, returning the error layer by layer, calculating the error of each layer, and updating the weight; adjusting the network weight through training samples and expected values;
determining parameters inside the model by forward propagating the prediction of the samples and the output of the expected value of the convolutional neural network; defining an objective function of the convolutional neural network:
Figure BDA0002637269530000031
where L(x) is the loss function, m is the number of samples,
Figure BDA0002637269530000032
is the desired output, and y is the sample output. The partial derivatives with respect to the parameters w and b of each layer in the neural network are calculated by the gradient descent method to obtain updated parameter values of the convolutional neural network, so that the actual output of the convolutional neural network moves closer to the expected value;
step 14: when the error is larger than the expected value, the error is transmitted back to the convolutional neural network, and the errors of the full connection layer, the down sampling layer and the convolutional layer are sequentially obtained; when the error is equal to or less than the expected value, finishing the training;
step 15: judging whether the weight is optimal according to the obtained error, and if not, updating the weight;
step 16: judging whether all epochs have been completed; if so, exiting the model training, otherwise carrying out the next round of training;
step 17: finishing the training of the training model.
Further: the updating in step 15 includes convolution layer updating and fully connected layer updating:
the error is returned layer by layer using the back propagation algorithm, and the weight of each layer is updated by the gradient descent method.
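The control flow of steps 11 to 17 can be sketched on a much smaller model than a convolutional network. The example below is only an illustration under stated assumptions: it trains a single linear neuron, uses the mean squared error as the error measure (the loss of the original is not recoverable from the text), and the names `train`, `lr`, `max_epochs` and `target_error` are hypothetical.

```python
import random


def train(samples, targets, lr=0.05, max_epochs=1000, target_error=1e-4, seed=0):
    rng = random.Random(seed)
    # Step 11: initialize the weights with small random values.
    w, b = rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)
    error = float("inf")
    m = len(samples)
    for _ in range(max_epochs):  # Step 16: stop when the epoch budget is used up.
        # Step 12: forward propagation.
        outputs = [w * x + b for x in samples]
        # Step 13: error between the outputs and the target values.
        error = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / m
        # Step 14: finish once the error is at or below the expected value.
        if error <= target_error:
            break
        # Step 15: gradients of the error with respect to w and b, then update.
        grad_w = sum(2 * (o - t) * x for o, t, x in zip(outputs, targets, samples)) / m
        grad_b = sum(2 * (o - t) for o, t in zip(outputs, targets)) / m
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, error  # Step 17: training is finished.


xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]  # generated by t = 2x + 1
w, b, err = train(xs, ts)
print(round(w, 2), round(b, 2))
```

In a convolutional network the same loop applies, with the gradients obtained by returning the error through the fully connected, down-sampling and convolution layers in turn.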
Further: in the step 13, the process is carried out,
the forward propagation process of the convolution layer is to perform convolution operation on input data through convolution kernel, the convolution kernel convolves the whole input picture by adopting a convolution mode with step length of 1 to form a local receptive field, then the local receptive field performs convolution algorithm, the weighted sum is performed through a weight matrix and a characteristic value of the picture, and then the output is obtained through an activation function;
the forward propagation process of the down-sampling layer is that the features extracted from the convolution layer of the upper layer are used as input and transmitted to the down-sampling layer, the dimensionality of data is reduced through the pooling operation of the down-sampling layer, and the maximum value in the feature map is selected by adopting a maximum pooling method;
the forward propagation process of the fully connected layer is as follows: after the feature map has passed through the feature extraction of the convolution layer and the down-sampling layer, the extracted features are transmitted to the fully connected layer, which performs classification to obtain a classification model and the final result; in the fully connected layer, the number of parameters equals the number of nodes in the fully connected layer multiplied by the number of input features, plus the number of nodes; after the output matrix is obtained, it is activated by an activation function and transmitted to the next layer.
Further: step 2 comprises:
step 21: loading the trained optimal weight value stored in the specific file by the training model in the step 1 in an image recognition system;
step 22: obtaining the optimal weight of each layer of convolution kernel in the training model by a weight sharing method, and loading the trained convolution kernel weight into an image recognition system;
step 23: the output of the full connection layer of the last layer of the convolutional neural network in the training model divides the training data set into correct and wrong types through a softmax classifier, and the image labels classified by the training model are loaded in an image recognition system;
step 24: carrying out normalization preprocessing on a picture to be recognized;
step 25: identifying using a convolutional neural network based identification system; and outputs the recognition result.
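Steps 23 to 25 can be illustrated with a short sketch of the preprocessing and classification stages. Several details are assumptions made here for illustration: normalization is taken to mean scaling 8-bit pixel values to the range [0, 1], the two numbers passed to `recognize` stand in for the output of the last fully connected layer, and the label names are hypothetical.

```python
import math


def normalize(pixels):
    """Scale 8-bit pixel values to the range [0, 1] (assumed normalization)."""
    return [[value / 255.0 for value in row] for row in pixels]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recognize(scores, labels):
    """Return the label with the highest softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["correct", "wrong"]  # hypothetical label names for the two classes
picture = normalize([[0, 128], [255, 64]])
label, confidence = recognize([2.0, 0.5], labels)
print(label, round(confidence, 4))
```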
Compared with the prior art, the invention has the beneficial effects that:
in the method, training of the neural network is accelerated on a GPU, Dropout regularization is added to the training model to prevent overfitting during training, and the photos in the data set are augmented (for example by rotation, scaling and flipping) to expand the image set; the model shows no overfitting on the expanded data set during training. The loss function curve in FIG. 8 shows that, in the later stage of training, the loss keeps decreasing steadily as the learning rate of the model gradually decreases, and that the loss curve begins to level off once the convolutional neural network training model reaches 25 iterations. The training accuracy curve in FIG. 9 shows that the accuracy of the training model is low during the first few iterations, because the model parameters have not yet been optimized after so few iterations; as the number of training iterations increases, the recognition rate on the data set rises gradually, and the accuracy curve levels off once the number of iterations reaches 25. Taken together, the two curves indicate that the model reaches its optimal number of iterations at 25. With a training model designed on the basis of a convolutional neural network, the accuracy can reach 96%.
Drawings
FIG. 1 is an image used as the correct pattern in an embodiment of the present invention;
FIG. 2 is an image used as the wrong pattern in an embodiment of the present invention;
FIG. 3 is a first layer convolution structure according to an embodiment of the present invention;
FIG. 4 is a second layer convolution structure according to an embodiment of the present invention;
FIG. 5 is a third layer convolution structure according to an embodiment of the present invention;
FIG. 6 is a fourth layer convolution structure according to an embodiment of the present invention;
FIG. 7 is a fifth layer convolution structure according to an embodiment of the present invention;
FIG. 8 is the loss function decline curve according to an embodiment of the present invention;
FIG. 9 shows the recognition accuracy of the training model according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
An image identification method based on a convolution neural network is characterized by comprising the following steps:
step 1, performing model training on an original picture by adopting a convolutional neural network;
step 2, inputting the picture to be processed into the trained model, and identifying the picture.
Performing model training by adopting the convolutional neural network in step 1 comprises: preliminarily extracting image features through the convolution layer; extracting the main features through a down-sampling layer; summarizing the features of all parts through a fully connected layer; and generating a classifier for prediction and identification;
the method specifically comprises the following steps:
step 11: initializing a weight value of the convolutional neural network;
step 12: carrying out forward propagation on input picture data through a convolution layer, a down-sampling layer and a full-connection layer to obtain an output value;
the characteristics of each layer output are as follows:
$$y^{(l)} = f\left(\sum_{i \in m} w_i^{(l)} * x_i^{(l)} + b^{(l)}\right)$$
wherein $y^{(l)}$ is the output of the convolutional layer, $f(x)$ is the nonlinear activation function, $m$ is the set of feature maps input to the layer, $w_i^{(l)}$ is the weight of the convolution kernel of the layer, $*$ is the convolution operation, $x_i^{(l)}$ is the feature vector input to the convolutional layer, and $b^{(l)}$ is the offset;
step 13: solving the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not accord with the expected value, performing a back propagation process; calculating the error between the result and the expected value, returning the error layer by layer, calculating the error of each layer, and updating the weight; adjusting the network weight through training samples and expected values;
determining parameters inside the model by forward propagating the prediction of the samples and the output of the expected value of the convolutional neural network; defining an objective function of the convolutional neural network:
Figure BDA0002637269530000065
where L(x) is the loss function, m is the number of samples,
Figure BDA0002637269530000066
is the desired output, and y is the sample output. The partial derivatives with respect to the parameters w and b of each layer in the neural network are calculated by the gradient descent method to obtain updated parameter values of the convolutional neural network, so that the actual output of the convolutional neural network moves closer to the expected value;
step 14: when the error is larger than the expected value, the error is transmitted back to the convolutional neural network, and the errors of the full connection layer, the down sampling layer and the convolutional layer are sequentially obtained; when the error is equal to or less than the expected value, finishing the training;
step 15: judging whether the weight is optimal according to the obtained error, and if not, updating the weight;
step 16: judging whether all epochs have been completed; if so, exiting the model training, otherwise carrying out the next round of training;
step 17: finishing the training of the training model.
The updating in step 15 includes convolution layer updating and fully connected layer updating:
the error is returned layer by layer using the back propagation algorithm, and the weight of each layer is updated by the gradient descent method.
In the step 13, the forward propagation process of the convolution layer is to perform convolution operation on input data through convolution kernel, the convolution kernel convolves the whole input picture by adopting a convolution mode with step length of 1 to form a local receptive field, then perform convolution algorithm on the local receptive field, perform weighted sum on a weight matrix and a characteristic value of the picture, and then obtain output through an activation function;
the forward propagation process of the down-sampling layer is that the features extracted from the convolution layer of the upper layer are used as input and transmitted to the down-sampling layer, the dimensionality of data is reduced through the pooling operation of the down-sampling layer, and the maximum value in the feature map is selected by adopting a maximum pooling method;
the forward propagation process of the fully connected layer is as follows: after the feature map has passed through the feature extraction of the convolution layer and the down-sampling layer, the extracted features are transmitted to the fully connected layer, which performs classification to obtain a classification model and the final result; in the fully connected layer, the number of parameters equals the number of nodes in the fully connected layer multiplied by the number of input features, plus the number of nodes; after the output matrix is obtained, it is activated by an activation function and transmitted to the next layer.
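The parameter count stated for the fully connected layer can be checked with a few lines of arithmetic. The function name `dense_parameter_count` is illustrative; the values 4096 and 2 correspond to the input and output sizes of the third fully connected layer of the embodiment described later.

```python
def dense_parameter_count(input_features, nodes):
    """One weight per (node, input feature) pair, plus one bias per node."""
    return nodes * input_features + nodes


last_layer_parameters = dense_parameter_count(4096, 2)
print(last_layer_parameters)
```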
In step 2, step 21: loading the trained optimal weight value stored in the specific file by the training model in the step 1 in an image recognition system;
step 22: obtaining the optimal weight of each layer of convolution kernel in the training model by a weight sharing method, and loading the trained convolution kernel weight into an image recognition system;
step 23: the output of the full connection layer of the last layer of the convolutional neural network in the training model divides the training data set into correct and wrong types through a softmax classifier, and the image labels classified by the training model are loaded in an image recognition system;
step 24: carrying out normalization preprocessing on a picture to be recognized;
step 25: identifying using a convolutional neural network based identification system; and outputs the recognition result.
The convolution layer is an important component of the convolutional neural network and is mainly used to extract and abstract image features. The core of the convolution layer is the convolution operation, for which the image is first converted into a matrix. Assume an image of size 6 x 6 in which each pixel stores information about the image. A convolution kernel (equivalent to a set of weights) is defined to extract certain features from the image. Multiplying the convolution kernel element by element with the corresponding positions of the pixel matrix and summing the products gives the output of the convolution layer.
When no prior learning experience is available, the values of the convolution kernel can be generated randomly by a function and then adjusted step by step through training.
When all the pixels have been covered at least once, the output of a convolution layer can be generated (the convolution stride is 1).
The machine does not know in advance which features the part to be identified has. It compares the output values obtained with different convolution kernels to determine which kernel best represents the features of the picture. For example, to identify a particular feature in the image (such as a curve), a suitable convolution kernel gives a high output value for the curve and a low output value for other shapes (such as a triangle). The higher the output value of the convolution layer, the higher the degree of matching, and the better the kernel expresses the features of the picture.
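The multiply-and-sum operation described above can be worked through on a 6 x 6 image with a stride of 1. The sketch below is illustrative only: the pixel values and the 3 x 3 kernel are made up, the image and kernel are assumed square, and the kernel is applied without flipping, as is usual in deep learning frameworks.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image and sum the element-wise products."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# 6x6 image with a vertical edge: bright left half, dark right half.
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)  # each row is [0, 30, 30, 0]
```

The output is 4 x 4, and the high values sit where the kernel overlaps the edge, which is the sense in which a higher output value indicates a better match.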
The down-sampling layer is also called the pooling layer. It works as follows:
The pooling layer mainly serves to reduce the number of parameters, increase computation speed, make the extracted features more robust, and prevent overfitting. It is generally placed after the convolution layer, reducing the size of the model and the dimension of the features.
The two most common forms of pooling layer are:
Maximum pooling (max-pooling): the largest value in a given area is chosen to represent the entire area.
Mean pooling (mean-pooling): the average of the values in a given area is chosen to represent the entire area.
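The two pooling forms differ only in how each area is reduced to a single value. A minimal sketch, with a made-up 4 x 4 feature map and illustrative function names:

```python
def pool2d(feature_map, size, stride, reduce_fn):
    """Reduce each size x size window of the feature map with reduce_fn."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


fm = [[1, 3, 2, 4],
      [5, 7, 6, 8],
      [9, 2, 1, 0],
      [3, 4, 5, 6]]
max_pooled = pool2d(fm, size=2, stride=2, reduce_fn=max)    # [[7, 8], [9, 6]]
mean_pooled = pool2d(fm, size=2, stride=2, reduce_fn=mean)  # [[4.0, 5.0], [4.5, 3.0]]
print(max_pooled, mean_pooled)
```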
The task of the convolution layer and the pooling layer is to extract features and reduce the number of parameters brought by the original image. To generate the final output, however, a fully connected layer must be applied to produce a classifier whose number of outputs equals the number of classes required.
The fully connected layer works in a way similar to an ordinary neural network: the tensor output by the pooling layer is flattened into a vector, multiplied by the weight matrix, and added to the bias; the ReLU activation function is then applied to the result, and the parameters are optimized by the gradient descent method.
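The forward pass of a fully connected layer described above can be sketched as follows. The tensor, weights and biases are made-up values, and the function names are illustrative.

```python
def flatten(tensor):
    """Flatten a nested list (e.g. height x width x channels) into one vector."""
    if not isinstance(tensor, list):
        return [tensor]
    flat = []
    for item in tensor:
        flat.extend(flatten(item))
    return flat


def relu(values):
    return [max(0.0, v) for v in values]


def dense_forward(inputs, weights, biases):
    """Weighted sum of the inputs plus bias for each node, followed by ReLU."""
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return relu(sums)


pooled = [[[1.0], [2.0]], [[3.0], [4.0]]]  # 2 x 2 x 1 output of a pooling layer
vector = flatten(pooled)                   # [1.0, 2.0, 3.0, 4.0]
weights = [[0.5, -1.0, 0.0, 1.0],          # one row of weights per node
           [-1.0, 0.0, 0.0, 0.0]]
biases = [0.5, 0.0]
activations = dense_forward(vector, weights, biases)
print(activations)  # [3.0, 0.0]
```

The second node has a negative weighted sum, so ReLU sets its output to zero.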
Example:
The training model in this embodiment is trained for 40 epochs, over which the learning rate is updated: a larger learning rate is set at the beginning of training and is gradually reduced as the total error of the system decreases during learning. The optimal weights are saved each time an epoch of training is completed, for later deployment of the neural network model. During training, the system is optimized with the stochastic gradient descent (SGD) method, and mini-batch training is used to accelerate the convergence of the model. After the 40 epochs of training are finished, the optimal weights found in training are saved; in model prediction, the saved optimal weights are loaded directly to initialize the prediction parameters so that prediction of pictures can begin.
Before training begins, the pictures to be trained are loaded and the training set is preprocessed, including picture normalization and unification of picture channels. The model is then built and trained, that is, forward propagation and back propagation begin; back propagation is optimized with the stochastic gradient descent method. Each time an optimization pass is completed, the result is checked to see whether it has improved; if so, the related weights are updated; otherwise, it is determined whether all epochs of training have been completed: if not, the process returns to the training model to continue training, and if so, training of the whole model ends.
In neural network model prediction, the trained model parameters are loaded, and the label values of the image classes are loaded so that the subsequent prediction result of the model can be output. The image to be classified is then transmitted to the user side. After the system obtains the image to be recognized, the image is displayed and preprocessed so that the related parameters are unified, the loaded neural network performs prediction, and finally the recognition result of the current image is output, completing the whole image recognition process.
The data set is divided into 2 parts: a training set of 70 images, used to optimize the parameters of the training model, and a test set of 10 images, used to test the recognition performance of the model. Two patterns are selected: a red-green-blue three-color pattern is used as the correct pattern, as shown in FIG. 1, and non-red-green-blue three-color patterns are used as the wrong pattern, as shown in FIG. 2. 40 photos of each pattern are taken at different angles; 35 photos of each pattern are used as the training set for optimizing the network parameters, and the remaining 5 photos of each pattern are used as the verification set. The process is as follows:
(1) The input picture matrix is preprocessed and normalized, and pictures of size 128x128 are fed into the network.
(2) The first-layer convolution structure uses 96 convolution kernels of 20 × 20 with a convolution stride of 2 and valid padding, and the output feature map is 55 × 55 × 96. After normalization and PReLU activation, max pooling is performed with a local receptive field of 3 × 3, a pooling stride of 2 and valid padding, and the output feature map is 27 × 27 × 96, as shown in FIG. 3.
(3) The second-layer convolution structure takes the 27 × 27 × 96 feature map as input and uses 256 convolution kernels of 5 × 5 with a convolution stride of 1 and same padding, and the output feature map is 27 × 27 × 256. After normalization and PReLU activation, max pooling is performed with a local receptive field of 3 × 3, a pooling stride of 2 and valid padding, and the output feature map is 13 × 13 × 256, as shown in FIG. 4.
(4) The third-layer convolution structure takes the 13 × 13 × 256 feature map as input and uses 384 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding; this layer performs only normalization and PReLU activation, without pooling, and the output feature map is 13 × 13 × 384, as shown in FIG. 5.
(5) The fourth-layer convolution structure takes the 13 × 13 × 384 feature map as input and uses 384 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding; this layer performs only normalization and PReLU activation, without pooling, and the output feature map is 13 × 13 × 384, as shown in FIG. 6.
(6) The fifth-layer convolution structure takes the 13 × 13 × 384 feature map as input and uses 256 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding, and the output feature map is 13 × 13 × 256. After normalization and PReLU activation, max pooling is performed with a local receptive field of 3 × 3, a pooling stride of 2 and valid padding, and the output feature map is 6 × 6 × 256, as shown in FIG. 7.
(7) The first fully connected layer flattens the feature map output by the fifth convolution layer into a one-dimensional feature vector; the number of output nodes is 4096 and the Dropout parameter is 0.2, to prevent overfitting, so the output feature map is 4096 × 1.
(8) The second fully connected layer takes the output feature map of the first fully connected layer as input; the number of output nodes is 4096 and the Dropout parameter is 0.25, so the output feature map of this layer is 4096 × 1.
(9) The third fully connected layer takes the 4096 × 1 feature map as input; the number of output nodes of this layer is 2, and the output feature map is 2 × 1.
(10) Finally, the 2 × 1 feature map output by the third fully connected layer is fed into a softmax classifier, which outputs the classification into 2 classes.
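The feature map sizes of layers (2) to (6) can be checked with the usual output-size rules for valid and same padding. The sketch below is illustrative: the helper `out_size` is not from the original, 3 input color channels are assumed, and the second pooling window is taken as 3 × 3, as in the program listing of this embodiment.

```python
import math


def out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # 'valid' padding


# (name, kernel size, stride, padding, number of filters or None for pooling)
layers = [
    ("conv1", 20, 2, "valid", 96),
    ("pool1", 3, 2, "valid", None),
    ("conv2", 5, 1, "same", 256),
    ("pool2", 3, 2, "valid", None),
    ("conv3", 3, 1, "same", 384),
    ("conv4", 3, 1, "same", 384),
    ("conv5", 3, 1, "same", 256),
    ("pool5", 3, 2, "valid", None),
]

size, channels = 128, 3  # 128x128 input; 3 channels is an assumption
shapes = {}
for name, kernel, stride, padding, filters in layers:
    size = out_size(size, kernel, stride, padding)
    if filters is not None:
        channels = filters
    shapes[name] = (size, size, channels)
    print(name, shapes[name])

flattened = size * size * channels  # input length of the first fully connected layer
print(flattened)
```

Under these assumptions the last pooling layer yields a 6 × 6 × 256 feature map, so the first fully connected layer receives 9216 input features.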
The convolutional neural network used in the experiment has been described above; the corresponding program is as follows:
(1) Program for the convolution and pooling layers of the convolutional neural network.
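# Assumes the Keras API bundled with TensorFlow
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, BatchNormalization, PReLU
# Assumption: 3 color channels; the text gives only the 128x128 picture size
input_dim = Input(shape=(128, 128, 3))
def bn_relu(x):
    # Assumed helper (not defined in the source): normalization followed by PReLU
    return PReLU()(BatchNormalization()(x))
# Layer 1: 55x55x96 after convolution, 27x27x96 after pooling
x = Conv2D(96, (20, 20), strides=(2, 2), padding='valid')(input_dim)
x = bn_relu(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)
# Layer 2: 27x27x256 after convolution, 13x13x256 after pooling
x = Conv2D(256, (5, 5), strides=(1, 1), padding='same')(x)
x = bn_relu(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)
# Layers 3 and 4: 13x13x384, no pooling
x = Conv2D(384, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
x = Conv2D(384, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
# Layer 5: 13x13x256 after convolution, 6x6x256 after pooling
x = Conv2D(256, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)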
(2) Program for the fully connected layers of the convolutional neural network.
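from tensorflow.keras.layers import Flatten, Dense, Dropout
out_dims = 2  # two classes, per the third fully connected layer described above
x = Flatten()(x)  # 6x6x256 feature map -> vector of 9216 values
fc1 = Dense(4096)(x)
dr1 = Dropout(0.2)(fc1)
fc2 = Dense(4096)(dr1)
dr2 = Dropout(0.25)(fc2)
fc3 = Dense(out_dims)(dr2)  # output fed to the softmax classifier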
In this embodiment, the model is trained for a maximum of 40 epochs with a batch size of 128. Training of the neural network is accelerated on a GPU, Dropout regularization is added to the training model to prevent overfitting during training, and the photos in the data set are augmented (for example by rotation, scaling and flipping) to expand the image set; the model shows no overfitting on the expanded data set during training. The loss function curve in FIG. 8 shows that, in the later stage of training, the loss keeps decreasing steadily as the learning rate of the model gradually decreases, and that the loss curve begins to level off once the convolutional neural network training model reaches 25 iterations. The training accuracy curve in FIG. 9 shows that the accuracy of the training model is low during the first few iterations, because the model parameters have not yet been optimized after so few iterations; as the number of training iterations increases, the recognition rate on the data set rises gradually, and the accuracy curve levels off once the number of iterations reaches 25. Taken together, the two curves indicate that the model reaches its optimal number of iterations at 25. With a training model designed on the basis of a convolutional neural network, the accuracy can reach 96%.
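The expansion of the image set by rotation and flipping can be illustrated on a small pixel matrix. The sketch below is only an example under stated assumptions: the original does not give the rotation angles or flip directions, so a 90-degree clockwise rotation and a horizontal flip are used here, scaling is omitted, and the function names are illustrative.

```python
def flip_horizontal(image):
    """Mirror each row of the pixel matrix."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate the pixel matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def expand(images):
    """Return each image together with its flipped and rotated copies."""
    expanded = []
    for img in images:
        expanded.extend([img, flip_horizontal(img), rotate_90_clockwise(img)])
    return expanded


sample = [[1, 2],
          [3, 4]]
augmented = expand([sample])
print(len(augmented), augmented)
```

Each original photo contributes three images in this sketch, so the training set grows without any new photos being taken.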
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (5)

1. An image identification method based on a convolutional neural network, characterized by comprising the following steps:
step 1: performing model training on original pictures using a convolutional neural network;
step 2: inputting the picture to be processed into the trained model and identifying the picture.
2. The method of claim 1, wherein the model training using the convolutional neural network in step 1 comprises: preliminarily extracting image features through the convolutional layers; extracting the main features through the down-sampling layers; aggregating the features of all parts through the fully connected layers; and generating a classifier for prediction and identification;
the training specifically comprises the following steps:
step 11: initializing the weights of the convolutional neural network;
step 12: forward-propagating the input picture data through the convolutional layers, the down-sampling layers and the fully connected layers to obtain an output value; the output of each layer is as follows:
Figure RE-FDA0002758292230000011
wherein y^(l) is the output of the convolutional layer, f(x) is the nonlinear activation function, m is the set of feature maps input to the layer,
Figure RE-FDA0002758292230000012
is the weight of the convolution kernel of the layer,
Figure RE-FDA0002758292230000013
denotes the convolution operation,
Figure RE-FDA0002758292230000014
is the feature vector input to the convolutional layer, and b^l is the bias;
step 13: computing the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not match the expected value, performing back propagation: computing the error between the result and the expected value, propagating the error back layer by layer, computing the error of each layer, and updating the weights; the network weights are adjusted using the training samples and the expected values;
the parameters inside the model are determined from the forward-propagated predictions for the samples and the expected outputs of the convolutional neural network; the objective function of the convolutional neural network is defined as:
Figure RE-FDA0002758292230000015
where L(x) is the loss function, m is the number of samples,
Figure RE-FDA0002758292230000016
is the expected output, and y is the sample output; the partial derivatives with respect to the parameters w and b of each layer of the neural network are computed by gradient descent to obtain updated parameter values of the convolutional neural network, so that the actual output of the convolutional neural network comes closer to the expected value;
step 14: when the error is larger than the expected value, propagating the error back through the convolutional neural network and obtaining, in turn, the errors of the fully connected layers, the down-sampling layers and the convolutional layers; when the error is equal to or less than the expected value, ending the training;
step 15: judging from the obtained error whether the weights are optimal, and if not, updating the weights;
step 16: judging whether the set number of epochs has been completed; if so, exiting model training, and otherwise carrying out the next round of training;
step 17: finishing the training of the training model.
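The control flow of steps 11 to 17 can be sketched on a deliberately tiny model. This is only an illustration under stated assumptions: a single linear unit y = w * x + b stands in for the whole network, the objective is assumed to be a mean squared error with a factor of 1/2, and the names train, lr, max_epochs and target_error are hypothetical. It is not the patented training procedure itself.

```python
def train(samples, targets, lr=0.1, max_epochs=40, target_error=1e-4):
    """Fit y = w * x + b by gradient descent.

    Returns (w, b, error, epochs_run), where error is the value computed
    in the last forward pass.
    """
    w, b = 0.0, 0.0                          # step 11: initialize the weights
    m = len(samples)
    error = float("inf")
    for epoch in range(1, max_epochs + 1):   # step 16: stay within the epoch budget
        # step 12: forward propagation
        outputs = [w * x + b for x in samples]
        # step 13: error between outputs and expected values (assumed MSE / 2)
        error = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / (2 * m)
        if error <= target_error:            # step 14: error small enough, stop
            return w, b, error, epoch
        # step 15: partial derivatives with respect to w and b, then update
        grad_w = sum((o - t) * x for o, t, x in zip(outputs, targets, samples)) / m
        grad_b = sum(o - t for o, t in zip(outputs, targets)) / m
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, error, max_epochs           # step 17: training finished


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(train(xs, ys))
```

The two exit paths mirror the two stopping rules in the steps above: the error threshold of step 14 and the epoch limit of step 16.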
3. The method of claim 2, wherein the updating in step 15 comprises convolutional layer updates and fully connected layer updates: the error is propagated back layer by layer using back propagation, and the weights of each layer are updated by gradient descent.
4. The method of claim 2, wherein, in step 13,
the forward propagation process of the convolutional layer performs a convolution operation on the input data using convolution kernels: each convolution kernel is convolved over the whole input picture with a stride of 1 to form local receptive fields; within each local receptive field, a weighted sum of the weight matrix and the feature values of the picture is computed, and the output is then obtained through an activation function;
the forward propagation process of the down-sampling layer takes the features extracted by the preceding convolutional layer as input, reduces the dimensionality of the data through the pooling operation of the down-sampling layer, and selects the maximum values in the feature map using max pooling;
the forward propagation process of the fully connected layer is as follows: after the feature map has passed through feature extraction in the convolutional layers and the down-sampling layers, the extracted features are transmitted to the fully connected layer, which classifies them to obtain a classification model and the final result; in the fully connected layer, the number of parameters equals the number of nodes in the layer multiplied by the number of input features, plus the number of nodes; after the output matrix is obtained, it is passed through an activation function and transmitted to the next layer.
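The convolutional and down-sampling forward passes described in claim 4 can be sketched in plain Python. The sketch assumes a single-channel input, no padding, a ReLU activation and a 3x3 pooling window with stride 2; the function names and the example kernel are hypothetical, and the claim itself does not fix the activation function or window size.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Stride-1 convolution without padding, followed by ReLU (assumed activation).

    As in most CNN libraries, the kernel is applied without flipping.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # weighted sum over the local receptive field whose corner is (i, j)
            s = sum(kernel[a][b] * image[i + a][j + b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s + bias))
        out.append(row)
    return out


def max_pool(fmap, size=3, stride=2):
    """Max pooling without padding: keep the largest value in each window."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


# Hypothetical 7x7 input and a 3x3 kernel that copies the centre pixel.
image = [[r * 7 + c for c in range(7)] for r in range(7)]
kernel = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 0]]
features = conv2d_valid(image, kernel)   # 5x5 feature map
pooled = max_pool(features)              # 2x2 after 3x3 pooling with stride 2
print(pooled)
```

A 7x7 input shrinks to 5x5 after the stride-1 convolution and to 2x2 after pooling, which shows how the down-sampling layer reduces the dimensionality of the data.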
5. The method of claim 1, wherein, in step 2,
step 21: loading into the image recognition system the trained optimal weights that the training model of step 1 saved in a specified file;
step 22: obtaining the optimal weights of the convolution kernels of each layer in the training model by weight sharing, and loading the trained convolution kernel weights into the image recognition system;
step 23: passing the output of the last fully connected layer of the convolutional neural network in the training model through a softmax classifier, which divides the training data set into correct and wrong types, and loading the image labels classified by the training model into the image recognition system;
step 24: performing normalization preprocessing on the picture to be recognized;
step 25: performing recognition using the convolutional neural network based recognition system, and outputting the recognition result.
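The preprocessing and classification parts of steps 23 to 25 can be sketched as follows. The sketch assumes that normalization means scaling 8-bit pixel values to the range [0, 1] and that the network's final layer yields one raw score per label; the function names, the example scores and the label names are hypothetical, and loading the stored weights is not shown.

```python
import math


def normalize(pixels, max_value=255.0):
    """Scale raw pixel intensities to the range [0, 1] (assumed normalization)."""
    return [[p / max_value for p in row] for row in pixels]


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical picture, final-layer scores and labels, for illustration only.
picture = normalize([[0, 128], [255, 64]])
scores = [0.1, 2.0, -1.0]
labels = ["class_a", "class_b", "class_c"]
print(picture)
print(predict_label(scores, labels))
```

In a complete system, the scores would come from running the normalized picture through the network with the loaded weights rather than from a fixed list.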
CN202010829114.2A 2020-08-18 2020-08-18 Convolutional neural network based image identification method Active CN112115973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010829114.2A CN112115973B (en) 2020-08-18 2020-08-18 Convolutional neural network based image identification method

Publications (2)

Publication Number Publication Date
CN112115973A true CN112115973A (en) 2020-12-22
CN112115973B CN112115973B (en) 2022-07-19

Family

ID=73803747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010829114.2A Active CN112115973B (en) 2020-08-18 2020-08-18 Convolutional neural network based image identification method

Country Status (1)

Country Link
CN (1) CN112115973B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330405A (en) * 2017-06-30 2017-11-07 上海海事大学 Remote sensing images Aircraft Target Recognition based on convolutional neural networks
CN107609503A (en) * 2017-09-05 2018-01-19 刘宇红 Intelligent cancerous tumor cell identifying system and method, cloud platform, server, computer
US20190188524A1 (en) * 2017-12-14 2019-06-20 Avigilon Corporation Method and system for classifying an object-of-interest using an artificial neural network
US20190209022A1 (en) * 2018-01-05 2019-07-11 CareBand Inc. Wearable electronic device and system for tracking location and identifying changes in salient indicators of patient health
US20190279075A1 (en) * 2018-03-09 2019-09-12 Nvidia Corporation Multi-modal image translation using neural networks
CN109272107A (en) * 2018-08-10 2019-01-25 广东工业大学 A method of improving the number of parameters of deep layer convolutional neural networks
CN109684912A (en) * 2018-11-09 2019-04-26 中国科学院计算技术研究所 A kind of video presentation method and system based on information loss function
CN110427846A (en) * 2019-07-19 2019-11-08 西安工业大学 It is a kind of using convolutional neural networks to the face identification method of uneven small sample
CN110619352A (en) * 2019-08-22 2019-12-27 杭州电子科技大学 Typical infrared target classification method based on deep convolutional neural network
CN110674279A (en) * 2019-10-15 2020-01-10 腾讯科技(深圳)有限公司 Question-answer processing method, device, equipment and storage medium based on artificial intelligence

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597519A (en) * 2020-12-28 2021-04-02 杭州电子科技大学 Non-key decryption method based on convolutional neural network in OFDM (orthogonal frequency division multiplexing) encryption system
CN112597519B (en) * 2020-12-28 2024-02-13 杭州电子科技大学 Non-key decryption method based on convolutional neural network in OFDM encryption system
CN112712126A (en) * 2021-01-05 2021-04-27 南京大学 Picture identification method
CN112712126B (en) * 2021-01-05 2024-03-19 南京大学 Picture identification method
CN112598082B (en) * 2021-01-07 2022-07-12 华中科技大学 Method and system for predicting generalized error of image identification model based on non-check set
CN112598082A (en) * 2021-01-07 2021-04-02 华中科技大学 Method and system for predicting generalized error of image identification model based on non-check set
CN112991782A (en) * 2021-04-08 2021-06-18 河北工业大学 Control method, system, terminal, equipment, medium and application of traffic signal lamp
CN113298237A (en) * 2021-06-23 2021-08-24 东南大学 Convolutional neural network on-chip training accelerator based on FPGA
CN113591913A (en) * 2021-06-28 2021-11-02 河海大学 Picture classification method and device supporting incremental learning
CN113591913B (en) * 2021-06-28 2024-03-29 河海大学 Picture classification method and device supporting incremental learning
CN113505821A (en) * 2021-06-29 2021-10-15 重庆邮电大学 Deep neural network image identification method and system based on sample reliability
CN113780525A (en) * 2021-08-30 2021-12-10 中国人民解放军火箭军工程大学 Intelligent auxiliary equipment training and maintenance decision method and device based on deep learning
CN113688931A (en) * 2021-09-01 2021-11-23 什维新智医疗科技(上海)有限公司 Ultrasonic image screening method and device based on deep learning
CN113688931B (en) * 2021-09-01 2024-03-29 什维新智医疗科技(上海)有限公司 Deep learning-based ultrasonic image screening method and device
CN114401063A (en) * 2022-01-10 2022-04-26 中国人民解放军国防科技大学 Edge equipment cooperative spectrum intelligent monitoring method and system based on lightweight model
CN114401063B (en) * 2022-01-10 2023-10-31 中国人民解放军国防科技大学 Edge equipment cooperative spectrum intelligent monitoring method and system based on lightweight model

Also Published As

Publication number Publication date
CN112115973B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN112115973B (en) Convolutional neural network based image identification method
CN110210560B (en) Incremental training method, classification method and device, equipment and medium of classification network
CN110110624B (en) Human body behavior recognition method based on DenseNet and frame difference method characteristic input
CN109359608B (en) Face recognition method based on deep learning model
CN112990097B (en) Face expression recognition method based on countermeasure elimination
CN112288011A (en) Image matching method based on self-attention deep neural network
CN111753881A (en) Defense method for quantitatively identifying anti-attack based on concept sensitivity
CN113111979B (en) Model training method, image detection method and detection device
CN110175248B (en) Face image retrieval method and device based on deep learning and Hash coding
CN110929836B (en) Neural network training and image processing method and device, electronic equipment and medium
CN111832650A (en) Image classification method based on generation of confrontation network local aggregation coding semi-supervision
CN112669343A (en) Zhuang minority nationality clothing segmentation method based on deep learning
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN110414586B (en) Anti-counterfeit label counterfeit checking method, device, equipment and medium based on deep learning
CN112270404A (en) Detection structure and method for bulge defect of fastener product based on ResNet64 network
CN116935122A (en) Image classification method and system based on 3D-WGMobileNet
CN111754459B (en) Dyeing fake image detection method based on statistical depth characteristics and electronic device
CN114926876A (en) Image key point detection method and device, computer equipment and storage medium
WO2021055364A1 (en) Efficient inferencing with fast pointwise convolution
CN113011370A (en) Multi-state face recognition method based on deep learning
Depuru et al. Hybrid CNNLBP using facial emotion recognition based on deep learning approach
CN113111957B (en) Anti-counterfeiting method, device, equipment, product and medium based on feature denoising
KR102652397B1 (en) Apparatus, method and program for determining control command using a neural network model
CN114186621A (en) Product trademark identification method and device based on BP neural network
CN116503896A (en) Fish image classification method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant