CN112115973B - Convolutional neural network based image identification method - Google Patents

Info

Publication number
CN112115973B
Authority
CN
China
Prior art keywords
layer
neural network
training
convolution
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010829114.2A
Other languages
Chinese (zh)
Other versions
CN112115973A (en
Inventor
刘航
白仞祥
张玉红
菅秀凯
刘鸣泰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin Jianzhu University
Original Assignee
Jilin Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin Jianzhu University
Priority to CN202010829114.2A
Publication of CN112115973A
Application granted
Publication of CN112115973B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of deep learning and image recognition, and particularly relates to an image recognition method based on a convolutional neural network. The method comprises the following steps: performing model training on original pictures using a convolutional neural network; and inputting the picture to be processed into the trained model to recognize the picture. In the training process, the method accelerates the training of the neural network in GPU mode, adds Dropout regularization to the training model to prevent overfitting, and expands the image set of the data set photos.

Description

Convolutional neural network based image identification method
Technical Field
The invention belongs to the field of deep learning and image recognition, and particularly relates to an image recognition method based on a convolutional neural network.
Background
Since Rumelhart and others developed learning algorithms in 1985, a worldwide wave of research on neural networks has arisen, and artificial neural networks have penetrated many research fields. In particular, pattern recognition is increasingly applied to image classification, and character recognition, license plate recognition, face recognition, banknote recognition, stamp recognition, recognition of military targets and similar topics have been widely studied at home and abroad. When an artificial neural network performs an image recognition task, the main problems are the following:
(1) The number of parameters is too large. In CIFAR-10 (a competition data set) an image is only 32x32x3 in size (32 wide, 32 high, 3 color channels), so a single fully connected neuron in the first hidden layer of an ordinary neural network has 32x32x3 = 3072 weights. This number is still manageable, but it is clear that the fully connected structure does not scale to larger images. For example, a larger image of 200x200x3 would give each neuron 120,000 weights. Furthermore, a layer almost certainly contains many such neurons, so the number of parameters grows rapidly. Such full connectivity is wasteful, and the large number of parameters quickly leads to overfitting.
(2) The positional information between pixels is not utilized. In image recognition tasks, each pixel is closely related to its surrounding pixels, while pixels that are far apart may be only weakly related. If a neuron is connected to all neurons in the previous layer, then for a given pixel all pixels of the image are treated equally, which contradicts this assumption. After the connection weights have been learned, it may turn out that a large number of them have very small values. Spending effort on learning a large number of unimportant weights makes such learning very inefficient.
(3) The number of network layers is limited. The more layers a network has, the stronger its expressive capability, but training a deep artificial neural network by gradient descent is difficult, because the gradient of a fully connected neural network is hard to propagate beyond 3 layers. A deep fully connected neural network therefore cannot be obtained, which limits its capability.
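The arithmetic in problem (1) can be checked with a few lines of Python. This is only an illustrative sketch; the function name `weights_per_neuron` is hypothetical, and the bias term is left out of the count.

```python
def weights_per_neuron(width, height, channels):
    """Weights one fully connected neuron needs in order to see every input value."""
    return width * height * channels


# CIFAR-10 sized image: 32 wide, 32 high, 3 color channels
cifar_weights = weights_per_neuron(32, 32, 3)

# Larger image from the text: 200 x 200 x 3
large_weights = weights_per_neuron(200, 200, 3)

print(cifar_weights, large_weights)
```

Multiplying these figures by the number of neurons in the layer gives the layer's total weight count, which is why the fully connected structure scales poorly with image size.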
Disclosure of Invention
In order to solve the problems in the prior art, the invention aims to provide an image identification method based on a convolutional neural network.
The invention is realized as follows:
a convolutional neural network based image recognition method, comprising:
step 1, performing model training on original pictures using a convolutional neural network;
step 2, inputting the picture to be processed into the trained model and recognizing the picture.
Further, the model training with the convolutional neural network in step 1 comprises: preliminarily extracting image features through the convolution layers; extracting the main features through the down-sampling layers;
aggregating the features of all parts through the fully connected layers; and generating a classifier for prediction and recognition;
the method specifically comprises the following steps:
step 11: initializing a weight value of the convolutional neural network;
Step 12: carrying out forward propagation on input picture data through a convolution layer, a down-sampling layer and a full connection layer to obtain an output value;
the output of each layer is characterized as follows:
$$y^{(l)} = f\left(\sum_{i \in m} w_i^{(l)} \otimes x_i^{(l-1)} + b^{(l)}\right)$$
wherein $y^{(l)}$ is the output of the convolutional layer, $f(x)$ is the nonlinear activation function, $m$ is the set of feature maps input to the layer, $w_i^{(l)}$ is the weight of the convolution kernel of the layer, $\otimes$ is the convolution operation, $x_i^{(l-1)}$ is the feature vector input to the convolutional layer, and $b^{(l)}$ is the offset;
step 13: solving the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not accord with the expected value, performing a back propagation process; calculating the error between the result and the expected value, returning the error layer by layer, calculating the error of each layer, and updating the weight; adjusting the network weight through training samples and expected values;
the parameters inside the model are determined from the forward-propagated predictions for the samples and the expected output values of the convolutional neural network; the objective function of the convolutional neural network is defined as:
Figure BDA0002637269530000031
where L(x) is the loss function, m is the number of samples,
Figure BDA0002637269530000032
is the desired output, and y is the sample output. The partial derivatives with respect to the parameters w and b of each layer of the neural network are calculated and the parameters are updated by the gradient descent method, so that the actual output of the convolutional neural network moves closer to the expected value;
step 14: when the error is larger than the expected value, the error is propagated back into the convolutional neural network, and the errors of the fully connected layer, the down-sampling layer and the convolution layer are obtained in turn; when the error is equal to or less than the expected value, the training is finished;
step 15: judging from the obtained error whether the weights are optimal, and if not, updating the weights;
step 16: judging whether all epochs have been completed; if so, exiting the model training, otherwise carrying out the next round of training;
step 17: finishing the training of the model.
Further, the updating in step 15 includes updating the convolution layers and updating the fully connected layers:
the error is propagated back layer by layer with the back propagation algorithm, and the weights of each layer are updated by the gradient descent method.
Further, in step 13,
the forward propagation process of the convolution layer is as follows: a convolution operation is performed on the input data with a convolution kernel; the kernel slides over the whole input picture with a stride of 1, each position forming a local receptive field; within each local receptive field the weight matrix and the feature values of the picture are multiplied and summed, and the output is then obtained through an activation function;
the forward propagation process of the down-sampling layer is as follows: the features extracted by the preceding convolution layer are passed as input to the down-sampling layer, the dimensionality of the data is reduced by the pooling operation of the down-sampling layer, and the maximum value within each region of the feature map is selected by the max pooling method;
the forward propagation process of the fully connected layer is as follows: after the feature map has passed through the feature extraction of the convolution layers and the down-sampling layers, the extracted features are passed to the fully connected layer, which performs the classification to obtain a classification model and the final result; in the fully connected layer, the number of parameters is equal to the number of nodes in the layer multiplied by the number of input features, plus the number of nodes; after the output matrix is obtained, it is activated by an excitation function and passed to the next layer.
Further, step 2 comprises:
step 21: loading into the image recognition system the trained optimal weights that the training model of step 1 stored in a specific file;
step 22: obtaining the optimal weights of the convolution kernels of each layer of the training model through the weight sharing method, and loading the trained convolution kernel weights into the image recognition system;
step 23: the output of the last fully connected layer of the convolutional neural network in the training model is passed through a softmax classifier, which divides the training data set into the two classes correct and wrong; the image labels classified by the training model are loaded into the image recognition system;
step 24: performing normalization preprocessing on the picture to be recognized;
step 25: recognizing the picture with the convolutional neural network based recognition system and outputting the recognition result.
Compared with the prior art, the invention has the beneficial effects that:
During training, the method of the invention accelerates the training of the neural network in GPU mode, adds Dropout regularization to the training model to prevent overfitting, and expands the image set of the data set photos, for example by rotation, scaling and flipping; the model shows no overfitting when trained on the expanded data set. The loss function curve in FIG. 8 shows that in the later stage of training the loss keeps decreasing steadily as the learning rate of the model is gradually reduced, and that the curve begins to level off once the convolutional neural network training model reaches 25 iterations. The training accuracy curve in FIG. 9 shows that the accuracy of the training model is low during the first few iterations, because the model parameters have not yet been optimized after so few training iterations; as the number of training iterations increases, the recognition rate on the data set gradually rises, and the accuracy curve levels off once the number of iterations reaches 25. Taken together, the two curves show that the model reaches its optimal number of iterations at 25. With the training model designed on the basis of the convolutional neural network, the accuracy reaches 96%.
Drawings
FIG. 1 is an image used as the correct pattern in an embodiment of the present invention;
FIG. 2 is an image used as the wrong pattern in an embodiment of the present invention;
FIG. 3 is a first layer convolution structure according to an embodiment of the present invention;
FIG. 4 is a second layer convolution structure according to an embodiment of the present invention;
FIG. 5 is a third layer convolution structure according to an embodiment of the present invention;
FIG. 6 is a fourth layer convolution structure according to an embodiment of the present invention;
FIG. 7 is a fifth layer convolution structure according to an embodiment of the present invention;
FIG. 8 is a loss function descent curve according to an embodiment of the present invention;
FIG. 9 shows the training model recognition accuracy in accordance with an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
An image identification method based on a convolution neural network is characterized by comprising the following steps:
step 1, performing model training on an original picture by adopting a convolutional neural network;
step 2, inputting the picture to be processed into the trained model and recognizing the picture.
The model training with the convolutional neural network in step 1 comprises: preliminarily extracting image features through the convolution layers; extracting the main features through the down-sampling layers; aggregating the features of all parts through the fully connected layers; and generating a classifier for prediction and recognition;
The method specifically comprises the following steps:
step 11: initializing a weight value of the convolutional neural network;
step 12: carrying out forward propagation on input picture data through a convolution layer, a down-sampling layer and a full-connection layer to obtain an output value;
the output of each layer is characterized as follows:
$$y^{(l)} = f\left(\sum_{i \in m} w_i^{(l)} \otimes x_i^{(l-1)} + b^{(l)}\right)$$
wherein $y^{(l)}$ is the output of the convolutional layer, $f(x)$ is the nonlinear activation function, $m$ is the set of feature maps input to the layer, $w_i^{(l)}$ is the weight of the convolution kernel of the layer, $\otimes$ is the convolution operation, $x_i^{(l-1)}$ is the feature vector input to the convolutional layer, and $b^{(l)}$ is the offset;
step 13: solving the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not accord with the expected value, performing a back propagation process; calculating the error between the result and the expected value, returning the error layer by layer, calculating the error of each layer, and updating the weight; adjusting the network weight through training samples and expected values;
the parameters inside the model are determined from the forward-propagated predictions for the samples and the expected output values of the convolutional neural network; the objective function of the convolutional neural network is defined as:
Figure BDA0002637269530000065
where L(x) is the loss function, m is the number of samples,
Figure BDA0002637269530000066
is the desired output, and y is the sample output. The partial derivatives with respect to the parameters w and b of each layer of the neural network are calculated and the parameters are updated by the gradient descent method, so that the actual output of the convolutional neural network moves closer to the expected value;
step 14: when the error is larger than the expected value, the error is propagated back into the convolutional neural network, and the errors of the fully connected layer, the down-sampling layer and the convolution layer are obtained in turn; when the error is equal to or less than the expected value, the training is finished;
step 15: judging from the obtained error whether the weights are optimal, and if not, updating the weights;
step 16: judging whether all epochs have been completed; if so, exiting the model training, otherwise carrying out the next round of training;
step 17: finishing the training of the model.
The updating in step 15 includes updating the convolution layers and updating the fully connected layers:
the error is propagated back layer by layer with the back propagation algorithm, and the weights of each layer are updated by the gradient descent method.
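The gradient descent update of w and b can be illustrated on the smallest possible case, a single linear neuron. This is only a sketch under stated assumptions: the squared-error loss, the learning rate of 0.1 and the names `gradient_step` and `loss` are chosen for illustration, since the objective function of the patent is given only as an equation image.

```python
def loss(w, b, x, target):
    """Assumed squared-error loss 0.5 * (y - target)^2 for the neuron y = w*x + b."""
    return 0.5 * (w * x + b - target) ** 2


def gradient_step(w, b, x, target, lr):
    """One gradient-descent update of the weight w and the bias b."""
    error = (w * x + b) - target  # derivative of the loss with respect to y
    grad_w = error * x            # partial derivative with respect to w
    grad_b = error                # partial derivative with respect to b
    return w - lr * grad_w, b - lr * grad_b


w, b = 0.0, 0.0                   # initial weights (step 11)
history = []
for _ in range(20):               # repeated forward pass, error, update
    history.append(loss(w, b, x=2.0, target=1.0))
    w, b = gradient_step(w, b, x=2.0, target=1.0, lr=0.1)

final_loss = loss(w, b, x=2.0, target=1.0)
print(history[0], final_loss)
```

In a multi-layer network the same update is applied to every layer, with the per-layer error obtained by back propagation instead of the single subtraction used here.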
In step 13, the forward propagation process of the convolution layer is as follows: a convolution operation is performed on the input data with a convolution kernel; the kernel slides over the whole input picture with a stride of 1, each position forming a local receptive field; within each local receptive field the weight matrix and the feature values of the picture are multiplied and summed, and the output is then obtained through an activation function;
the forward propagation process of the down-sampling layer is as follows: the features extracted by the preceding convolution layer are passed as input to the down-sampling layer, the dimensionality of the data is reduced by the pooling operation of the down-sampling layer, and the maximum value within each region of the feature map is selected by the max pooling method;
the forward propagation process of the fully connected layer is as follows: after the feature map has passed through the feature extraction of the convolution layers and the down-sampling layers, the extracted features are passed to the fully connected layer, which performs the classification to obtain a classification model and the final result; in the fully connected layer, the number of parameters is equal to the number of nodes in the layer multiplied by the number of input features, plus the number of nodes; after the output matrix is obtained, it is activated by an excitation function and passed to the next layer.
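The parameter count rule for a fully connected layer (nodes × input features + nodes) can be written out directly. The sketch below applies it to the layer sizes of the embodiment, assuming the last pooled feature map is 6 × 6 × 256; the function name `dense_param_count` is hypothetical.

```python
def dense_param_count(num_inputs, num_nodes):
    """Weights (num_nodes * num_inputs) plus one bias per node."""
    return num_nodes * num_inputs + num_nodes


fc1_params = dense_param_count(6 * 6 * 256, 4096)  # flattened feature map -> 4096 nodes
fc2_params = dense_param_count(4096, 4096)         # 4096 -> 4096 nodes
fc3_params = dense_param_count(4096, 2)            # 4096 -> 2 output classes

print(fc1_params, fc2_params, fc3_params)
```

Under these assumptions the first fully connected layer alone accounts for roughly 37.8 million parameters, which is one reason Dropout is applied after it.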
In step 2:
step 21: loading into the image recognition system the trained optimal weights that the training model of step 1 stored in a specific file;
step 22: obtaining the optimal weights of the convolution kernels of each layer of the training model through the weight sharing method, and loading the trained convolution kernel weights into the image recognition system;
step 23: the output of the last fully connected layer of the convolutional neural network in the training model is passed through a softmax classifier, which divides the training data set into the two classes correct and wrong; the image labels classified by the training model are loaded into the image recognition system;
step 24: performing normalization preprocessing on the picture to be recognized;
step 25: recognizing the picture with the convolutional neural network based recognition system and outputting the recognition result.
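The softmax classification of step 23 can be sketched as follows. The label names in `LABELS`, the example scores and the function names are illustrative assumptions; only the use of a two-class softmax comes from the text.

```python
import math

LABELS = ["correct", "wrong"]  # hypothetical label names for the two classes


def softmax(scores):
    """Turn raw scores of the last fully connected layer into probabilities."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


label, confidence = classify([2.0, 0.0])
print(label, round(confidence, 4))
```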
The convolution layer is an important component of the convolutional neural network; it is mainly used to extract and abstract image features. The core of the convolution layer is the convolution operation, for which the image is first converted into a matrix. Assume an image of size 6 x 6 in which each pixel stores information of the image. A convolution kernel (equivalent to a set of weights) is defined to extract certain features from the image. The convolution kernel is multiplied element by element with the corresponding entries of the numerical matrix and the products are summed, giving the output of the convolution layer.
The values of the convolution kernel can be generated randomly by a function, without any prior learning experience, and are then adjusted step by step through training.
Once every pixel has been covered at least once, the output of one convolution layer has been generated (with a convolution stride of 1).
At first the machine does not know which features the part to be recognized has. It compares the output values obtained with different convolution kernels to determine which kernel best represents the features of the picture. For example, to recognize a feature in the image such as a curve, a suitable kernel gives a high output value for the curve and a low output value for other shapes (such as a triangle). The higher the output value of the convolution layer, the higher the degree of matching and the better the kernel expresses the features of the picture.
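The multiply-and-sum operation on a 6 x 6 image can be sketched in plain Python. The image and the 3 x 3 vertical-edge kernel below are illustrative assumptions; as in most CNN frameworks, the kernel is applied without flipping.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image; multiply element-wise and sum at each position."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# 6 x 6 image: dark left half (0), bright right half (1)
image = [[1 if col >= 3 else 0 for col in range(6)] for _ in range(6)]

# Kernel that responds to a dark-to-bright vertical edge
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The output is 4 x 4, and its values are high only where the window straddles the edge, which matches the description that a kernel produces a high output for the feature it represents and a low output elsewhere.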
The down-sampling layer, also called the pooling layer, works as follows:
the pooling layer mainly serves to reduce the number of parameters, increase computation speed, make the extracted features more robust and prevent overfitting; it is generally placed after a convolution layer, which reduces the size of the model and the feature dimension.
The two most common forms of pooling layer are:
Max pooling (max-pooling): the largest value in a given region is chosen to represent the whole region.
Mean pooling (mean-pooling): the average of the values in a given region is chosen to represent the whole region.
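Both pooling forms differ only in the function applied to each region, as the following sketch shows. The 4 x 4 feature map, the 2 x 2 window and the helper names are illustrative assumptions.

```python
def pool2d(feature_map, size, stride, reduce_fn):
    """Apply reduce_fn to every size x size window of the feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            window = [feature_map[r * stride + i][c * stride + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [[1, 3, 2, 4],
               [5, 7, 6, 8],
               [9, 2, 1, 0],
               [3, 4, 5, 6]]

max_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=max)
mean_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=mean)
print(max_pooled)
print(mean_pooled)
```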
The task of the convolution and pooling layers is to extract features and reduce the number of parameters arising from the original image. To generate the final output, however, a fully connected layer is needed to produce a classifier whose number of outputs equals the number of required classes.
The fully connected layer works like the ordinary neural network described earlier: the tensor output by the pooling layer is flattened into a vector, multiplied by the weight matrix and added to the bias; the ReLU activation function is then applied, and the parameters are optimized by the gradient descent method.
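The flatten, multiply, add-bias and ReLU sequence can be sketched as below. The tiny 2 x 2 x 2 tensor, the weights and the biases are made-up values for illustration, and the function names are hypothetical.

```python
def flatten(tensor):
    """Flatten a nested list (for example height x width x channels) into one vector."""
    if not isinstance(tensor, list):
        return [tensor]
    flat = []
    for item in tensor:
        flat.extend(flatten(item))
    return flat


def relu(value):
    return max(0.0, value)


def dense_forward(inputs, weights, biases):
    """For each node: weighted sum of the inputs plus bias, followed by ReLU."""
    return [relu(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
            for node_weights, bias in zip(weights, biases)]


pooled = [[[1.0, -1.0], [2.0, 0.0]],
          [[0.5, 1.0], [0.0, 3.0]]]   # illustrative 2 x 2 x 2 pooling output
vector = flatten(pooled)              # 8 input features

weights = [[1.0] * 8, [-1.0] * 8]     # 2 nodes, 8 weights each (made-up values)
biases = [0.5, 0.5]

outputs = dense_forward(vector, weights, biases)
print(vector)
print(outputs)
```

The second node receives a negative pre-activation, so ReLU sets its output to zero.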
Example:
In this embodiment the model is trained for 40 epochs, during which the learning rate is updated: a larger learning rate is set at the beginning of training and is gradually reduced as the total error of the system decreases. The optimal weights are saved each time an epoch of training is completed, so that the neural network model can be deployed later. The training system is optimized with the stochastic gradient descent (SGD) method, and mini-batch training is used to accelerate the convergence of the model. After the 40 epochs of training are finished, the optimal weights found in training are saved; during model prediction the saved optimal weights are loaded directly to initialize the prediction parameters, and prediction of pictures can begin.
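Mini-batch splitting and a decreasing learning rate can be sketched as follows. The text does not give the decay rule or the initial rate, so the exponential decay, the factor 0.9, the initial rate 0.01 and the batch size of 32 used here are assumptions chosen only for illustration.

```python
import random


def minibatches(num_samples, batch_size, rng):
    """Yield shuffled index lists, one per mini-batch, covering every sample once."""
    indices = list(range(num_samples))
    rng.shuffle(indices)
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]


def decayed_learning_rate(initial_lr, epoch, decay=0.9):
    """Assumed schedule: larger rate at the start, multiplied by `decay` each epoch."""
    return initial_lr * decay ** epoch


rng = random.Random(0)                 # fixed seed so the shuffle is reproducible
batches = list(minibatches(num_samples=70, batch_size=32, rng=rng))
schedule = [decayed_learning_rate(0.01, epoch) for epoch in range(40)]

print([len(batch) for batch in batches])
print(schedule[0], schedule[-1])
```

In an actual SGD loop, each epoch would iterate over such batches, apply one weight update per batch with the learning rate for that epoch, and save the weights whenever the result improves.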
Before training begins, the pictures to be trained are loaded and the training set is preprocessed, including picture normalization and unification of the picture channels. The model is then built and trained, that is, forward propagation and back propagation are started, with back propagation optimized by the stochastic gradient descent method. Each time an optimization pass is completed, the result is checked to see whether it has improved; if so, the relevant weights are updated, otherwise it is checked whether all epochs have been trained. If not all epochs have been trained, the process returns to the training model and continues; otherwise the training of the whole model is finished.
In model prediction, the trained model parameters are loaded, together with the label values of the image classes so that the prediction result can be output later. The image to be classified is then passed in from the user side; after the system obtains the image to be recognized, it displays and preprocesses the image, unifies the relevant parameters, predicts with the loaded neural network, and finally outputs the recognition result for the current image, completing the whole image recognition process.
The data set is divided into 2 classes and comprises 70 training images, used to optimize the model parameters, and 10 test images, used to test the recognition performance of the model. Two patterns are selected: a red, green and blue three-color chart is used as the correct pattern, as shown in FIG. 1, and a chart that is not red, green and blue is used as the wrong pattern, as shown in FIG. 2. 40 photos of each pattern are taken at different angles; 35 photos of each pattern are used as the training set for optimizing the network parameters, and the remaining 5 photos of each pattern are used as the verification set. The process is as follows:
(1) The input picture matrix is preprocessed and normalized, and pictures of size 128x128 are fed into the network.
(2) The first convolution structure uses 96 convolution kernels of 20 × 20 with a convolution stride of 2 and valid padding; the output feature map is 55 × 55 × 96. After normalization and PReLU activation, max pooling is performed with a 3 × 3 local receptive area, a pooling stride of 2 and valid padding; the output feature map is 27 × 27 × 96. As shown in FIG. 3.
(3) The second convolution structure takes the 27 × 27 × 96 feature map as input and uses 256 convolution kernels of 5 × 5 with a convolution stride of 1 and same padding; the output feature map is 27 × 27 × 256. After normalization and PReLU activation, max pooling is performed with a 3 × 3 local receptive area, a pooling stride of 2 and valid padding; the output feature map is 13 × 13 × 256. As shown in FIG. 4.
(4) The third convolution structure takes the 13 × 13 × 256 feature map as input and uses 384 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding; this layer only performs normalization and PReLU activation, without pooling, and the output feature map is 13 × 13 × 384. As shown in FIG. 5.
(5) The fourth convolution structure takes the 13 × 13 × 384 feature map as input and uses 384 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding; this layer only performs normalization and PReLU activation, without pooling, and the output feature map is 13 × 13 × 384. As shown in FIG. 6.
(6) The fifth convolution structure takes the 13 × 13 × 384 feature map as input and uses 256 convolution kernels of 3 × 3 with a convolution stride of 1 and same padding; the output feature map is 13 × 13 × 256. After normalization and PReLU activation, max pooling is performed with a 3 × 3 local receptive area, a pooling stride of 2 and valid padding; the output feature map is 6 × 6 × 256. As shown in FIG. 7.
(7) In the first fully connected layer, the feature map output by the fifth convolution structure is flattened into a one-dimensional feature vector; the layer has 4096 output units, and a Dropout layer with a parameter of 0.2 is applied to prevent overfitting. The output feature map is 4096 × 1.
(8) The second fully connected layer takes the output of the first fully connected layer as input; it has 4096 output units and a Dropout layer parameter of 0.25. The output feature map of this layer is therefore 4096 × 1.
(9) The third fully connected layer takes the 4096 × 1 feature map as input; it has 2 output units, and the output feature map is 2 × 1.
(10) Finally, the 2 × 1 feature map output by the third fully connected layer is input to the softmax classifier, which outputs the data classified into 2 classes.
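The spatial sizes listed in (2) to (6) can be checked with the standard output-size formulas for valid and same padding. In this sketch the kernel sizes, strides and padding modes come from the layer description, with the second pooling window taken as 3 × 3 as in the program listing; the function name `output_size` and the layer labels are hypothetical.

```python
import math


def output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer along one axis."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid": no padding


# (name, kernel size, stride, padding)
layers = [
    ("conv1", 20, 2, "valid"),
    ("pool1", 3, 2, "valid"),
    ("conv2", 5, 1, "same"),
    ("pool2", 3, 2, "valid"),
    ("conv3", 3, 1, "same"),
    ("conv4", 3, 1, "same"),
    ("conv5", 3, 1, "same"),
    ("pool5", 3, 2, "valid"),
]

size = 128  # input pictures are 128 x 128
sizes = {}
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes[name] = size

flattened = size * size * 256  # features entering the first fully connected layer
print(sizes)
print(flattened)
```

The sizes 55, 27, 13 and 6 agree with the feature maps stated for the first, second and fifth convolution structures.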
The convolutional neural network used in the experiment is described above; the corresponding program is as follows:
(1) Program for the convolutional and pooling layers of the convolutional neural network.
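# Assumed import path; the source does not show its imports.
from tensorflow.keras.layers import Conv2D, MaxPooling2D, PReLU
# input_dim (the input tensor) and bn_relu (presumably batch normalization followed by an activation) are defined elsewhere and are not shown in the source.
# Layer 1: 96 kernels of 20 x 20, stride 2, then 3 x 3 max pooling with stride 2
x = Conv2D(96, (20, 20), strides=(2, 2), padding='valid')(input_dim)
x = bn_relu(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)
# Layer 2: 256 kernels of 5 x 5, stride 1, then 3 x 3 max pooling with stride 2
x = Conv2D(256, (5, 5), strides=(1, 1), padding='same')(x)
x = bn_relu(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)
# Layers 3 and 4: 384 kernels of 3 x 3, stride 1, PReLU activation, no pooling
x = Conv2D(384, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
x = Conv2D(384, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
# Layer 5: 256 kernels of 3 x 3, stride 1, PReLU activation, then 3 x 3 max pooling with stride 2
x = Conv2D(256, (3, 3), strides=(1, 1), padding='same')(x)
x = PReLU()(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='valid')(x)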
(2) Program for the fully-connected layers of the convolutional neural network.
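# Assumed import path; the source does not show its imports.
from tensorflow.keras.layers import Flatten, Dense, Dropout
x = Flatten()(x)              # flatten the pooled feature map into a one-dimensional vector
fc1 = Dense(4096)(x)          # first fully-connected layer, 4096 units
dr1 = Dropout(0.2)(fc1)
fc2 = Dense(4096)(dr1)        # second fully-connected layer, 4096 units
dr2 = Dropout(0.25)(fc2)
fc3 = Dense(out_dims)(dr2)    # out_dims: number of classes (2 in this embodiment)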
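To give a sense of the size of the three fully-connected layers, their trainable parameters can be counted by hand. The sketch below assumes that the flattened input has 6 × 6 × 256 = 9216 features and that each Dense layer carries one bias per unit (the Keras default); the helper name dense_params is hypothetical.

```python
def dense_params(n_inputs, n_units, use_bias=True):
    """Trainable parameters of a fully-connected layer: weights plus optional biases."""
    weights = n_inputs * n_units
    biases = n_units if use_bias else 0
    return weights + biases

flat_features = 6 * 6 * 256  # assumed size of the flattened feature map (9216)
layer_shapes = [(flat_features, 4096), (4096, 4096), (4096, 2)]

counts = [dense_params(n_in, n_out) for n_in, n_out in layer_shapes]
total = sum(counts)
print(counts)  # [37752832, 16781312, 8194]
print(total)   # 54542338
```

Under these assumptions the first fully-connected layer alone accounts for most of the parameters, which is consistent with Dropout being applied after it.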
In this embodiment, the model is trained for a maximum of 40 epochs with a batch size of 128, and a GPU is used to accelerate training of the neural network. Dropout regularization is added to the training model to prevent overfitting during training, and the photos in the data set are augmented, for example by rotation, scaling and flipping; the model shows no overfitting on the augmented data set during training. As can be seen from the loss function graph in fig. 8, in the later stage of training the loss keeps decreasing steadily as the learning rate of the model gradually decreases, and once the convolutional neural network training model reaches 25 iterations the loss curve begins to level off. As can be seen from the training accuracy graph in fig. 9, the accuracy of the training model is low during the first few iterations because the model parameters have not yet been optimized after so few iterations; as the number of training iterations increases, the recognition rate of the model on the data set gradually rises, and when the number of iterations reaches 25 the accuracy curve also levels off. Taken together, the two graphs indicate that the model reaches its optimal number of iterations at 25. With the training model designed based on the convolutional neural network, the accuracy can reach 96%.
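The data set expansion mentioned above can be illustrated with two simple geometric transforms. This is a minimal sketch on nested Python lists, not the augmentation pipeline of the embodiment: the function names are hypothetical, only a horizontal flip and a 90-degree rotation are shown, and the rotation angles and scaling factors actually used are not stated in the source.

```python
def flip_horizontal(img):
    """Mirror an image (list of rows) left to right."""
    return [row[::-1] for row in img]

def rotate_90_clockwise(img):
    """Rotate an image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original image together with its augmented variants."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img)]

# Tiny 2 x 2 stand-in for a photo, for illustration only
sample = [[1, 2],
          [3, 4]]
variants = augment(sample)
print(variants[1])  # [[2, 1], [4, 3]]
print(variants[2])  # [[3, 1], [4, 2]]
```

Each variant keeps the label of the original photo, so the training set grows without new labelling work.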
The above description illustrates the preferred embodiment of the present invention and should not be taken as limiting the invention; the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Claims (4)

1. An image identification method based on a convolutional neural network, characterized by comprising the following steps:
step 1, performing model training on original pictures by adopting a convolutional neural network;
step 2, inputting the picture to be processed into the trained model, and identifying the picture; the step 1 of performing model training by using the convolutional neural network comprises the following steps: preliminarily extracting image features through the convolution layer; extracting main features through a down-sampling layer; aggregating the features of all parts through a fully-connected layer; generating a classifier for prediction and identification;
the method specifically comprises the following steps:
step 11: initializing a weight value of the convolutional neural network;
step 12: carrying out forward propagation on input picture data through a convolution layer, a down-sampling layer and a full-connection layer to obtain an output value; the characteristics of each layer output are as follows:
[Formula image: Figure FDA0003629346270000011]
wherein y^(l) is the output of the convolutional layer, f(x) is the nonlinear activation function, m is the set of feature maps input to the layer, [symbol image: Figure FDA0003629346270000012] is the weight of the convolution kernel of the layer, [symbol image: Figure FDA0003629346270000013] denotes the convolution operation, [symbol image: Figure FDA0003629346270000014] is the feature vector input to the convolutional layer, and b^l is the bias;
step 13: solving the error between the output value of the convolutional neural network and the target value; when the result output by the convolutional neural network does not accord with the expected value, performing a back propagation process; calculating the error between the result and the expected value, returning the error layer by layer, calculating the error of each layer, and updating the weight; adjusting the network weight through training samples and expected values;
the parameters in the model are determined from the sample predictions obtained by forward propagation and the expected outputs of the convolutional neural network; an objective function of the convolutional neural network is defined:
[Formula image: Figure FDA0003629346270000015]
where L(x) is the loss function, m is the number of samples, [symbol image: Figure FDA0003629346270000016] is the expected output, and y is the sample output; the partial derivatives with respect to the parameters w and b of each layer in the neural network are computed by applying the gradient descent method to obtain the updated parameter values of the convolutional neural network, so that the actual output of the convolutional neural network moves closer to the expected value;
step 14: when the error is larger than the expected value, the error is transmitted back to the convolutional neural network, and the errors of the full connection layer, the down sampling layer and the convolutional layer are sequentially obtained; when the error is equal to or less than the expected value, finishing the training;
step 15: judging whether the weight is optimal according to the obtained error, and if not, updating the weight;
step 16: judging whether the specified number of epochs has been completed; if so, exiting the model training, otherwise carrying out the next round of training;
step 17: finishing the training of the training model.
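The forward propagation of a convolution layer described in step 12 can be sketched for a single output feature map as follows. This is an illustration under stated assumptions, not the claimed implementation: the formula itself is only available as an image, so the sketch uses the common form y = f(sum over input maps of (w convolved with x) + b), computes the "convolution" as cross-correlation without flipping the kernel (as deep learning frameworks do), and uses PReLU with a fixed slope of 0.25 as the activation. All function names and numbers are hypothetical.

```python
def conv2d_valid(x, w):
    """Slide kernel w over map x with stride 1 and no padding (cross-correlation)."""
    kh, kw = len(w), len(w[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def prelu(v, alpha=0.25):
    """PReLU activation with a fixed illustrative slope for negative inputs."""
    return v if v > 0 else alpha * v

def conv_layer_forward(inputs, kernels, bias, activation=prelu):
    """One output map: activation of the summed per-map convolutions plus the bias."""
    maps = [conv2d_valid(x, w) for x, w in zip(inputs, kernels)]
    out_h, out_w = len(maps[0]), len(maps[0][0])
    return [[activation(sum(m[i][j] for m in maps) + bias) for j in range(out_w)]
            for i in range(out_h)]

# Two 3 x 3 input feature maps and one 2 x 2 kernel per map (made-up values)
x1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x2 = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
k1 = [[1, 0], [0, -1]]
k2 = [[1, 1], [1, 1]]
y = conv_layer_forward([x1, x2], [k1, k2], bias=0.5)
print(y)  # [[-0.375, -0.375], [-0.375, -0.375]]
```

Here every window sums to -4 for the first map and 2 for the second, so the pre-activation value is -1.5 and the PReLU output is -0.375 at each position.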
2. The method of claim 1, wherein the updates in step 15 include convolutional layer updates and fully-connected layer updates: the error is returned layer by layer using back propagation, and the weight of each layer is updated using the gradient descent method.
3. The method of claim 1, wherein, in step 13,
the forward propagation process of the convolution layer is to perform a convolution operation on the input data through a convolution kernel: the convolution kernel slides over the whole input picture with a step length of 1 to form local receptive fields, the convolution is then computed on each local receptive field as a weighted sum of the weight matrix and the feature values of the picture, and the output is obtained through an activation function;
the forward propagation process of the down-sampling layer is that the features extracted by the preceding convolution layer are passed as input to the down-sampling layer, the dimensionality of the data is reduced through the pooling operation of the down-sampling layer, and the maximum value in each region of the feature map is selected by adopting the max pooling method;
the forward propagation process of the fully-connected layer is that, after features have been extracted from the feature map by the convolution layers and the down-sampling layers, the extracted features are passed to the fully-connected layer, and classification is carried out through the fully-connected layer to obtain a classification model and the final result; in the fully-connected layer, the number of parameters is equal to the number of nodes in the fully-connected layer multiplied by the number of input features, and after the output matrix is obtained, it is passed through an activation function and then transmitted to the next layer.
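The max pooling step of the down-sampling layer can be sketched as below. This is a minimal illustration on nested Python lists, assuming a 3 × 3 window, a stride of 2 and no padding as in the described embodiment; the function name max_pool2d and the sample values are hypothetical.

```python
def max_pool2d(x, pool=3, stride=2):
    """Max pooling without padding: keep the largest value in each window."""
    out_h = (len(x) - pool) // stride + 1
    out_w = (len(x[0]) - pool) // stride + 1
    return [[max(x[i * stride + a][j * stride + b]
                 for a in range(pool) for b in range(pool))
             for j in range(out_w)]
            for i in range(out_h)]

# 5 x 5 feature map holding the values 0..24 row by row (made-up values)
feature_map = [[r * 5 + c for c in range(5)] for r in range(5)]
pooled = max_pool2d(feature_map)
print(pooled)  # [[12, 14], [22, 24]]
```

Applied to a 13 × 13 map, the same function returns a 6 × 6 map, which is the reduction described for the pooling after the fifth convolutional layer.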
4. The method of claim 1, wherein, in step 2,
step 21: loading into an image recognition system the trained optimal weights that the training model of step 1 stored in a specific file;
step 22: obtaining the optimal weight of each layer of convolution kernels in the training model through a weight sharing method, and loading the trained convolution kernel weights into the image recognition system;
step 23: the output of the last fully-connected layer of the convolutional neural network in the training model is passed through a softmax classifier, which divides the training data set into two classes, correct and wrong, and the image labels classified by the training model are loaded into the image recognition system;
step 24: carrying out normalization preprocessing on the picture to be recognized;
step 25: performing recognition by using the recognition system based on the convolutional neural network, and outputting the recognition result.
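The normalization preprocessing of step 24 and the softmax classification used in step 23 can be sketched as follows. This is an illustration under stated assumptions: the source does not say how pictures are normalized, so scaling 8-bit pixel values into the range 0 to 1 is assumed, and the function names, class labels and logit values are hypothetical.

```python
import math

def normalize_pixels(img, max_value=255.0):
    """Scale pixel values into [0, 1] (assumed normalization scheme)."""
    return [[p / max_value for p in row] for row in img]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, labels):
    """Return the label with the highest softmax probability and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up output of the last fully-connected layer for one picture
label, prob = predict([2.0, 0.0], ["class_0", "class_1"])
print(label)
```

The two softmax outputs sum to 1, so the larger one can be read as the confidence of the recognition result for the two-class case.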
CN202010829114.2A 2020-08-18 2020-08-18 Convolutional neural network based image identification method Active CN112115973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010829114.2A CN112115973B (en) 2020-08-18 2020-08-18 Convolutional neural network based image identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010829114.2A CN112115973B (en) 2020-08-18 2020-08-18 Convolutional neural network based image identification method

Publications (2)

Publication Number Publication Date
CN112115973A CN112115973A (en) 2020-12-22
CN112115973B true CN112115973B (en) 2022-07-19

Family

ID=73803747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010829114.2A Active CN112115973B (en) 2020-08-18 2020-08-18 Convolutional neural network based image identification method

Country Status (1)

Country Link
CN (1) CN112115973B (en)

Also Published As

Publication number Publication date
CN112115973A (en) 2020-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant