CN111767860A - Method and terminal for realizing image recognition through convolutional neural network - Google Patents

Publication number
CN111767860A
Authority
CN
China
Prior art keywords
neural network
convolutional neural
data set
network model
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010613939.0A
Other languages
Chinese (zh)
Inventor
仲会娟
蔡清泳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yango University
Original Assignee
Yango University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a terminal for realizing image recognition through a convolutional neural network. The method comprises: acquiring a data set and preprocessing the data set; setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers; and training the convolutional neural network model according to the data set and performing image recognition with the trained convolutional neural network model. By cascading the output of every pooling layer, the method and the terminal can make full use of the features of the image at every scale for classification, which greatly improves the accuracy and speed of classification by the convolutional neural network.

Description

Method and terminal for realizing image recognition through convolutional neural network
Technical Field
The invention relates to the field of image processing, and in particular to a method and a terminal for realizing image recognition through a convolutional neural network.
Background
With the development of the global economy, living standards have risen greatly and car ownership has increased sharply, which gives people more choice and convenience in daily travel. However, the contradiction between this growth and the relatively lagging construction of traffic safety infrastructure and the relatively weak level of traffic safety management has become increasingly prominent, so that traffic accidents and traffic jams occur frequently and have become a major social problem affecting people's lives. Intelligent transportation systems are therefore receiving wide attention, and active safe driving technology and unmanned driving technology have become research focuses for scholars and enterprises at home and abroad. Road traffic sign recognition is an important component of active safe driving systems and automatic driving systems and plays a major role in road driving safety. Because the personal safety of passengers is at stake, the automotive industry places very strict requirements on safety and reliability; a traffic sign recognition system must therefore achieve both high recognition accuracy and real-time recognition, so traffic sign recognition remains a challenging task.
In recent years, convolutional neural network methods such as LeNet, AlexNet, VGG, GoogLeNet, YOLO and ResNet have achieved remarkable performance in the field of image detection and recognition. The conventional LeNet-5 lightweight convolutional neural network model shown in fig. 1 includes five layers. For the recognition of traffic sign images, an excessively complex convolutional neural network can provide a more reliable recognition result, but it also wastes resources unnecessarily and computes more slowly.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a terminal for realizing image recognition through a convolutional neural network that can recognize images quickly and correctly.
To solve the above technical problem, the invention adopts the following technical solution:
A method for realizing image recognition through a convolutional neural network, comprising the steps of:
S1, acquiring a data set and preprocessing the data set;
S2, setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers;
S3, training the convolutional neural network model according to the data set, and performing image recognition with the trained convolutional neural network model.
To solve the above technical problem, the invention adopts another technical solution as follows:
A terminal for realizing image recognition through a convolutional neural network, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
S1, acquiring a data set and preprocessing the data set;
S2, setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers;
S3, training the convolutional neural network model according to the data set, and performing image recognition with the trained convolutional neural network model.
The invention has the following beneficial effects: a data set is acquired and preprocessed, and the output features of every pooling layer are cascaded, so that both the local features and the global features of the image are fully used and the image is analyzed according to feature information at different scales, which greatly improves the accuracy of the output result. Preprocessing ensures that the input images meet the processing conditions of the convolutional neural network, which further improves the accuracy of the output result; it also makes the images uniform, which increases the speed at which the convolutional neural network analyzes them.
Drawings
FIG. 1 is a diagram of a conventional convolutional neural network model LeNet-5;
FIG. 2 is a flowchart illustrating steps of a method for performing image recognition via a convolutional neural network, according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a terminal for implementing image recognition through a convolutional neural network according to an embodiment of the present invention;
FIG. 4 is a sample distribution diagram of a data set according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the GTSRB data set sample distribution according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of image augmentation according to an embodiment of the present invention;
FIG. 7 is a block diagram of a convolutional neural network model according to an embodiment of the present invention;
FIG. 8 is a graph of the loss function variation during the convolutional neural network model training process according to an embodiment of the present invention;
FIG. 9 is a graph of the accuracy variation during the convolutional neural network model training process according to an embodiment of the present invention;
FIG. 10 shows the candidate set of convolution kernel sizes in the convolutional neural network model and the corresponding run results, according to an embodiment of the present invention;
FIG. 11 shows the candidate set of Dropout parameters in the convolutional neural network model and the corresponding run results, according to an embodiment of the present invention;
FIG. 12 shows the candidate set of fully connected layer neuron numbers in the convolutional neural network model and the corresponding run results, according to an embodiment of the present invention;
FIG. 13 is a comparison of model data for different convolutional neural networks according to embodiments of the present invention;
Description of reference numerals:
1. a terminal for realizing image recognition through a convolutional neural network; 2. a processor;
3. a memory.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
Referring to fig. 2, a method for realizing image recognition through a convolutional neural network includes the steps of:
S1, acquiring a data set and preprocessing the data set;
S2, setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers;
S3, training the convolutional neural network model according to the data set, and performing image recognition with the trained convolutional neural network model.
From the above description, the beneficial effects of the present invention are: a data set is acquired and preprocessed, and the output features of every pooling layer are cascaded, so that both the local features and the global features of the image are fully used and the image is analyzed according to feature information at different scales, which greatly improves the accuracy of the output result. Preprocessing ensures that the input images meet the processing conditions of the convolutional neural network, which further improves the accuracy of the output result; it also makes the images uniform, which increases the speed at which the convolutional neural network analyzes them.
Further, the convolutional neural network model further comprises a global average pooling layer;
in S2, cascading the output features of the pooling layers specifically includes:
unifying the size of the output features of each of the pooling layers;
concatenating the size-unified output features of the pooling layers into one tensor;
inputting the tensor into the global average pooling layer.
As can be seen from the above description, the outputs of the pooling layers are connected to the input of the global average pooling layer after size conversion and feature fusion, so that the multi-scale features are condensed in the global average pooling layer and can be handed conveniently to the subsequent classifier. The global average pooling layer replaces the conventional flatten layer (Flatten), which makes the conversion from feature maps to classification more natural, and traffic signs are classified using feature information at different scales, such as the local features and the global features of the image.
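The three cascading steps above can be illustrated with a short NumPy sketch. It is only one reading of the description, under stated assumptions: the spatial sizes 16 × 16, 8 × 8 and 4 × 4 and the channel counts 32, 64 and 128 are taken from the second embodiment, size unification is assumed to be non-overlapping max pooling, and the function names `max_pool` and `cascade_and_gap` are hypothetical.

```python
import numpy as np

def max_pool(x, k):
    """Non-overlapping k x k max pooling on an (H, W, C) array.
    H and W must be divisible by k."""
    h, w, c = x.shape
    return x.reshape(h // k, k, w // k, k, c).max(axis=(1, 3))

def cascade_and_gap(pool1, pool2, pool3):
    """Unify the pooling outputs to pool3's spatial size, concatenate them
    along the channel axis, then apply global average pooling."""
    target = pool3.shape[0]
    # Assumption: size unification is done by max pooling.
    unified = [max_pool(p, p.shape[0] // target) for p in (pool1, pool2)]
    merged = np.concatenate(unified + [pool3], axis=-1)  # (4, 4, C1 + C2 + C3)
    return merged.mean(axis=(0, 1))                      # one value per channel

rng = np.random.default_rng(0)
p1 = rng.random((16, 16, 32))   # stand-in for the first pooling output
p2 = rng.random((8, 8, 64))     # stand-in for the second pooling output
p3 = rng.random((4, 4, 128))    # stand-in for the third pooling output
features = cascade_and_gap(p1, p2, p3)
print(features.shape)  # (224,)
```

The resulting vector has one entry per channel of the merged tensor, so its length does not depend on the spatial size of the feature maps, unlike the output of a flatten layer.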
Further, the data set consists of road traffic sign images acquired by a vehicle-mounted camera in an actual traffic environment;
the road traffic sign images in the data set include traffic sign images captured under different lighting conditions, degrees of occlusion, shooting angles and vehicle speeds.
The preprocessing of the data set in S1 includes:
S11, normalizing the size of the images in the data set to obtain a first data set;
S12, normalizing the pixels of the images in the first data set to obtain a second data set;
S13, augmenting the images in the second data set to obtain a third data set;
the S3 specifically includes: training the convolutional neural network model according to the third data set.
As can be seen from the above description, the practicality of the trained model is taken into account when the traffic sign images of the data set are selected: images captured by a vehicle-mounted camera in an actual traffic environment are used, so that the trained model meets the requirements of actual use. Covering different lighting conditions, degrees of occlusion, shooting angles and vehicle speeds allows the final model to recognize images captured in different scenes and improves its robustness. Preprocessing the data set before training helps ensure the accuracy with which the final model recognizes images.
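Steps S11 and S12 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the 32 × 32 target size, bilinear interpolation and the 0-255 to 0-1 range come from the first embodiment, while the corner-aligned sampling grid and the function names `resize_bilinear` and `preprocess` are assumptions.

```python
import numpy as np

def resize_bilinear(img, out_h=32, out_w=32):
    """Resize an (H, W, C) image by bilinear interpolation.
    Assumption: a corner-aligned sampling grid."""
    h, w = img.shape[:2]
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]   # vertical interpolation weights
    wx = (xs - x0)[None, :, None]   # horizontal interpolation weights
    img = img.astype(np.float64)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def preprocess(img):
    """S11: size normalization, then S12: pixel values scaled from 0-255 to 0-1."""
    return resize_bilinear(img) / 255.0

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(50, 40, 3), dtype=np.uint8)  # synthetic stand-in image
sample = preprocess(raw)
print(sample.shape, float(sample.min()) >= 0.0, float(sample.max()) <= 1.0)
```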
Further, the augmentation is specifically:
S131, rotating the image about its center in a random direction, in units of 10 degrees;
S132, randomly translating the image by 4 pixels, removing the vacated part on one side, and stretching the image back to its original size;
S133, randomly flipping the image;
the operations S131, S132 and S133 are performed in sequence, but the order is not limited thereto.
As can be seen from the above description, augmenting the data set yields a data set with a sufficient amount of data for training the model, which prevents the model from overfitting and at the same time improves its generalization ability.
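One possible reading of S131 to S133 is sketched below in NumPy. Several details are assumptions not fixed by the text: nearest-neighbour sampling, zero fill outside the rotated image, a horizontal flip with probability 0.5, and a shift of exactly 4 pixels along both axes. The function names are hypothetical.

```python
import numpy as np

def rotate(img, degrees):
    """Rotate an (H, W, C) image about its centre.
    Nearest-neighbour sampling; pixels with no source are left at zero."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    t = np.deg2rad(degrees)
    yy, xx = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, locate its source pixel.
    sx = np.rint(np.cos(t) * (xx - cx) + np.sin(t) * (yy - cy) + cx).astype(int)
    sy = np.rint(-np.sin(t) * (xx - cx) + np.cos(t) * (yy - cy) + cy).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def translate_and_stretch(img, dx, dy):
    """Shift by (dx, dy) pixels, drop the vacated border, stretch back to (H, W)."""
    h, w = img.shape[:2]
    # Part of the original image that stays visible after the shift.
    crop = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    rows = np.arange(h) * crop.shape[0] // h
    cols = np.arange(w) * crop.shape[1] // w
    return crop[rows][:, cols]

def augment(img, rng):
    """Apply S131, S132 and S133 in sequence."""
    angle = 10 * int(rng.integers(-17, 19))   # multiple of 10 degrees, either direction
    dx, dy = (int(s) for s in rng.choice([-4, 4], size=2))
    out = translate_and_stretch(rotate(img, angle), dx, dy)
    if rng.random() < 0.5:                    # assumption: horizontal flip
        out = out[:, ::-1]
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))               # synthetic stand-in for a normalized image
augmented = augment(image, rng)
print(augmented.shape)  # (32, 32, 3)
```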
Further, the convolutional neural network further includes: a plurality of convolutional layers alternating with the pooling layers;
each single convolutional layer is formed by cascading a plurality of small convolutional layers; apart from the small convolutional layers at the head and the tail, the input of each small convolutional layer is connected to the output of the preceding small convolutional layer.
From the above description, it can be seen that cascading a plurality of small convolutional layers in place of the traditional single large convolutional layer keeps the receptive field large while effectively controlling the number of parameters.
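The trade-off can be checked with simple arithmetic. The sketch below assumes two stride-1 5 × 5 layers per cascaded convolutional layer, which matches the 9 × 9 receptive field mentioned in the second embodiment, and an equal input and output channel count of 64, which is an assumption chosen only for illustration.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied one after another."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

def conv_weights(kernel_size, in_ch, out_ch):
    """Number of weights (biases excluded) of one square convolution layer."""
    return kernel_size * kernel_size * in_ch * out_ch

channels = 64  # assumed channel count, for illustration only
small = 2 * conv_weights(5, channels, channels)   # two cascaded 5 x 5 layers
large = conv_weights(9, channels, channels)       # one 9 x 9 layer
print(stacked_receptive_field([5, 5]), small, large)  # 9 204800 331776
```

Under these assumptions the two cascaded layers cover the same 9 × 9 window with 50 weights per channel pair instead of 81.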
Further, a batch normalization algorithm is set after each convolutional layer:
$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$$
$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$
$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$
$$y_i = \gamma \hat{x}_i + \beta$$
wherein $B = \{x_1, x_2, x_3, \ldots, x_m\}$ represents a batch of m data input into the convolutional neural network model, $\mu_B$ represents the mean of the m data, $\sigma_B^2$ represents the variance of the m data, $\epsilon$ is a small positive number, $\gamma$ is a scaling (transformation) factor, and $\beta$ is a translation factor.
From the above description, it can be seen that setting the batch normalization algorithm after each convolutional layer increases the convergence speed of the model; adding a small positive number prevents the divisor from being 0, and adding a scaling factor and a translation factor enhances the expressive power of the network.
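The batch normalization computation can be written out in a few lines of NumPy. The sketch below is simplified on purpose: it normalizes an (m, features) array over the batch axis, whereas after a convolutional layer the statistics would normally also be taken over the spatial axes of each channel. The value `eps=1e-5` and the function name `batch_norm` are assumptions.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a batch x of shape (m, features) over the batch axis,
    then apply the scaling factor gamma and the translation factor beta."""
    mu = x.mean(axis=0)                     # batch mean
    var = x.var(axis=0)                     # batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # eps keeps the divisor away from 0
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.random((64, 8))                 # m = 64 samples, 8 features
normalized = batch_norm(batch, gamma=1.0, beta=0.0)
print(np.allclose(normalized.mean(axis=0), 0.0, atol=1e-7))  # True
```

With a constant batch the variance is 0, and the small positive number is what keeps the division defined: the output is then simply the translation factor.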
Further, the convolutional neural network further includes: a fully connected layer;
setting the initial parameters of the convolutional neural network model in S2 specifically includes:
setting the number of convolution kernels, the weights of the convolution kernels, the biases and the number of neurons of the fully connected layer;
setting an initial learning rate, a target minimum error, a training period and the number of samples selected in a single training step.
As can be seen from the above description, the initial parameters of the convolutional neural network model can be set according to different conditions, so that the behavior of the model can be controlled to meet different requirements, which makes the model highly flexible.
Further, the data set in S1 includes a training set and a test set;
the S3 specifically includes:
training the convolutional neural network model according to the training set to obtain a first convolutional neural network model;
verifying the first convolutional neural network model according to the test set to obtain the trained convolutional neural network model.
As can be seen from the above description, the data set is divided into a training set and a test set; the training set is used to train the convolutional neural network model, and the model is then verified on the test set. If the model does not meet the performance requirement, it is adjusted according to the verification result, which further ensures the recognition accuracy of the final model.
Further, training the convolutional neural network model according to the data set in S3 specifically includes:
S31, selecting a cross-entropy loss function as the objective function, outputting the recognition result through a Softmax classifier, setting EPOCHS to 50, BATCH_SIZE to 64 and the initial learning rate to 0.001, and using the Adam optimization algorithm, with the learning rate gradually decaying as the number of iterations increases;
S32, feeding the data set into the convolutional neural network model and calculating the forward output;
S33, calculating the error of the forward output, and updating the weights and biases in the convolutional neural network model by means of the back propagation algorithm;
S34, repeating S32 and S33 until the objective function converges, and saving the convolutional neural network model at that point.
From the above description, the parameters of the convolutional neural network model are adjusted according to the objective function in combination with the back propagation algorithm, which further ensures the reliability of the final model.
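The loop S31 to S34 can be sketched in NumPy as follows. To stay short, a single linear layer with a Softmax output stands in for the convolutional neural network, the data are synthetic, and training runs for a fixed number of epochs instead of testing for convergence; these are all assumptions of the sketch. The values of `EPOCHS`, `BATCH_SIZE` and the learning rate are those given in S31, and the Adam constants 0.9, 0.999 and 1e-8 are the commonly used defaults.

```python
import numpy as np

EPOCHS, BATCH_SIZE, LEARNING_RATE = 50, 64, 0.001   # values given in S31

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def train(x, y, n_classes, rng):
    """Mini-batch training of a linear Softmax classifier with Adam."""
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    params = [w, b]
    m = [np.zeros_like(p) for p in params]          # first-moment estimates
    v = [np.zeros_like(p) for p in params]          # second-moment estimates
    beta1, beta2, eps, step = 0.9, 0.999, 1e-8, 0
    history = []
    for _ in range(EPOCHS):
        order = rng.permutation(len(x))
        for start in range(0, len(x), BATCH_SIZE):
            idx = order[start:start + BATCH_SIZE]
            xb, yb = x[idx], y[idx]
            probs = softmax(xb @ w + b)             # S32: forward output
            delta = probs.copy()                    # S33: gradient of the loss
            delta[np.arange(len(yb)), yb] -= 1.0
            delta /= len(yb)
            grads = [xb.T @ delta, delta.sum(axis=0)]
            step += 1
            for p, g, mi, vi in zip(params, grads, m, v):
                mi[:] = beta1 * mi + (1 - beta1) * g
                vi[:] = beta2 * vi + (1 - beta2) * g * g
                m_hat = mi / (1 - beta1 ** step)
                v_hat = vi / (1 - beta2 ** step)
                p -= LEARNING_RATE * m_hat / (np.sqrt(v_hat) + eps)
        history.append(cross_entropy(softmax(x @ w + b), y))
    return w, b, history

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 10))                       # synthetic features
y = (x @ rng.normal(size=(10, 3))).argmax(axis=1)    # synthetic labels, 3 classes
w, b, history = train(x, y, 3, rng)
print(history[-1] < history[0])
```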
Referring to fig. 3, a terminal for realizing image recognition through a convolutional neural network includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:
S1, acquiring a data set and preprocessing the data set;
S2, setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers;
S3, training the convolutional neural network model according to the data set, and performing image recognition with the trained convolutional neural network model.
As can be seen from the above description, the beneficial effects of the present invention are: a data set is acquired and preprocessed, and the output features of every pooling layer are cascaded, so that both the local features and the global features of the image are fully used and the image is analyzed according to feature information at different scales, which greatly improves the accuracy of the output result. Preprocessing ensures that the input images meet the processing conditions of the convolutional neural network, which further improves the accuracy of the output result; it also makes the images uniform, which increases the speed at which the convolutional neural network analyzes them.
Referring to fig. 2, fig. 4, fig. 6 and figs. 10-12, a first embodiment of the present invention is:
A method for realizing image recognition through a convolutional neural network, specifically comprising the following steps:
S1, acquiring a data set and preprocessing the data set;
the data set consists of road traffic sign images acquired by a vehicle-mounted camera in an actual traffic environment;
the road traffic sign images in the data set include traffic sign images captured under different lighting conditions, degrees of occlusion, shooting angles and vehicle speeds; each road traffic sign image contains only one traffic sign, and the image resolution ranges from 16 × 16 to 250 × 250;
the data set comprises a training set and a test set;
the preprocessing of the data set comprises:
S11, normalizing the size of the images in the data set to obtain a first data set: the sample images are uniformly scaled to 32 × 32 by bilinear interpolation;
S12, normalizing the pixels of the images in the first data set to obtain a second data set: the value range of each pixel in the image is compressed from 0-255 to 0-1, which speeds up the convergence of the neural network;
S13, augmenting the images in the second data set to obtain a third data set:
referring to fig. 6, the first column from the left in fig. 6 shows images in the second data set; the augmentation is specifically:
S131, rotating the image about its center in a random direction, in units of 10 degrees, for example by 20 degrees, 80 degrees or 170 degrees;
S132, randomly translating the image by 4 pixels, removing the vacated part on one side, and stretching the image back to its original size;
S133, randomly flipping the image;
the operations S131, S132 and S133 are performed in sequence, but the order is not limited thereto;
in an alternative embodiment, the constructed data set comprises 3250 images of 40 types of traffic signs, of which 2275 images form the training set and 975 images form the test set; the distribution of the 3250 images over the 40 types of traffic signs is shown in fig. 4;
S2, setting initial parameters of the convolutional neural network model, including:
setting the size of the convolution kernels, the number of convolution kernels, the weights of the convolution kernels, the biases and the number of neurons of the fully connected layer;
setting an initial learning rate, a target minimum error, a training period and the number of samples selected in a single training step (BATCH_SIZE);
performing comparison experiments on the data set to select the convolution kernel size, the Dropout parameter and the number of neurons in the fully connected layer of the network structure; in an optional implementation, 5 × 5, 0.5 and 256 are adopted respectively. Specifically, referring to figs. 10 to 12, a preset candidate set is defined for each type of parameter; using the control variable method, the value of one type of parameter is changed within its candidate set at a time, the convolutional neural network model is run, and the candidate with the best result is selected according to the run results;
in an alternative embodiment, the number of epochs (one epoch is one forward pass and one backward pass of the complete data set through the neural network) is set to 50, BATCH_SIZE is set to 64, the initial learning rate is 0.001, and the Adam optimization algorithm is used, with the learning rate gradually decaying as the number of iterations increases;
S3, training the convolutional neural network model according to the data set, and performing image recognition with the trained convolutional neural network model;
specifically, the data set is the third data set obtained after size normalization, pixel normalization and augmentation;
the S3 specifically comprises: training the convolutional neural network model according to the training set to obtain a first convolutional neural network model; and verifying the first convolutional neural network model according to the test set to obtain a convolutional neural network model that meets the requirements.
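The control variable method used in this embodiment to choose the kernel size, the Dropout parameter and the number of fully connected neurons can be sketched as below. Everything except the three selected values (5, 0.5 and 256) is hypothetical: the candidate sets, the default values, the function name `select_parameters`, and above all the scoring function `toy_score`, which is a made-up stand-in so that the example runs and does not reproduce the experiments of figs. 10 to 12. In practice the scoring function would train the model and return its validation result.

```python
def select_parameters(candidates, defaults, evaluate):
    """Vary one parameter type at a time while the others are held fixed,
    and keep the best-scoring value before moving to the next type."""
    chosen = dict(defaults)
    for name, values in candidates.items():
        scores = {}
        for value in values:
            trial = dict(chosen)
            trial[name] = value
            scores[value] = evaluate(trial)
        chosen[name] = max(scores, key=scores.get)
    return chosen

def toy_score(params):
    """Hypothetical stand-in for 'train the model and measure the result'."""
    return (-abs(params["kernel_size"] - 5)
            - abs(params["dropout"] - 0.5)
            - abs(params["fc_neurons"] - 256) / 256)

candidates = {                      # hypothetical candidate sets
    "kernel_size": [3, 5, 7],
    "dropout": [0.3, 0.5, 0.7],
    "fc_neurons": [128, 256, 512],
}
defaults = {"kernel_size": 3, "dropout": 0.3, "fc_neurons": 128}
best = select_parameters(candidates, defaults, toy_score)
print(best)  # {'kernel_size': 5, 'dropout': 0.5, 'fc_neurons': 256}
```

This sketch carries each selected value forward when the next parameter type is searched; holding the other parameters at their defaults throughout would be an equally valid reading of the description.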
Referring to fig. 7, a second embodiment of the present invention is:
A method for realizing image recognition through a convolutional neural network, which differs from the first embodiment in that:
the convolutional neural network model comprises an input layer (Input), a plurality of pooling layers (Maxpool), a plurality of convolutional layers (Conv), a global average pooling layer (GlobalAveragePooling), a fully connected layer (Full) and a classifier;
the convolutional layers alternate with the pooling layers;
the convolutional layers all adopt the ReLU activation function ReLU(x) = max(0, x), whose gradient remains 1 when x is greater than 0 and is 0 otherwise;
a Dropout strategy is added after each pooling layer, that is, some hidden-layer neurons are randomly switched off or ignored during training;
in S2, cascading the output features of the pooling layers specifically includes:
unifying the size of the output features of each of the pooling layers;
concatenating the size-unified output features of the pooling layers into one tensor;
inputting the tensor into the global average pooling layer;
specifically, in this embodiment, the input of the global average pooling layer is the cascade of the pooling layers:
GlobalAveragePooling merge_input = [Max_11, Max_22, Maxpool3];
the output of a pooling layer is connected, where needed, to a size-unifying pooling layer, so that the output features of the pooling layers have a uniform size;
Max_11 and Max_22 are the two size-unifying pooling layers, connected to Maxpool1 and Maxpool2 respectively, and reduce the output features of Maxpool1 and Maxpool2 from 16 × 16 and 8 × 8 to 4 × 4;
[Max_11, Max_22, Maxpool3] indicates that the output features of the two size-unifying pooling layers and of pooling layer 3 are concatenated into one tensor; since the output feature size of the pooling layer Maxpool3 is already 4 × 4, it needs no size-unifying pooling layer;
each single convolutional layer is formed by cascading a plurality of small convolutional layers; apart from the small convolutional layers at the head and the tail, the input of each small convolutional layer is connected to the output of the preceding small convolutional layer;
the classifier can be a Softmax classifier;
a Batch Normalization (BN) algorithm is set after each of the convolutional layers:
$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$$
$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$
$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$
$$y_i = \gamma \hat{x}_i + \beta$$
wherein $B = \{x_1, x_2, x_3, \ldots, x_m\}$ represents a batch of m data input into the convolutional neural network model, $\mu_B$ represents the mean of the m data, $\sigma_B^2$ represents the variance of the m data, $\epsilon$ is a small positive number added to prevent the divisor from being 0, $\gamma$ is a scaling (transformation) factor, and $\beta$ is a translation factor;
in an optional embodiment, the convolutional neural network model comprises 3 convolutional layers, 3 pooling layers, 1 global average pooling layer, 1 fully connected layer and 1 classification output layer, wherein the convolutional layers and the pooling layers are connected alternately; the 3 convolutional layers all adopt convolution kernels of size 5 × 5 with a stride of 1, and the numbers of neurons are 32, 64 and 128 respectively; the 3 pooling layers all adopt kernels of size 2 × 2 with a stride of 2 and use max pooling;
the number of parameters of the convolutional neural network model provided by the invention is shown in Table 1:
TABLE 1
[Table 1 is an image in the original publication and is not reproduced in this text.]
Wherein each convolutional layer is formed by cascading a plurality of 5 × 5 convolutional layers, replacing the traditional single 9 × 9 convolutional layer; Table 2 shows the number of parameters of the traditional convolutional neural network model with large convolution kernels:
TABLE 2
[Table 2 is an image in the original publication and is not reproduced in this text.]
As can be seen from the comparison between Tables 1 and 2, the number of parameters of the convolutional neural network model provided by the invention increases only slightly, while the reliability of the model is clearly improved.
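The ReLU activation and the Dropout strategy described in this embodiment can be sketched in NumPy as follows. The rescaling of the surviving activations by 1 / (1 - rate), known as inverted dropout, is an assumption of the sketch; the text only states that part of the neurons are randomly switched off during training. The rate 0.5 is the Dropout parameter selected in the first embodiment, and the function names are hypothetical.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 where x > 0, 0 elsewhere."""
    return (x > 0).astype(x.dtype)

def dropout(x, rate, rng, training=True):
    """Randomly switch off a fraction `rate` of the activations during training.
    Assumption: survivors are rescaled so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
activations = relu(np.array([-1.0, 0.0, 2.0]))
print(activations)                                  # [0. 0. 2.]
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)
print(sorted(set(dropped.tolist())))                # a subset of [0.0, 2.0]
```

At inference time the function is called with `training=False` and returns its input unchanged.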
Referring to fig. 5, 8, 9 and 13, a third embodiment of the present invention is:
a method for implementing image recognition through a convolutional neural network, which is different from the first embodiment or the second embodiment in that training the convolutional neural network model according to the data set in S3 specifically includes:
s31, selecting a cross entropy loss function as a target function;
s32, sending the data set into the convolutional neural network model, and calculating forward output;
s33, calculating the error of the forward output, and updating the weight and the bias in the convolutional neural network model by combining a back propagation algorithm;
specifically, the weights of the bias and convolution kernels are updated;
s34, repeating the steps S32 and S33 until the objective function converges, and storing the convolutional neural network model at the moment;
taking a road traffic sign image which is acquired from a natural scene and only contains one traffic sign as the input of a stored convolutional neural network model, and realizing the specific category of the output traffic sign image;
referring to Fig. 8, the loss value decreases rapidly in the early stage of training, which indicates that the difference between the predicted result and the true result is large at that stage; as the number of training iterations increases, the loss curve flattens and the model gradually converges, that is, the model fits the training data more and more closely;
referring to Fig. 9, which shows the accuracy curve during training, the accuracy rises rapidly in the early stage of training; after about thirty iterations the curve flattens, the gains become very small, and the model gradually converges;
in an optional embodiment, the German Traffic Sign Recognition Benchmark (GTSRB) data set is obtained, which contains 51839 traffic sign images in 5 major classes and 43 subclasses, of which 39209 form the training set and 12630 form the test set; the distribution of the 51839 images over the 43 subclasses is shown in Fig. 5;
the GTSRB data set is used for a comparison experiment between the convolutional neural network model (MS-TSRCNN) and two other convolutional neural network models (a single-scale feature connection model and a convolutional-layer feature connection model), wherein the convolutional-layer feature connection model has the same structure as MS-TSRCNN except that the outputs of the pooling layers are not cascaded; the experimental results are shown in Fig. 13.
Referring to fig. 3, a fourth embodiment of the present invention is:
a terminal 1 for implementing image recognition through a convolutional neural network, comprising a processor 2, a memory 3 and a computer program stored on the memory 3 and operable on the processor 2, wherein the processor 2 implements the steps of the first embodiment, the second embodiment or the third embodiment when executing the computer program.
In summary, the invention provides a method and a terminal for realizing image recognition through a convolutional neural network. When the training data set is selected, road traffic sign images acquired by a vehicle-mounted camera in an actual traffic environment are used, covering traffic signs shot under different lighting, degrees of occlusion, shooting angles and vehicle speeds; the conditions likely to occur in practice are thus anticipated and the model is trained on them in advance, so as to ensure the recognition accuracy of the model in actual application. When the model is constructed, the outputs of all pooling layers are cascaded and a global average pooling layer is provided, so that feature information at multiple scales, including local and global features, can be fully used for traffic sign classification, and the transition from feature maps to classification is more natural. Several small convolutional layers are used in place of one large convolutional layer, which preserves the receptive field; although the number of parameters increases slightly, the reliability of the convolutional neural network model is significantly improved and its calculation speed is increased. The parameters are determined by first running a parameter selection experiment on the data set, which shortens model training time and improves the recognition accuracy of the final model. The final model has a simple structure and can recognize traffic signs quickly and correctly.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for performing image recognition via a convolutional neural network, comprising the steps of:
S1, acquiring a data set and preprocessing the data set;
S2, setting initial parameters of a convolutional neural network model, wherein the convolutional neural network model comprises a plurality of pooling layers, and cascading the output features of the pooling layers;
and S3, training the convolutional neural network model according to the data set, and performing image recognition by using the trained convolutional neural network model.
2. The method of claim 1, wherein the convolutional neural network model further comprises a global average pooling layer;
the step of cascading the output features of each pooling layer specifically includes:
unifying the size of the output features of each of the pooling layers;
concatenating the size-unified output features of the pooling layers into a single tensor;
inputting the tensor into the global average pooling layer.
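The three steps of this claim can be illustrated on nested Python lists, with each feature map stored as a list of 2-D channels. How the sizes are unified is not specified in the claim; the sketch below assumes extra 2 × 2 max pooling of the larger maps and side lengths that differ by powers of two. All function names are hypothetical.

```python
def max_pool(channel, window=2, stride=2):
    """2x2 max pooling with stride 2 on one 2-D channel."""
    rows, cols = len(channel), len(channel[0])
    return [
        [
            max(channel[r + i][c + j] for i in range(window) for j in range(window))
            for c in range(0, cols - window + 1, stride)
        ]
        for r in range(0, rows - window + 1, stride)
    ]


def unify(feature_map, target):
    """Shrink every channel to target x target (assumed power-of-two size ratio)."""
    unified = []
    for channel in feature_map:
        while len(channel) > target:
            channel = max_pool(channel)
        unified.append(channel)
    return unified


def cascade_and_gap(pool_outputs):
    """Unify sizes, concatenate along the channel axis, then global-average-pool."""
    target = min(len(feature_map[0]) for feature_map in pool_outputs)
    stacked = []
    for feature_map in pool_outputs:
        stacked.extend(unify(feature_map, target))  # channel-wise concatenation
    # global average pooling: one mean value per channel
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in stacked]
    return stacked, pooled
```

The output of `cascade_and_gap` is one value per channel across all pooling stages, which is the multi-scale vector that a fully connected layer would then consume.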
3. The method of claim 1, wherein the data set consists of road traffic sign images collected by a vehicle-mounted camera in an actual traffic environment;
the road traffic sign images in the data set include traffic sign images shot under different lighting, different degrees of occlusion, different shooting angles and different vehicle speeds;
the preprocessing the data set in S1 includes:
S11, normalizing the size of the images in the data set to obtain a first data set;
S12, normalizing the pixels of the images in the first data set to obtain a second data set;
S13, augmenting the images in the second data set to obtain a third data set;
the S3 specifically includes: training the convolutional neural network model according to the third data set.
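Steps S11 and S12 can be sketched as follows. The claim does not say how sizes or pixels are normalized, so the nearest-neighbour resize, the 32 × 32 target size, the single-channel images and the division by 255 are all assumptions made for illustration.

```python
def resize_nearest(image, size):
    """S11 (assumed method): nearest-neighbour resize of a 2-D image to size x size."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def normalize_pixels(image):
    """S12 (assumed method): map 8-bit pixel values into the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def preprocess(images, size=32):
    """Apply S11 then S12; size=32 is a hypothetical target, not from the patent."""
    first_data_set = [resize_nearest(image, size) for image in images]
    second_data_set = [normalize_pixels(image) for image in first_data_set]
    return second_data_set
```

A fixed input size is needed because the fully connected layer expects a constant number of inputs, and scaling the pixels keeps the early gradients in a comparable range across images.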
4. The method of claim 3, wherein the augmenting specifically comprises:
S131, rotating the image in a random direction about its center, in units of 10 degrees;
S132, randomly translating the image by 4 pixels, removing the vacated part on one side, and stretching the image back to the size of the original image;
S133, randomly flipping the image;
the operations S131, S132 and S133 are performed in the order listed, but the order is not limited thereto.
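The three augmentation operations can be sketched on a single-channel image stored as nested lists. Several details are assumptions: the rotation uses nearest-neighbour sampling with zero fill, one "unit" of 10 degrees is applied by default, the translation moves along one randomly chosen axis, and the flip is horizontal with probability 0.5. The names `augment` and `units` are hypothetical.

```python
import math
import random


def rotate(image, degrees, fill=0):
    """S131: rotate about the image center using inverse nearest-neighbour mapping."""
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    cos_a, sin_a = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dy, dx = r - cy, c - cx
            src_r = int(round(cy + dy * cos_a - dx * sin_a))
            src_c = int(round(cx + dy * sin_a + dx * cos_a))
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = image[src_r][src_c]
    return out


def translate_and_stretch(image, dx, dy):
    """S132: shift by (dx, dy), drop the vacated strip, stretch back to full size."""
    rows, cols = len(image), len(image[0])
    top, bottom = max(0, -dy), rows - max(0, dy)
    left, right = max(0, -dx), cols - max(0, dx)
    crop = [row[left:right] for row in image[top:bottom]]  # content still in view
    c_rows, c_cols = len(crop), len(crop[0])
    return [
        [crop[r * c_rows // rows][c * c_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]


def flip(image):
    """S133: horizontal flip (the flip axis is an assumption)."""
    return [row[::-1] for row in image]


def augment(image, rng, units=1):
    """Apply S131, S132 and S133 in the listed order with random choices from rng."""
    image = rotate(image, rng.choice([-1, 1]) * 10 * units)
    dx, dy = rng.choice([(4, 0), (-4, 0), (0, 4), (0, -4)])
    image = translate_and_stretch(image, dx, dy)
    if rng.random() < 0.5:
        image = flip(image)
    return image
```

Passing a seeded `random.Random` instance as `rng` makes the augmented data set reproducible between runs.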
5. The method of claim 1, wherein the convolutional neural network further comprises: a plurality of convolutional layers alternating with the pooling layers;
each single convolutional layer is formed by cascading a plurality of small convolutional layers, wherein, apart from the first and last small convolutional layers, the input of each small convolutional layer is connected to the output of the preceding small convolutional layer.
6. The method of claim 1, wherein each convolutional layer is followed by a batch normalization algorithm:
$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$$
$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$
$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$
$$y_i = \gamma \hat{x}_i + \beta$$
wherein $B = \{x_1, x_2, x_3, \ldots, x_m\}$ represents a batch of m data input into the convolutional neural network model, $\mu_B$ represents the mean of the m data, $\sigma_B^2$ represents the variance of the m data, $\epsilon$ is a small positive number added to prevent division by 0, $\gamma$ is a scaling factor, and $\beta$ is a translation factor.
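The batch normalization of this claim can be sketched for a batch of scalar activations as follows. The default values of `gamma`, `beta` and `eps` are assumptions chosen for illustration; in training, the scaling and translation factors would be learned.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of m scalars, then scale by gamma and shift by beta."""
    m = len(batch)
    mean = sum(batch) / m                                   # batch mean
    variance = sum((x - mean) ** 2 for x in batch) / m      # batch variance
    normalized = [(x - mean) / math.sqrt(variance + eps) for x in batch]
    return [gamma * value + beta for value in normalized]
```

With `gamma=1` and `beta=0` the output has approximately zero mean and unit variance; other values let the network recover any scale and offset it needs.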
7. The method of claim 1, wherein the convolutional neural network further comprises: a fully connected layer;
the setting of initial parameters of the convolutional neural network model in step S2 comprises:
setting the number of convolution kernels, the weights of the convolution kernels, the biases and the number of neurons of the fully connected layer;
setting an initial learning rate, a target minimum error, a training period and the number of samples selected in a single training step.
8. The method of claim 1, wherein the data set in S1 includes a training set and a test set;
the step S3 specifically includes:
training the convolutional neural network model according to the training set to obtain a first convolutional neural network model;
and verifying the first convolutional neural network model according to the test set to obtain the trained convolutional neural network model.
9. The method according to claim 1, wherein the training of the convolutional neural network model according to the data set in step S3 specifically comprises:
S31, selecting a cross-entropy loss function as the objective function, outputting the recognition result through a Softmax classifier, setting EPOCHS to 50, BATCH_SIZE to 64 and the initial learning rate to 0.001, and gradually decaying the learning rate as the number of iterations increases by means of the Adam optimization algorithm;
S32, feeding the data set into the convolutional neural network model and calculating the forward output;
S33, calculating the error of the forward output, and updating the weights and biases in the convolutional neural network model by means of the back propagation algorithm;
S34, repeating steps S32 and S33 until the objective function converges, and saving the convolutional neural network model at that point.
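The Adam update named in S31 can be sketched as below. EPOCHS, BATCH_SIZE and the initial learning rate come from the claim; the values of `beta1`, `beta2` and `eps` are the commonly used defaults and are assumptions here, as is the function name `adam_minimize`. No explicit decay schedule is included because the claim does not specify one; the base rate is fixed and only the per-parameter step size adapts.

```python
import math

EPOCHS = 50            # from the claim
BATCH_SIZE = 64        # from the claim
LEARNING_RATE = 0.001  # initial learning rate from the claim


def adam_minimize(grad_fn, theta, lr=LEARNING_RATE, steps=1000,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam updates on a list of parameters; grad_fn returns the gradient list."""
    theta = list(theta)
    first = [0.0] * len(theta)    # running mean of gradients
    second = [0.0] * len(theta)   # running mean of squared gradients
    for t in range(1, steps + 1):
        grads = grad_fn(theta)
        for i, g in enumerate(grads):
            first[i] = beta1 * first[i] + (1 - beta1) * g
            second[i] = beta2 * second[i] + (1 - beta2) * g * g
            first_hat = first[i] / (1 - beta1 ** t)     # bias correction
            second_hat = second[i] / (1 - beta2 ** t)
            theta[i] -= lr * first_hat / (math.sqrt(second_hat) + eps)
    return theta
```

For example, `adam_minimize(lambda p: [2 * (p[0] - 3)], [0.0], lr=0.1, steps=500)` drives a one-parameter quadratic toward its minimum at 3; in the claimed method, `grad_fn` would instead be the back-propagated gradient of the cross-entropy loss over a batch of 64 images.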
10. A terminal for implementing image recognition by a convolutional neural network, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements a method for implementing image recognition by a convolutional neural network according to any one of claims 1 to 9 when executing the computer program.
CN202010613939.0A 2020-06-30 2020-06-30 Method and terminal for realizing image recognition through convolutional neural network Pending CN111767860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010613939.0A CN111767860A (en) 2020-06-30 2020-06-30 Method and terminal for realizing image recognition through convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010613939.0A CN111767860A (en) 2020-06-30 2020-06-30 Method and terminal for realizing image recognition through convolutional neural network

Publications (1)

Publication Number Publication Date
CN111767860A true CN111767860A (en) 2020-10-13

Family

ID=72723520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010613939.0A Pending CN111767860A (en) 2020-06-30 2020-06-30 Method and terminal for realizing image recognition through convolutional neural network

Country Status (1)

Country Link
CN (1) CN111767860A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013