CN110321967B - Image classification improvement method based on convolutional neural network - Google Patents

Image classification improvement method based on convolutional neural network

Info

Publication number
CN110321967B
CN110321967B CN201910624323.0A CN201910624323A CN110321967B CN 110321967 B CN110321967 B CN 110321967B CN 201910624323 A CN201910624323 A CN 201910624323A CN 110321967 B CN110321967 B CN 110321967B
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
convolution
pooling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910624323.0A
Other languages
Chinese (zh)
Other versions
CN110321967A (en)
Inventor
李跃辉
赵诚诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201910624323.0A
Publication of CN110321967A
Application granted
Publication of CN110321967B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an improved image classification algorithm based on a convolutional neural network, using the AlexNet network model as its basic framework. The input image is first preprocessed and augmented to reduce the network's dependence on the number of samples; features are then extracted by the convolutional layers, and the pooling layers retain the main features while reducing the parameters and computation of the next layer. The algorithm reduces the network model's dependence on the number of samples, further reduces the number of parameters by adopting an LDA algorithm and multi-scale convolution, lowers the amount of computation, and improves image classification accuracy.

Description

Image classification improvement method based on convolutional neural network
Technical Field
The invention belongs to the field of deep learning and image processing, and relates to the application of an improved deep neural network to image classification and recognition tasks.
Background
Because the input layer of a convolutional neural network (CNN) can process multidimensional data directly, CNNs are widely used in computer vision. As digitization continues to drive social development, data volumes keep growing and massive datasets of many kinds keep appearing, which poses a great challenge for neural networks. To accelerate neural network learning, optimization algorithms for CNNs are emerging continuously. At present, CNN optimization focuses mainly on the depth and width of the model and on data processing. In 2018, building on the CNN model proposed by LeCun et al., Gauno et al. addressed problems such as vanishing gradients and slow convergence by combining and improving several traditional activation functions: they combined the Sigmoid and Softplus activation functions to obtain a new CNN model and applied it to handwritten digit recognition. This improved the accuracy of handwritten digit recognition and reduced the number of training parameters, making the network structure simpler and more adaptable. In the same year, Wang Hua et al. proposed an image classification model that combines an unsupervised learning algorithm with convolution: image blocks of equal size are randomly extracted from unlabeled input images to form a data set and are preprocessed; dictionaries are extracted from the preprocessed blocks by applying the K-means clustering algorithm twice; the final image features are extracted by discrete convolution; and the extracted features are classified with a Softmax classifier, which improves classification accuracy and reduces training complexity.
In summary, most optimization work targets the structure of the network model, improving the speed and accuracy of the neural network by adopting different activation functions or by applying various preprocessing operations to the image.
To remove the restriction that a convolutional neural network places on the input image, further reduce training complexity, and accelerate model convergence, the invention provides an image classification algorithm based on an improved convolutional neural network. The network no longer restricts the size of the input image, the number of network parameters is further reduced, and higher accuracy and speed are achieved.
Disclosure of Invention
Purpose of the invention: the invention provides an image classification improvement method based on a convolutional neural network for identifying and classifying images accurately, efficiently, and quickly.
Technical scheme: an image classification improvement method based on a convolutional neural network, comprising the following steps:
step one: performing image enhancement, filtering, and noise-reduction preprocessing on the input image to reduce the influence of noise on image feature extraction;
step two: performing a convolution operation on the preprocessed image to extract image features; pooling with the max pooling method, which keeps the pixel with the maximum value in each receptive field and discards the others, so that the feature map shrinks while the key information of the image is retained; then reducing the convolution kernel size, performing the convolution and pooling operations again, and outputting a more abstract feature map;
step three: convolving the feature map obtained from the convolution and pooling operations through several consecutive convolutional layers to fully fuse the features of different channels, and feeding the feature map into a pyramid pooling layer for pooling;
step four: performing a convolution operation on the feature map through the multi-scale convolutional layer to obtain a feature map of fixed size;
step five: further reducing the dimension of the feature map and classifying it with the LDA method: projecting with LDA, substituting the projected sample features into a probability density function to obtain probability distribution information, and outputting the calculation result and the predicted category.
Further, in step one, Gaussian filtering is applied to the input image to suppress noise and smooth the image; the image is also flipped, and its color, saturation, and contrast are adjusted.
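The preprocessing of step one can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the patented implementation: the kernel size, sigma, contrast factor, and all function names are hypothetical, and the color and saturation adjustments are omitted for brevity.

```python
import numpy as np

def gaussian_kernel_1d(size=5, sigma=1.0):
    """Normalized 1-D Gaussian kernel (size and sigma are assumed values)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_filter(img, size=5, sigma=1.0):
    """Smooth an H x W x C float image with a separable Gaussian filter."""
    k = gaussian_kernel_1d(size, sigma)
    pad = size // 2
    # Edge padding keeps the output the same size as the input (odd kernel sizes).
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Filter along the vertical axis, then along the horizontal axis.
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, tmp)

def adjust_contrast(img, factor=1.2):
    """Scale deviations from the per-channel mean; values are clipped to [0, 1]."""
    mean = img.mean(axis=(0, 1), keepdims=True)
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def preprocess(img, flip=True):
    """Denoise, optionally flip horizontally, then adjust contrast."""
    out = gaussian_filter(img)
    if flip:
        out = out[:, ::-1, :]
    return adjust_contrast(out)
```

The separable filter applies the same 1-D kernel along each spatial axis, which is equivalent to a 2-D Gaussian but cheaper to compute.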
Further, in step three, four consecutive convolutional layers are used to fully fuse the features of the image's different channels.
Further, in step four, the feature map is mapped at three different scales and convolved with three different convolution operations, so that a feature map of fixed size is obtained regardless of the size of the input image.
Further, in step five, LDA is used to reduce the dimension of the feature matrix. The global scatter matrix $S_t$ is defined as:
$$S_t = \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^T$$
where $m$ is the total number of samples, $x_i$ is the $i$-th sample vector, $\mu$ is the mean vector of all samples, and $T$ denotes the matrix transpose.
The within-class scatter matrix $S_\omega$ is defined as:
$$S_\omega = \sum_{i=1}^{N} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T$$
where $N$ is the total number of sample classes, $X_i$ is the sample set of the $i$-th class, $x$ is a sample vector of the $i$-th class, and $\mu_i$ is the mean vector of all samples of the $i$-th class.
The between-class scatter matrix $S_b$ is defined as:
$$S_b = S_t - S_\omega$$
The optimization objective is thus defined as:
$$\max_{W} \frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_\omega W)}$$
where $W \in \mathbb{R}^{d \times (N-1)}$ is a matrix composed of $N-1$ eigenvectors. Solving the optimization objective yields a projection matrix formed by a set of optimal discriminant vectors; this matrix projects the N-dimensional feature space and outputs an (N-1)-dimensional low-dimensional feature space.
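The scatter matrices and the projection matrix described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patent's implementation: the function name `lda_projection` is hypothetical, and solving the objective through the eigenvectors of `pinv(S_w) @ S_b` is a common approach assumed here rather than one stated in the source.

```python
import numpy as np

def lda_projection(X, y):
    """X: (m, d) sample matrix, y: (m,) integer class labels.

    Returns the projection matrix W of shape (d, N-1) together with the
    global, within-class, and between-class scatter matrices.
    """
    classes = np.unique(y)
    d = X.shape[1]

    # Global scatter: deviations of every sample from the overall mean.
    centered = X - X.mean(axis=0)
    S_t = centered.T @ centered

    # Within-class scatter: deviations of each sample from its class mean.
    S_w = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        dc = Xc - Xc.mean(axis=0)
        S_w += dc.T @ dc

    # Between-class scatter, defined as the difference of the two.
    S_b = S_t - S_w

    # Assumed solver: leading eigenvectors of pinv(S_w) @ S_b.
    # The pseudo-inverse guards against a singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(-eigvals.real)
    W = eigvecs[:, order[: len(classes) - 1]].real
    return W, S_t, S_w, S_b
```

Projected features are then obtained as `X @ W`, giving one (N-1)-dimensional vector per sample.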
Further, the network uses overlapping max pooling, in which adjacent pooling windows overlap; this enriches the extracted features and helps avoid overfitting.
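Overlapping max pooling can be illustrated with a short NumPy sketch. The window size of 3 and stride of 2 are assumptions borrowed from the usual AlexNet configuration; the text does not give these values, and the function name is hypothetical.

```python
import numpy as np

def max_pool2d(x, window=3, stride=2):
    """Max-pool a 2-D feature map; window > stride makes adjacent windows overlap."""
    h, w = x.shape
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Keep only the largest value inside the current window.
            out[i, j] = x[r:r + window, c:c + window].max()
    return out
```

With a 5 × 5 input, the two horizontal windows cover columns 0 to 2 and 2 to 4, so column 2 contributes to both outputs.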
Further, data in the neural network is passed through an activation function, here the leaky rectified linear unit (Leaky ReLU). Because this function retains some information on the negative axis, information in the negative interval is not lost entirely after the features pass through it, so the resulting information is more complete.
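A minimal sketch of the Leaky ReLU activation follows. The negative slope of 0.01 is a commonly used default and an assumption here; the text does not specify a value.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    """Pass positive inputs unchanged and scale negative inputs by a small slope."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x, negative_slope * x)
```

Unlike the plain ReLU, which maps every negative input to zero, negative inputs here keep a small nonzero output.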
Compared with the prior art, the invention with the above technical scheme has the following technical effects:
1. The invention does not restrict the size of the input image: through the multi-scale convolutional layer structure, the network accepts images of any size, largely preserves the image feature information, and improves the final accuracy.
2. The invention further reduces the number of model parameters and speeds up the network. The LDA algorithm is used to learn similarity, which enhances the discriminative power of the features. It also reduces dimensionality without damaging the main feature information of the image, making the network more efficient and faster.
3. The improved image classification algorithm based on the convolutional neural network reduces the network model's dependence on the number of samples, further reduces the number of parameters by adopting the LDA algorithm and multi-scale convolution, lowers the amount of computation, and improves image classification accuracy.
Drawings
Fig. 1 is a diagram of an improved multi-scale convolutional layer structure proposed by the present invention.
Fig. 2 is a network architecture diagram of the improved image classification algorithm of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in FIG. 1, the improved multi-scale convolutional layer structure proposed by the invention maps the feature map at three different scales, 8 × 8, 6 × 6, and 4 × 4, and then convolves the mapped feature maps with convolution kernels of three different sizes. The strides are 2, 1, and 1, the kernel sizes are 2 × 2, 3 × 3, and 1 × 1, each branch uses 256 convolution kernels, and Leaky ReLU is the activation function, so that a feature map of fixed size is obtained regardless of the size of the input image.
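The fixed output size can be checked with the standard convolution output-size formula. The worked arithmetic below assumes no padding, which the description does not state; the resulting 4 × 4 spatial size is derived here and is not quoted from the patent.

```latex
o = \left\lfloor \frac{n - k}{s} \right\rfloor + 1,
\qquad n = \text{input size},\; k = \text{kernel size},\; s = \text{stride}

\begin{aligned}
(n, k, s) = (8, 2, 2):&\quad o = \tfrac{8 - 2}{2} + 1 = 4 \\
(n, k, s) = (6, 3, 1):&\quad o = \tfrac{6 - 3}{1} + 1 = 4 \\
(n, k, s) = (4, 1, 1):&\quad o = \tfrac{4 - 1}{1} + 1 = 4
\end{aligned}
```

Under this assumption, each of the three branches produces a 4 × 4 map with 256 channels.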
As shown in FIG. 2, the method first applies image preprocessing to obtain an image with distinct feature information and little interference. The preprocessed image is then fed into the neural network, where features are extracted by two convolution-pooling layer groups; these use larger convolution kernels to extract the more prominent edge features of the image. The resulting feature map then passes through four consecutive convolutional layers with the same kernel size, which extract more feature information from the different channels of the image and fully fuse the multi-channel information, yielding a more abstract and more representative feature map. In this process, the chosen kernel size and stride keep the feature map size unchanged across the series of convolutions. After the pyramid pooling layer, input images of different sizes are all converted into feature maps of fixed size; on this basis, the LDA algorithm is used for computation and classification to obtain the final predicted class label of the input image. The specific implementation process is as follows:
step one: performing a series of preprocessing operations on the input image, such as image enhancement, filtering, and noise reduction, to reduce the influence of noise on image feature extraction;
step two: performing a convolution operation on the preprocessed image to extract image features and pooling with the max pooling method to obtain a feature map; then reducing the convolution kernel size and performing the convolution and pooling operations again to obtain a more abstract feature map for the subsequent feature fusion;
step three: convolving the feature map obtained in step two through several consecutive convolutional layers to fully fuse the features of different channels, so that the resulting feature map is more abstract and representative, and feeding the feature map into a pyramid pooling layer for pooling;
step four: mapping the feature map at different scales and then convolving each mapped feature map with a convolution kernel of a different size to obtain a feature map of fixed size;
step five: further reducing the dimension of the feature map and classifying it with the LDA method: projecting with LDA, substituting the projected sample features into a probability density function to compute the probability that the sample belongs to each class, and taking the class with the maximum probability as the predicted class of the image.
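The classification part of step five can be sketched as follows. The text does not name the probability density function, so this sketch assumes Gaussian class-conditional densities with a shared covariance, which is the usual choice for an LDA classifier; all function names are hypothetical, and `Z` stands for features that have already been projected by LDA.

```python
import numpy as np

def fit_gaussian_classes(Z, y):
    """Estimate class means, a shared inverse covariance, and class priors."""
    classes = np.unique(y)
    means = np.stack([Z[y == c].mean(axis=0) for c in classes])
    # Residuals of each sample from its own class mean (classes is sorted).
    resid = Z - means[np.searchsorted(classes, y)]
    cov = resid.T @ resid / max(len(Z) - len(classes), 1)
    cov_inv = np.linalg.pinv(np.atleast_2d(cov))
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, cov_inv, priors

def predict_proba(Z, means, cov_inv, priors):
    """Posterior class probabilities under the assumed Gaussian densities."""
    diff = Z[:, None, :] - means[None, :, :]              # shape (m, N, k)
    maha = np.einsum("mnk,kl,mnl->mn", diff, cov_inv, diff)
    # With a shared covariance the normalizing constant cancels across classes.
    log_post = -0.5 * maha + np.log(priors)[None, :]
    log_post -= log_post.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(log_post)
    return p / p.sum(axis=1, keepdims=True)

def predict(Z, classes, means, cov_inv, priors):
    """Return the class with the maximum probability for each sample."""
    return classes[np.argmax(predict_proba(Z, means, cov_inv, priors), axis=1)]
```

Taking the arg-max over the returned probabilities corresponds to choosing the class with the maximum probability as the prediction.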
The multi-scale convolutional layer in the network structure comprises multiple layers, structured as follows:
The feature map is mapped at three different scales: 8 × 8, 6 × 6, and 4 × 4. Each mapped feature map is convolved with its own convolution kernel; the strides are 2, 1, and 1, the kernel sizes are 2 × 2, 3 × 3, and 1 × 1, the numbers of kernels are 256, 256, and 256, and Leaky ReLU is the activation function. With this configuration, a feature map of fixed size is obtained regardless of the size of the input image, so the size of the input image is not restricted.
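The independence from input size can be illustrated with a small NumPy sketch. It assumes that the "mapping" to each scale is an adaptive max pooling onto a fixed grid, as in spatial pyramid pooling, and that the convolutions use no padding; neither detail is stated in the text, and the function names are hypothetical.

```python
import numpy as np

def adaptive_max_pool(x, bins):
    """Pool an H x W map onto a bins x bins grid (requires H, W >= bins)."""
    h, w = x.shape
    rows = np.linspace(0, h, bins + 1).astype(int)
    cols = np.linspace(0, w, bins + 1).astype(int)
    out = np.empty((bins, bins), dtype=x.dtype)
    for i in range(bins):
        for j in range(bins):
            out[i, j] = x[rows[i]:rows[i + 1], cols[j]:cols[j + 1]].max()
    return out

def conv_out_size(n, kernel, stride):
    """Output size of an unpadded convolution along one dimension."""
    return (n - kernel) // stride + 1

# (scale, kernel size, stride) for the three branches given in the description.
BRANCHES = [(8, 2, 2), (6, 3, 1), (4, 1, 1)]

def branch_output_sizes(feature_map):
    """Spatial output size of each branch for a feature map of arbitrary size."""
    sizes = []
    for bins, kernel, stride in BRANCHES:
        pooled = adaptive_max_pool(feature_map, bins)
        sizes.append(conv_out_size(pooled.shape[0], kernel, stride))
    return sizes
```

Calling `branch_output_sizes` on feature maps of different shapes returns the same three sizes, because the pooling grid, not the input, fixes the size seen by each convolution.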
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the invention, and such modifications and refinements also fall within the protection scope of the invention.

Claims (7)

1. An image classification improvement method based on a convolutional neural network, characterized by comprising the following steps:
step one: performing image enhancement, filtering, and noise-reduction preprocessing on the input image to reduce the influence of noise on image feature extraction;
step two: performing a convolution operation on the preprocessed image to extract image features; pooling with the max pooling method, which keeps the pixel with the maximum value in each receptive field and discards the others, so that the feature map shrinks while the key information of the image is retained; then reducing the convolution kernel size, performing the convolution and pooling operations again, and outputting a more abstract feature map;
step three: convolving the feature map output in step two through several consecutive convolutional layers to fully fuse the features of different channels, and feeding the feature map into a pyramid pooling layer for pooling;
step four: performing a convolution operation on the feature map through the multi-scale convolutional layer to obtain a feature map of fixed size;
step five: further reducing the dimension of the feature map and classifying it with the LDA method: projecting with LDA, substituting the projected sample features into a probability density function to obtain probability distribution information, and outputting the calculation result and the predicted category.
2. The convolutional neural network-based image classification improving method according to claim 1, wherein: in step one, Gaussian filtering is applied to the input image to suppress noise and smooth the image; the image is also flipped, and its color, saturation, and contrast are adjusted.
3. The convolutional neural network-based image classification improving method according to claim 1, wherein: in step three, four consecutive convolutional layers are used to fully fuse the features of the image's different channels.
4. The convolutional neural network-based image classification improving method according to claim 1, wherein: in step four, the feature map is mapped at three different scales and convolved with three different convolution operations, so that a feature map of fixed size is obtained regardless of the size of the input image.
5. The convolutional neural network-based image classification improving method according to claim 1, wherein: in step five, LDA is used to reduce the dimension of the feature matrix, and the global scatter matrix $S_t$ is defined as:
$$S_t = \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^T$$
where $m$ is the total number of samples, $x_i$ is the $i$-th sample vector, $\mu$ is the mean vector of all samples, and $T$ denotes the matrix transpose;
the within-class scatter matrix $S_\omega$ is defined as:
$$S_\omega = \sum_{i=1}^{N} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T$$
where $N$ is the total number of sample classes, $X_i$ is the sample set of the $i$-th class, $x$ is a sample vector of the $i$-th class, and $\mu_i$ is the mean vector of all samples of the $i$-th class;
the between-class scatter matrix $S_b$ is defined as:
$$S_b = S_t - S_\omega$$
the optimization objective is thus defined as:
$$\max_{W} \frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_\omega W)}$$
where $W \in \mathbb{R}^{d \times (N-1)}$ is a feature matrix composed of $N-1$ eigenvectors; solving the optimization objective yields a projection matrix formed by a set of optimal discriminant vectors, and this matrix projects the N-dimensional feature space and outputs an (N-1)-dimensional low-dimensional feature space.
6. The convolutional neural network-based image classification improving method according to claim 1, wherein: the network uses overlapping max pooling, in which adjacent pooling windows overlap, so that the features are enriched and overfitting is avoided.
7. The convolutional neural network-based image classification improving method according to claim 1, wherein: data in the neural network is passed through an activation function, and the activation function used is the leaky rectified linear unit (Leaky ReLU).
CN201910624323.0A 2019-07-11 2019-07-11 Image classification improvement method based on convolutional neural network Active CN110321967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910624323.0A CN110321967B (en) 2019-07-11 2019-07-11 Image classification improvement method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910624323.0A CN110321967B (en) 2019-07-11 2019-07-11 Image classification improvement method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN110321967A CN110321967A (en) 2019-10-11
CN110321967B true CN110321967B (en) 2021-06-01

Family

ID=68121994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910624323.0A Active CN110321967B (en) 2019-07-11 2019-07-11 Image classification improvement method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN110321967B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110956611A (en) * 2019-11-01 2020-04-03 武汉纺织大学 Smoke detection method integrated with convolutional neural network
CN111178507B (en) * 2019-12-26 2024-05-24 集奥聚合(北京)人工智能科技有限公司 Atlas convolution neural network data processing method and apparatus
CN111325149B (en) * 2020-02-20 2023-05-26 中山大学 Video action recognition method based on time sequence association model of voting
CN112132145B (en) * 2020-08-03 2023-08-01 深圳大学 Image classification method and system based on model extended convolutional neural network
CN112731410B (en) * 2020-12-25 2021-11-05 上海大学 Underwater target sonar detection method based on CNN
CN113052189B (en) * 2021-03-30 2022-04-29 电子科技大学 Improved MobileNet V3 feature extraction network
CN113205111B (en) * 2021-04-07 2023-05-26 零氪智慧医疗科技(天津)有限公司 Identification method and device suitable for liver tumor and electronic equipment
CN113435389B (en) * 2021-07-09 2024-03-01 大连海洋大学 Chlorella and golden algae classification and identification method based on image feature deep learning
CN114387467B (en) * 2021-12-09 2022-07-29 哈工大(张家口)工业技术研究院 Medical image classification method based on multi-module convolution feature fusion
CN115239928B (en) * 2022-09-05 2022-12-20 四川蜀天信息技术有限公司 3D data large-screen visualization system based on GIS

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096605B (en) * 2016-06-02 2019-03-19 史方 A kind of image obscuring area detection method and device based on deep learning
EP3255586A1 (en) * 2016-06-06 2017-12-13 Fujitsu Limited Method, program, and apparatus for comparing data graphs
CN106897739B (en) * 2017-02-15 2019-10-22 国网江苏省电力公司电力科学研究院 A kind of grid equipment classification method based on convolutional neural networks

Also Published As

Publication number Publication date
CN110321967A (en) 2019-10-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant