CN110321967B - Image classification improvement method based on convolutional neural network - Google Patents
Image classification improvement method based on convolutional neural network
- Publication number
- CN110321967B
- Authority
- CN
- China
- Prior art keywords
- image
- neural network
- convolutional neural
- convolution
- pooling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an improved image classification algorithm based on a convolutional neural network, using the AlexNet network model as its basic framework. The input image is first preprocessed and augmented to reduce the network's dependence on the number of samples. Features are then extracted by the convolutional layers, and the pooling layers retain the main features while reducing the parameters and computation of the next layer. The improved algorithm reduces the dependence of the network model on the number of samples and, by adopting the LDA algorithm and multi-scale convolution, further reduces the number of parameters, simplifies computation, and improves the accuracy of image classification.
Description
Technical Field
The invention belongs to the field of deep learning and image processing, and relates to the application of an improved deep neural network technique to image classification and recognition tasks.
Background
Because the input layer of a convolutional neural network (CNN) can process multidimensional data directly, CNNs are widely used in computer vision. As digitization continues to drive the development of society, the scale of data has grown considerably and massive datasets of all kinds keep appearing, which poses a great challenge for neural networks. To accelerate neural network learning, optimization algorithms for CNNs continue to emerge. At present, CNNs are mainly optimized along three directions: model depth, model width, and data processing. In 2018, building on the convolutional neural network model proposed by LeCun et al, Gauno et al combined and improved several traditional activation functions to address problems such as vanishing gradients and slow convergence. They combined the Sigmoid and Softplus activation functions to obtain a new CNN model and applied it to handwritten digit recognition. This improved the accuracy of the convolutional neural network on handwritten digits; the improvement also reduced the number of training parameters, making the network structure simpler and more adaptable. In the same year, Wang Hua et al proposed an image classification model that combines an unsupervised learning algorithm with convolution. Image blocks of the same size are randomly extracted from unlabeled input images to form a data set and are preprocessed; dictionaries are extracted from the preprocessed blocks by applying the K-means clustering algorithm twice; the final image features are extracted by discrete convolution; and the extracted features are classified by a Softmax classifier. This improved image classification accuracy and reduced training complexity.
In summary, most optimization work targets the structure of the network model, improving the speed and accuracy of the neural network by adopting different activation functions or by applying various preprocessing operations to the image.
To remove the restriction that convolutional neural networks place on the input image, further reduce training complexity, and accelerate model convergence, the invention provides an image classification algorithm based on an improved convolutional neural network. The network no longer restricts the size of the input image, the number of network parameters is further reduced, and higher accuracy and speed are achieved.
Disclosure of Invention
Purpose of the invention: the invention provides an image classification improvement method based on a convolutional neural network for identifying and classifying images accurately, efficiently, and quickly.
Technical scheme: an image classification improvement method based on a convolutional neural network, comprising the following steps:
step one: apply preprocessing operations such as image enhancement, filtering, and noise reduction to the input image to reduce interference with image feature extraction;
step two: perform a convolution operation on the preprocessed image to extract image features, then pool using max pooling, which keeps the pixel with the maximum value in each receptive field and discards the others, so that the feature map shrinks while the key information of the image is retained; then reduce the convolution kernel size, perform the convolution and pooling operations again, and output a more abstract feature map;
step three: convolve the feature map obtained from the convolution and pooling operations through several consecutive convolutional layers to fully fuse the features of different channels, and send the feature map into a pyramid pooling layer for pooling;
step four: perform convolution operations on the feature map through the multi-scale convolutional layer to obtain a feature map of fixed size;
step five: further reduce the dimensionality of the feature map and classify it using the LDA method: project the features with LDA, substitute the projected sample feature information into a probability density function to obtain probability distribution information, and output the calculation result and the predicted category.
Further, in step one, Gaussian filtering is applied to the input image to suppress noise and smooth the image; the image is also flipped, and its color, saturation, and contrast are adjusted.
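As an illustration of this preprocessing step, the following sketch builds a normalized Gaussian kernel, smooths a small grayscale image with it, and flips the image horizontally. The kernel size, sigma, border handling (edge replication), flip direction, and all function names are assumptions made for illustration; the patent does not specify them.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Return a size x size Gaussian kernel whose weights sum to 1."""
    half = size // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-half, half + 1)]
           for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def gaussian_blur(image, kernel):
    """Smooth a 2-D grayscale image (list of rows); border pixels are replicated."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, weight in enumerate(krow):
                    # Clamp coordinates so the window never leaves the image.
                    yy = min(max(y + ky - half, 0), h - 1)
                    xx = min(max(x + kx - half, 0), w - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc
    return out

def flip_horizontal(image):
    """Mirror each row, a simple augmentation that leaves the label unchanged."""
    return [row[::-1] for row in image]

# Usage: a single bright "noise" pixel is spread out and attenuated by the filter.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
smoothed = gaussian_blur(noisy, gaussian_kernel(3, 1.0))
augmented = flip_horizontal(smoothed)
```

Color, saturation, and contrast adjustments would be applied in the same pipeline; they are omitted here because the patent gives no parameters for them.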
Further, in step three, four consecutive convolutional layers are used to fully fuse the features of different channels of the image.
Further, in step four, the feature map is mapped at three different scales and each mapped feature map is convolved with its own convolution operation, so that a feature map of fixed size is obtained regardless of the size of the input image.
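One common way to map a feature map of arbitrary size onto a fixed grid is adaptive max pooling, as used in spatial pyramid pooling. The sketch below assumes this is how the mapping to each scale works; the patent does not spell out the mapping operation, and the function names and bin-boundary rule are illustrative only.

```python
def _ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)

def adaptive_max_pool(feature_map, out_size):
    """Max-pool a 2-D map (list of rows) onto an out_size x out_size grid.

    Bin i covers rows floor(i*h/out_size) up to ceil((i+1)*h/out_size) - 1,
    so every input cell falls into at least one bin and every bin is non-empty.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(out_size):
        r0 = (i * h) // out_size
        r1 = max(r0 + 1, _ceil_div((i + 1) * h, out_size))
        row = []
        for j in range(out_size):
            c0 = (j * w) // out_size
            c1 = max(c0 + 1, _ceil_div((j + 1) * w, out_size))
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# Usage: inputs of different sizes are mapped to the same 4 x 4 grid.
small = [[r * 7 + c for c in range(7)] for r in range(5)]    # 5 x 7 map
large = [[r * 9 + c for c in range(9)] for r in range(12)]   # 12 x 9 map
grid_a = adaptive_max_pool(small, 4)
grid_b = adaptive_max_pool(large, 4)
```

Because the bin boundaries scale with the input, the output grid size depends only on `out_size`, which is what removes the fixed-input-size constraint.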
Further, in step five, LDA is used to reduce the dimensionality of the feature matrix. The global scatter matrix $S_t$ is defined as:
$$S_t = \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^T$$
where $m$ is the total number of samples, $x_i$ is the $i$-th sample vector, $\mu$ is the mean vector of all samples, and $T$ denotes the matrix transpose.
The within-class scatter matrix $S_\omega$ is defined as:
$$S_\omega = \sum_{i=1}^{N} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T$$
where $N$ is the total number of sample classes, $X_i$ is the sample matrix of the $i$-th class, $x$ is each sample vector of the $i$-th class, and $\mu_i$ is the mean vector of all samples of the $i$-th class.
The between-class scatter matrix $S_b$ is defined as:
$$S_b = S_t - S_\omega$$
The optimization objective is thus defined as:
$$\max_{W} \frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_\omega W)}$$
where $W \in \mathbb{R}^{d \times (N-1)}$ is a matrix formed by $N-1$ eigenvectors. Solving the optimization objective yields a projection matrix composed of a set of optimal discriminant vectors; this matrix projects the N-dimensional feature space and outputs an (N-1)-dimensional low-dimensional feature space.
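To make the three scatter matrices concrete, the following sketch computes the global, within-class, and between-class scatter matrices for a toy data set, using the standard LDA definitions (sums of outer products of deviations from the global mean and from the class means, with the between-class matrix obtained by subtraction as stated above). The data values and function names are made up for illustration, and the code stops short of solving the eigenvalue problem for the projection matrix.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    d = len(samples[0])
    return [sum(x[k] for x in samples) / len(samples) for k in range(d)]

def scatter(samples, center):
    """Sum over samples of the outer product (x - center)(x - center)^T."""
    d = len(center)
    s = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [x[k] - center[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s

def lda_scatter_matrices(classes):
    """classes: list of per-class sample lists. Returns (S_t, S_w, S_b)."""
    all_samples = [x for cls in classes for x in cls]
    mu = mean_vector(all_samples)
    d = len(mu)
    s_t = scatter(all_samples, mu)               # global scatter
    s_w = [[0.0] * d for _ in range(d)]          # within-class scatter
    for cls in classes:
        s_i = scatter(cls, mean_vector(cls))
        for a in range(d):
            for b in range(d):
                s_w[a][b] += s_i[a][b]
    # Between-class scatter by subtraction: S_b = S_t - S_w.
    s_b = [[s_t[a][b] - s_w[a][b] for b in range(d)] for a in range(d)]
    return s_t, s_w, s_b

# Toy data: two classes in a 2-D feature space (hypothetical values).
classes = [[[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]],
           [[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]]]
s_t, s_w, s_b = lda_scatter_matrices(classes)
```

For this toy data the subtraction agrees with the equivalent closed form, the sum over classes of the class size times the outer product of the class-mean deviation from the global mean, which is a useful sanity check on any implementation.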
Further, the network uses overlapping max pooling, that is, adjacent pooling windows overlap, which enriches the features and helps avoid overfitting.
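A minimal sketch of overlapping max pooling follows. The window size of 3 and stride of 2 are assumptions borrowed from the original AlexNet design, since the patent does not state its pooling parameters; pooling overlaps whenever the stride is smaller than the window.

```python
def max_pool_2d(feature_map, window=3, stride=2):
    """Max pooling without padding; windows overlap when stride < window."""
    h, w = len(feature_map), len(feature_map[0])
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    return [[max(feature_map[y * stride + dy][x * stride + dx]
                 for dy in range(window)
                 for dx in range(window))
             for x in range(out_w)]
            for y in range(out_h)]

# Usage: on a 5 x 5 map, the two horizontal windows cover columns 0-2 and 2-4,
# so column 2 contributes to both outputs.
fmap = [[r * 5 + c for c in range(5)] for r in range(5)]
pooled = max_pool_2d(fmap)
```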
Further, data in the neural network is passed through an activation function; the activation function used is the leaky rectified linear unit, Leaky ReLU. This function retains some information from the negative axis, so that information in the negative interval is not completely lost after the feature information passes through the function, and the resulting information is more complete.
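The behavior described here can be sketched in a few lines. The negative slope of 0.01 is a commonly used default and is an assumption; the patent does not give a value.

```python
def leaky_relu(x, negative_slope=0.01):
    """Identity for x >= 0; a small linear response for x < 0."""
    return x if x >= 0 else negative_slope * x

def relu(x):
    """Standard ReLU, shown for comparison: negative inputs are zeroed."""
    return max(0.0, x)

# Usage: a negative activation keeps a scaled-down value instead of becoming 0.
activations = [-3.0, -0.5, 0.0, 2.0]
leaky = [leaky_relu(a) for a in activations]
plain = [relu(a) for a in activations]
```

With standard ReLU, the first two activations would both map to 0 and become indistinguishable; with Leaky ReLU they remain distinct, which is the information-retention property the text refers to.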
Compared with the prior art, the invention adopting the above technical scheme has the following technical effects:
1. The invention does not limit the size of the input image. Through the multi-scale convolutional layer structure, the network accepts images of any size, largely preserving image feature information and improving the final accuracy.
2. The invention further reduces the number of model parameters and speeds up the network. The LDA algorithm is used to learn similarity, which enhances the discriminative power of the features. LDA also reduces dimensionality without destroying the main feature information of the image, making the network more efficient and faster.
3. The improved image classification algorithm based on the convolutional neural network reduces the dependence of the network model on the number of samples and, by adopting the LDA algorithm and multi-scale convolution, further reduces the number of parameters, simplifies computation, and improves the accuracy of image classification.
Drawings
Fig. 1 is a diagram of an improved multi-scale convolutional layer structure proposed by the present invention.
Fig. 2 is a network architecture diagram of the improved image classification algorithm of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in Fig. 1, the improved multi-scale convolutional layer structure proposed for the image classification improvement method maps the feature map at three different scales, 8 × 8, 6 × 6, and 4 × 4, and then convolves the mapped feature maps with convolution kernels of three different sizes, 2 × 2, 3 × 3, and 1 × 1, with strides S2, S1, and S1 respectively. The number of convolution kernels is 256 and Leaky ReLU is the activation function, so that a feature map of fixed size is obtained regardless of the size of the input image.
As shown in Fig. 2, the method first applies image preprocessing to obtain an image with distinct feature information and less interference. The preprocessed image is then input into the neural network, where features are extracted by two convolution-pooling layer groups; these use larger convolution kernels to extract the more salient edge features of the image. The resulting feature map then passes through four consecutive convolutional layers with the same kernel size, which extract more feature information from the different channels of the image and fully fuse the multi-channel information to obtain a more abstract and more representative feature map. In this process, the chosen convolution kernel size and a suitable stride ensure that the image size does not change across the series of convolutions. After the pyramid pooling layer, input images of different sizes are all converted into feature maps of fixed size, on which the LDA algorithm performs calculation and classification to obtain the final predicted class label of the input image. The specific implementation process is as follows:
step one: apply a series of preprocessing operations such as image enhancement, filtering, and noise reduction to the input image to reduce interference with image feature extraction;
step two: perform a convolution operation on the preprocessed image to extract image features, pool using max pooling to obtain a feature map, then reduce the convolution kernel size and perform the convolution and pooling operations again to obtain a more abstract feature map for the feature fusion in the next step;
step three: convolve the feature map obtained in step two through several consecutive convolutional layers to fully fuse the features of different channels, so that the resulting feature map is more abstract and representative, and send the feature map into a pyramid pooling layer for pooling;
step four: map the feature map at different scales, then convolve each mapped feature map with a convolution kernel of a different size to obtain a feature map of fixed size;
step five: further reduce the dimensionality of the feature map and classify it using the LDA method: project the features with LDA, substitute the projected sample feature information into a probability density function, and calculate the probability that the sample belongs to each class; the class with the maximum probability is the predicted class of the image.
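The classification rule in step five can be sketched as follows. The sketch assumes the probability density function is a Gaussian with a per-class mean and a variance shared across classes, which is the usual modelling assumption behind LDA; the patent does not name the density. The projected features are taken to be one-dimensional for brevity, and the values, labels, and function names are hypothetical.

```python
import math

def fit_gaussian_classes(projected, labels):
    """Estimate class means, a pooled variance, and class priors from 1-D projections."""
    groups = {}
    for z, y in zip(projected, labels):
        groups.setdefault(y, []).append(z)
    means = {y: sum(zs) / len(zs) for y, zs in groups.items()}
    n = len(projected)
    # Pooled (shared) variance across classes.
    pooled_var = sum((z - means[y]) ** 2
                     for z, y in zip(projected, labels)) / (n - len(groups))
    priors = {y: len(zs) / n for y, zs in groups.items()}
    return means, pooled_var, priors

def class_probabilities(z, means, var, priors):
    """Posterior P(class | z) under Gaussian class-conditional densities."""
    scores = {y: priors[y]
                 * math.exp(-(z - m) ** 2 / (2.0 * var))
                 / math.sqrt(2.0 * math.pi * var)
              for y, m in means.items()}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}

def predict(z, means, var, priors):
    """Return the most probable class and the full probability distribution."""
    probs = class_probabilities(z, means, var, priors)
    return max(probs, key=probs.get), probs

# Usage with hypothetical projected training features for two classes.
projected = [-1.2, -0.8, -1.0, 0.9, 1.1, 1.0]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
means, var, priors = fit_gaussian_classes(projected, labels)
label, probs = predict(0.7, means, var, priors)
```

The returned dictionary corresponds to the probability distribution information mentioned in the step, and the returned label is the class with the maximum probability.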
The multi-scale convolutional layer in the network structure comprises multiple layers, with the following structure:
The feature map is mapped at three different scales: 8 × 8, 6 × 6, and 4 × 4. Each mapped feature map is then convolved with its own convolution kernel, with strides S2, S1, and S1 respectively, kernel sizes 2 × 2, 3 × 3, and 1 × 1 respectively, and 256, 256, and 256 kernels respectively; Leaky ReLU is the activation function. With this configuration, a feature map of fixed size is obtained regardless of the size of the input image, so the size of the input image is not limited.
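The fixed output size follows from the usual convolution output-size formula, floor((n + 2p − k) / s) + 1. The sketch below applies it to the three branches, assuming no padding and assuming the scales, kernel sizes, and strides are paired in the order listed; both are assumptions, as the patent does not state them explicitly.

```python
def conv_output_size(input_size, kernel, stride, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel) // stride + 1

# (mapped scale, kernel size, stride) for the three branches, in listed order.
branches = [(8, 2, 2), (6, 3, 1), (4, 1, 1)]
sizes = [conv_output_size(n, k, s) for n, k, s in branches]
# sizes == [4, 4, 4]: every branch yields a 4 x 4 map with 256 channels.
```

Under these assumptions, all three branches produce feature maps of identical spatial size, independent of the size of the image fed into the network.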
The foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the invention, and these modifications and refinements shall also fall within the protection scope of the invention.
Claims (7)
1. An image classification improvement method based on a convolutional neural network, characterized by comprising the following steps:
step one: apply image enhancement, filtering, and noise reduction preprocessing operations to the input image to reduce interference with image feature extraction;
step two: perform a convolution operation on the preprocessed image to extract image features, then pool using max pooling, which keeps the pixel with the maximum value in each receptive field and discards the others, so that the feature map shrinks while the key information of the image is retained; then reduce the convolution kernel size, perform the convolution and pooling operations again, and output a more abstract feature map;
step three: convolve the feature map output by the convolution and pooling operations of step two through several consecutive convolutional layers to fully fuse the features of different channels, and send the feature map into a pyramid pooling layer for pooling;
step four: perform convolution operations on the feature map through the multi-scale convolutional layer to obtain a feature map of fixed size;
step five: further reduce the dimensionality of the feature map and classify it using the LDA method: project the features with LDA, substitute the projected sample feature information into a probability density function to obtain probability distribution information, and output the calculation result and the predicted category.
2. The convolutional neural network-based image classification improvement method according to claim 1, wherein: in step one, Gaussian filtering is applied to the input image to suppress noise and smooth the image, and the image is also flipped and its color, saturation, and contrast are adjusted.
3. The convolutional neural network-based image classification improvement method according to claim 1, wherein: in step three, four consecutive convolutional layers are used to fully fuse the features of different channels of the image.
4. The convolutional neural network-based image classification improvement method according to claim 1, wherein: in step four, the feature map is mapped at three different scales and each mapped feature map is convolved with its own convolution operation, so that a feature map of fixed size is obtained regardless of the size of the input image.
5. The convolutional neural network-based image classification improvement method according to claim 1, wherein: in step five, LDA is used to reduce the dimensionality of the feature matrix, and the global scatter matrix $S_t$ is defined as:
$$S_t = \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^T$$
where $m$ is the total number of samples, $x_i$ is the $i$-th sample vector, $\mu$ is the mean vector of all samples, and $T$ denotes the matrix transpose;
the within-class scatter matrix $S_\omega$ is defined as:
$$S_\omega = \sum_{i=1}^{N} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^T$$
where $N$ is the total number of sample classes, $X_i$ is the sample matrix of the $i$-th class, $x$ is each sample vector of the $i$-th class, and $\mu_i$ is the mean vector of all samples of the $i$-th class;
the between-class scatter matrix $S_b$ is defined as:
$$S_b = S_t - S_\omega$$
the optimization objective is thus defined as:
$$\max_{W} \frac{\operatorname{tr}(W^T S_b W)}{\operatorname{tr}(W^T S_\omega W)}$$
where $W \in \mathbb{R}^{d \times (N-1)}$ is a matrix composed of $N-1$ eigenvectors; solving the optimization objective yields a projection matrix composed of a set of optimal discriminant vectors, and this matrix projects the N-dimensional feature space and outputs an (N-1)-dimensional low-dimensional feature space.
6. The convolutional neural network-based image classification improvement method according to claim 1, wherein: the network uses overlapping max pooling, that is, adjacent pooling windows overlap, which enriches the features and helps avoid overfitting.
7. The convolutional neural network-based image classification improvement method according to claim 1, wherein: data in the neural network is passed through an activation function, and the activation function used is the leaky rectified linear unit, Leaky ReLU.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910624323.0A CN110321967B (en) | 2019-07-11 | 2019-07-11 | Image classification improvement method based on convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910624323.0A CN110321967B (en) | 2019-07-11 | 2019-07-11 | Image classification improvement method based on convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110321967A (en) | 2019-10-11 |
CN110321967B (en) | 2021-06-01 |
Family
ID=68121994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910624323.0A Active CN110321967B (en) | 2019-07-11 | 2019-07-11 | Image classification improvement method based on convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110321967B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110956611A (en) * | 2019-11-01 | 2020-04-03 | 武汉纺织大学 | Smoke detection method integrated with convolutional neural network |
CN111178507B (en) * | 2019-12-26 | 2024-05-24 | 集奥聚合(北京)人工智能科技有限公司 | Atlas convolution neural network data processing method and apparatus |
CN111325149B (en) * | 2020-02-20 | 2023-05-26 | 中山大学 | Video action recognition method based on time sequence association model of voting |
CN112132145B (en) * | 2020-08-03 | 2023-08-01 | 深圳大学 | Image classification method and system based on model extended convolutional neural network |
CN112731410B (en) * | 2020-12-25 | 2021-11-05 | 上海大学 | Underwater target sonar detection method based on CNN |
CN113052189B (en) * | 2021-03-30 | 2022-04-29 | 电子科技大学 | Improved MobileNet V3 feature extraction network |
CN113205111B (en) * | 2021-04-07 | 2023-05-26 | 零氪智慧医疗科技(天津)有限公司 | Identification method and device suitable for liver tumor and electronic equipment |
CN113435389B (en) * | 2021-07-09 | 2024-03-01 | 大连海洋大学 | Chlorella and golden algae classification and identification method based on image feature deep learning |
CN114387467B (en) * | 2021-12-09 | 2022-07-29 | 哈工大(张家口)工业技术研究院 | Medical image classification method based on multi-module convolution feature fusion |
CN115239928B (en) * | 2022-09-05 | 2022-12-20 | 四川蜀天信息技术有限公司 | 3D data large-screen visualization system based on GIS |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106096605B (en) * | 2016-06-02 | 2019-03-19 | 史方 | A kind of image obscuring area detection method and device based on deep learning |
EP3255586A1 (en) * | 2016-06-06 | 2017-12-13 | Fujitsu Limited | Method, program, and apparatus for comparing data graphs |
CN106897739B (en) * | 2017-02-15 | 2019-10-22 | 国网江苏省电力公司电力科学研究院 | A kind of grid equipment classification method based on convolutional neural networks |
Also Published As
Publication number | Publication date |
---|---|
CN110321967A (en) | 2019-10-11 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110321967B (en) | Image classification improvement method based on convolutional neural network | |
WO2020238293A1 (en) | Image classification method, and neural network training method and apparatus | |
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
Bouti et al. | A robust system for road sign detection and classification using LeNet architecture based on convolutional neural network | |
US10846566B2 (en) | Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks | |
Gomez-Ojeda et al. | Training a convolutional neural network for appearance-invariant place recognition | |
CN111191583B (en) | Space target recognition system and method based on convolutional neural network | |
Mao et al. | Deep residual pooling network for texture recognition | |
CN109002755B (en) | Age estimation model construction method and estimation method based on face image | |
CN106126581A (en) | Cartographical sketching image search method based on degree of depth study | |
CN109063719B (en) | Image classification method combining structure similarity and class information | |
CN110175615B (en) | Model training method, domain-adaptive visual position identification method and device | |
CN110569738A (en) | natural scene text detection method, equipment and medium based on dense connection network | |
CN108230330B (en) | Method for quickly segmenting highway pavement and positioning camera | |
CN111191626A (en) | Fine identification method for multi-category vehicles | |
CN111310820A (en) | Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration | |
CN110008899B (en) | Method for extracting and classifying candidate targets of visible light remote sensing image | |
Madan et al. | Traffic Sign Classification using Hybrid HOG-SURF Features and Convolutional Neural Networks. | |
CN109543546B (en) | Gait age estimation method based on depth sequence distribution regression | |
CN108664968B (en) | Unsupervised text positioning method based on text selection model | |
CN114882278A (en) | Tire pattern classification method and device based on attention mechanism and transfer learning | |
Sun et al. | Multiple-kernel, multiple-instance similarity features for efficient visual object detection | |
Khlif et al. | Learning text component features via convolutional neural networks for scene text detection | |
Özyurt et al. | A new method for classification of images using convolutional neural network based on Dwt-Svd perceptual hash function | |
Akhand et al. | Multiple convolutional neural network training for Bangla handwritten numeral recognition |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |