CN113469198A - Image classification method based on improved VGG convolutional neural network model - Google Patents

Image classification method based on improved VGG convolutional neural network model

Info

Publication number: CN113469198A
Authority: CN (China)
Prior art keywords: neural network, module, convolutional neural, attention mechanism, network model
Legal status: Pending
Application number: CN202110734218.XA
Other languages: Chinese (zh)
Inventors: 刘一柳, 王志胜, 马瑞
Current Assignee: Nanjing University of Aeronautics and Astronautics
Original Assignee: Nanjing University of Aeronautics and Astronautics
Priority date / Filing date: 2021-06-30
Publication date: 2021-10-01
Application filed by Nanjing University of Aeronautics and Astronautics

Classifications

    • G — Physics; G06 — Computing; calculating or counting
    • G06F — Electric digital data processing; G06F18/00 — Pattern recognition; G06F18/20 — Analysing; G06F18/24 — Classification techniques
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N — Computing arrangements based on specific computational models; G06N3/00 — Computing arrangements based on biological models; G06N3/02 — Neural networks; G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/045 — Combinations of networks
    • G06N3/047 — Probabilistic or stochastic networks
    • G06N3/048 — Activation functions
    • G06N3/08 — Learning methods


Abstract

The invention discloses an image classification method based on an improved VGG convolutional neural network model, comprising the following steps. Step 1: establish an attention mechanism module. Step 2: add the attention mechanism to the VGG convolutional neural network model to obtain an attention-based VGG convolutional neural network model. Step 3: train the attention-based VGG convolutional neural network model with the preprocessed training set and test its classification results with the preprocessed test set; when the number of training iterations reaches the preset maximum or the model converges, stop training to obtain the finally trained attention-based VGG convolutional neural network model. Step 4: classify images with the trained attention-based VGG convolutional neural network model. The invention improves image classification accuracy.

Description

Image classification method based on improved VGG convolutional neural network model
Technical Field
The invention belongs to the field of image classification.
Background
Thanks to the rapid development of hardware, deep learning has attracted great attention in computer vision. As a branch of deep learning, convolutional neural networks exhibit extremely strong capability when processing images. In image classification, convolutional neural networks such as VGG and ResNet realize a supervised learning process from feature extraction to classification in an end-to-end manner. However, convolutional neural networks convert features from low-level to high-level semantics through a large number of convolutional layers, which inevitably produces a great amount of feature redundancy. Attention mechanisms aim to let a convolutional neural network learn useful information effectively and eliminate redundant information, i.e., to make the network focus on more discriminative features while suppressing redundant ones. However, the channel attention of SENet acquires the global relationship through global average pooling and thereby loses much spatial information, while the hybrid attention of BAM establishes attention in the spatial domain and the channel domain separately but uses convolution kernels with local receptive fields in the spatial domain, so a global dependency relationship remains difficult to acquire.
Disclosure of Invention
Purpose of the invention: to solve the problems in the prior art, the invention provides an image classification method based on an improved VGG convolutional neural network model.
The technical scheme is as follows: the invention provides an image classification method based on an improved VGG convolutional neural network model, which specifically comprises the following steps:
Step 1: establish an attention mechanism module;
Step 2: add the attention mechanism to the VGG convolutional neural network model to obtain an attention-based VGG convolutional neural network model;
Step 3: preset a training set and a test set and preprocess their images; train the attention-based VGG convolutional neural network model with the preprocessed training set and test its classification results with the preprocessed test set, thereby adjusting the parameters of the model; when the number of training iterations reaches the preset maximum or the model converges, stop training to obtain the finally trained attention-based VGG convolutional neural network model;
Step 4: classify images with the attention-based VGG convolutional neural network model trained in step 3.
Further, the attention mechanism module in step 1 comprises an average pooling layer, a first dimension permutation module, a first self-attention module, a second dimension permutation module, a second self-attention module, a normalization layer and a calibration module;
the average pooling layer spatially average-pools the feature $U \in \mathbb{R}^{C \times H \times W}$ input to the attention mechanism to obtain $X$, where $C$ is the number of channels of the input feature, $H$ denotes its spatial height and $W$ denotes its spatial width;
the first dimension permutation module permutes the dimensions of $X$, specifically: $X$ is divided spatially and equally into $Q$ feature groups with $P$ elements per group, where $P$ and $Q$ are both hyperparameters and $P \times Q = H \times W$; the $t$-th elements of the $Q$ feature groups form the $t$-th column vector $X^L_t \in \mathbb{R}^{Q \times 1}$;
the number of first self-attention modules is $P$; $X^L_t$ serves as the input of the $t$-th first self-attention module, which produces the output
$Z^L_t = \mathrm{Softmax}\left(X^L_t \otimes (X^L_t)^{\mathrm{T}}\right) \otimes X^L_t, \qquad Z^L = [Z^L_1, \ldots, Z^L_t, \ldots, Z^L_P],$
where $\otimes$ denotes the inner product, $\mathrm{Softmax}$ is the probability-distribution function and $(\cdot)^{\mathrm{T}}$ is the transposition symbol;
the second dimension permutation module permutes the dimensions of $Z^L$, specifically: $Z^L$ is divided spatially and equally into $P$ feature groups with $Q$ elements per group; the $k$-th elements of the $P$ feature groups form the $k$-th column vector $Z^S_k \in \mathbb{R}^{P \times 1}$;
the number of second self-attention modules is $Q$; $Z^S_k$ is input to the $k$-th second self-attention module, which produces the output
$Y^S_k = \mathrm{Softmax}\left(Z^S_k \otimes (Z^S_k)^{\mathrm{T}}\right) \otimes Z^S_k, \qquad Y^S = [Y^S_1, \ldots, Y^S_k, \ldots, Y^S_Q];$
the normalization layer normalizes $Y^S$ with a sigmoid function to obtain $\tilde{Y} = \sigma(Y^S)$;
the calibration module calibrates the input feature $U$ according to the following formula to finally obtain the output of the attention mechanism:
$\hat{U} = \tilde{Y} \odot U,$
where $\odot$ denotes multiplying $\tilde{Y}$ with the corresponding spatial positions in $U$, propagated along the channel direction.
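For illustration, the per-column self-attention operation defined above can be written as a minimal PyTorch sketch; the shapes and names are assumptions for illustration, not the patented implementation:

    import torch
    import torch.nn.functional as F

    def column_self_attention(x):
        """Z = Softmax(x x^T) x for one column vector x of shape (Q, 1)."""
        a = F.softmax(x @ x.t(), dim=-1)  # (Q, Q) relation matrix, rows sum to 1
        return a @ x                      # (Q, 1) re-weighted column

    x_t = torch.randn(16, 1)              # hypothetical column with Q = 16
    z_t = column_self_attention(x_t)      # same shape as the input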
Further, the attention-based VGG convolutional neural network model in step 2 comprises a first feature extraction module, a second feature extraction module, a third feature extraction module, a fourth feature extraction module, a fifth feature extraction module, a fully connected layer and a Softmax classifier connected in sequence; the first and second feature extraction modules have the same structure, each comprising a first convolution operation module, a second convolution operation module, a first attention mechanism and a first max pooling layer connected in sequence; the third feature extraction module comprises a third convolution operation module, a fourth convolution operation module, a fifth convolution operation module, a second attention mechanism and a second max pooling layer connected in sequence; the fourth and fifth feature extraction modules have the same structure, each comprising a sixth convolution operation module, a seventh convolution operation module, an eighth convolution operation module and a third max pooling layer connected in sequence; the first, second, third, fourth, fifth and sixth convolution operation modules have the same structure, each comprising a convolutional layer, a ReLU activation function and a batch normalization layer connected in sequence; the first attention mechanism and the second attention mechanism are both the attention mechanism of step 1.
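A minimal sketch of how such a model could be assembled in PyTorch is given below; the attention_cls argument stands for the attention mechanism of step 1 (a sketch of it appears later in the description), and all names and details are illustrative assumptions rather than the inventors' code:

    import torch.nn as nn

    def conv_block(c_in, c_out):
        # "convolution operation module": 3x3 conv -> ReLU -> batch norm, as described
        return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                             nn.ReLU(inplace=True),
                             nn.BatchNorm2d(c_out))

    class AttentionVGG(nn.Module):
        def __init__(self, attention_cls, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                # stages 1-2: conv, conv, attention, max-pool
                conv_block(3, 64), conv_block(64, 64), attention_cls(), nn.MaxPool2d(2),
                conv_block(64, 128), conv_block(128, 128), attention_cls(), nn.MaxPool2d(2),
                # stage 3: conv x3, attention, max-pool
                conv_block(128, 256), conv_block(256, 256), conv_block(256, 256),
                attention_cls(), nn.MaxPool2d(2),
                # stages 4-5: conv x3, max-pool (no attention)
                conv_block(256, 512), conv_block(512, 512), conv_block(512, 512),
                nn.MaxPool2d(2),
                conv_block(512, 1024), conv_block(1024, 1024), conv_block(1024, 1024),
                nn.MaxPool2d(2))
            self.classifier = nn.Linear(1024, num_classes)  # Softmax is applied in the loss

        def forward(self, x):
            x = self.features(x).flatten(1)  # (N, 1024) after the 1x1 spatial output
            return self.classifier(x)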
Further, the preprocessing in step 3 specifically comprises sequentially performing horizontal flipping, mirroring, cropping and standard normalization on each image in the training set and the test set.
Advantageous effects:
1. The invention introduces an attention mechanism into the VGG convolutional neural network, so that the network emphasizes useful information more effectively and eliminates redundant information, improving its ability to discriminate features.
2. The attention mechanism of the invention forms self-attention groupings twice, at long and short distances in the spatial domain, finally establishing a global dependency relationship and acquiring better context information.
3. Convolution in a convolutional neural network captures only local relationships, so its receptive field is limited; the attention mechanism of the invention acquires a global relationship by combining long-distance and short-distance relationships, extending the receptive field to the whole feature map. This overcomes the locality of convolution in the VGG convolutional neural network, lets the network acquire better context information, and improves its classification performance.
Drawings
FIG. 1 is a network model diagram of the attention mechanism of the present invention.
FIG. 2 is a diagram of a VGG convolutional neural network classification model based on an attention mechanism.
Detailed Description
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention.
The embodiment provides an image classification method based on an improved VGG convolutional neural network model, which comprises the following steps:
(1) and acquiring a natural color image with a label from the CIFAR-10 data set, preprocessing all images, and dividing the data set into a training set and a testing set. There are 10 different types of images in the training set for this embodiment.
(2) And combining the attention mechanism with the VGG convolutional neural network model to obtain the VGG convolutional neural network based on the attention mechanism.
(3) And inputting the preprocessed images of the training set into a convolutional neural network of VGG (convolutional neural network) based on an attention mechanism, and adjusting network parameters to finish network model training. Then, inputting the images of the test set into the trained network model, and evaluating the quality of the network model by using the index of classification accuracy; and finally, classifying the images by adopting a trained attention-based VGG convolutional neural network model.
The image preprocessing in step (1) consists of sequentially flipping each image horizontally, mirroring, cropping and standard-normalizing it. The purpose is data augmentation, which improves generalization ability.
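Assuming standard torchvision transforms are acceptable stand-ins for these operations, the preprocessing could look like the following sketch; the normalization statistics are the commonly used CIFAR-10 values, an assumption not stated in the patent:

    import torchvision.transforms as T
    from torchvision.datasets import CIFAR10

    # horizontal flip (mirroring), random crop, and standard normalization
    train_tf = T.Compose([
        T.RandomHorizontalFlip(),             # flip / mirror augmentation
        T.RandomCrop(32, padding=4),          # cropping augmentation
        T.ToTensor(),
        T.Normalize((0.4914, 0.4822, 0.4465), # assumed CIFAR-10 channel means
                    (0.2470, 0.2435, 0.2616)) # assumed channel standard deviations
    ])
    train_set = CIFAR10(root="./data", train=True, download=True, transform=train_tf)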
In step (2), the attention mechanism is combined with the VGG convolutional neural network model by embedding the attention mechanism into the model. The attention mechanism establishes a global relation model in the spatial dimension, which solves the problem that convolutional layers in a convolutional network have a small receptive field. As shown in fig. 1, the attention mechanism processes the input features in the following steps:
Step 2.1: regard the input feature $U \in \mathbb{R}^{C \times H \times W}$ as a set of feature vectors in the spatial dimension, $U = \{u_{1,1}, \ldots, u_{i,j}, \ldots, u_{H,W}\}$ with $u_{i,j} \in \mathbb{R}^{C}$, where $C$ is the number of feature channels, $H$ the spatial height and $W$ the spatial width; $(i, j)$ denotes a spatial position, $i$ the $i$-th row, $i \in \{1, 2, \ldots, H\}$, and $j$ the $j$-th column, $j \in \{1, 2, \ldots, W\}$. Compress the input feature along the channel dimension with the average pooling layer, taking the mean at each spatial position, to obtain $X = \{x_{1,1}, \ldots, x_{i,j}, \ldots, x_{H,W}\}$, where the element $x_{i,j} = \frac{1}{C} \sum_{c=1}^{C} u_{i,j}(c)$ is the average of the features at spatial position $(i, j)$;
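Step 2.1 is simply a channel-wise mean at every spatial position; a two-line sketch under assumed shapes:

    import torch

    u = torch.randn(1, 64, 16, 16)      # input feature U: (N, C, H, W)
    x = u.mean(dim=1, keepdim=True)     # X: (N, 1, H, W); x[0, 0, i, j] averages the C channels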
Step 2.2: establish the long-distance relation in the spatial dimension, specifically:
Step 2.2.1: using the first dimension permutation module, divide $X$ spatially and equally into $Q$ feature groups with $P$ elements per group (i.e., into $P \times Q$ parts), where $P$ and $Q$ are both hyperparameters and $P \times Q = H \times W$; in this embodiment $P = 4$ and, accordingly, $Q = (H \times W)/4$. Permute the dimensions of $X$ according to
$X^L = \mathrm{Permute}(X) = [X^L_1, \ldots, X^L_t, \ldots, X^L_P]^{\mathrm{T}}$
to obtain the long-distance relation between any element of $X^L_t$ and all elements of $X^L_t$; here $\mathrm{Permute}$ denotes dimension permutation, $(\cdot)^{\mathrm{T}}$ is the transposition symbol, and $X^L_t \in \mathbb{R}^{Q \times 1}$ is the $t$-th column vector (the $t$-th set), formed by the $t$-th element of each of the $Q$ feature groups into which the input $X$ was divided.
Step 2.2.2: adopt first self-attention modules, $P$ in number, one column vector per first self-attention module. The $t$-th first self-attention module first computes, for $X^L_t$,
$A_t = \mathrm{Softmax}\left(X^L_t \otimes (X^L_t)^{\mathrm{T}}\right),$
where $\otimes$ is the inner product and $\mathrm{Softmax}$ maps to a probability distribution; $A_t \in \mathbb{R}^{Q \times Q}$ is the long-distance relation matrix, which represents the similarity between two positions and whose elements are constrained between 0 and 1 by the Softmax function.
The $t$-th first self-attention module then multiplies the long-distance relation matrix $A_t$ with the self-attention input $X^L_t$ to obtain the relation between each position and all positions, specifically:
$Z^L_t = A_t \otimes X^L_t,$
where $Z^L_t$ denotes the self-attention output. All $P$ column vectors of the first self-attention modules are merged, i.e.
$Z^L = [Z^L_1, \ldots, Z^L_t, \ldots, Z^L_P].$
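A sketch of step 2.2 under the stated grouping; how the Q groups tile the H × W plane is not fixed by the text, so row-major grouping is an assumption here:

    import torch
    import torch.nn.functional as F

    def grouped_self_attention(cols):
        """cols: (num_vectors, length, 1); applies Z = Softmax(x x^T) x per column."""
        a = F.softmax(cols @ cols.transpose(-2, -1), dim=-1)  # relation matrices
        return a @ cols

    P = 4
    x = torch.randn(16, 16)                     # pooled map X for one image (H = W = 16)
    Q = x.numel() // P                          # Q = H*W/P = 64
    groups = x.reshape(Q, P)                    # Q groups of P elements (assumed layout)
    cols_long = groups.t().unsqueeze(-1)        # (P, Q, 1): the P long-distance columns
    z_long = grouped_self_attention(cols_long)  # Z^L, shape (P, Q, 1)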
Step 2.3: establishing a short-distance relation in a spatial dimension, which is similar to the established long-distance relation in the spatial dimension, specifically:
step 2.3.1: using a second dimension permutation module to permute ZLThe space is divided into P characteristic groups in average, one characteristic group has Q elements (also divided into Q multiplied by P parts in space), and the Z is calculated according to the following short distance relation formulaLAnd (3) processing:
Figure BDA00031409470400000515
Figure BDA0003140947040000061
indicates to input ZLDividing the vector into P feature groups and then forming a kth column vector by the kth element in each feature group;
step 2.3.2: with Q second self-attentive modules, one for each column vector, the method will be described
Figure BDA0003140947040000062
As input to the kth second self-attention mode, an output is obtained
Figure BDA0003140947040000063
Figure BDA0003140947040000064
Wherein,
Figure BDA0003140947040000065
Figure BDA0003140947040000066
for a short distance relationship matrix, representing the similarity between two positions, the matrix elements are constrained between 0-1 by the Softmax function.
All Q column vector combinations applying self-attention, i.e.
Figure BDA0003140947040000067
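Step 2.3 reuses the same operation with the opposite grouping; continuing the sketch from step 2.2:

    # regroup Z^L into P groups of Q elements; the k-th elements across groups
    # form the k-th short-distance column vector
    cols_short = z_long.squeeze(-1).t().unsqueeze(-1)  # (Q, P, 1): Q columns of length P
    y_short = grouped_self_attention(cols_short)       # Y^S, shape (Q, P, 1)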
Step 2.4: normalizing Y by using sigmoid functionSNormalizing, and then adopting a calibration module to recalibrate the original characteristic input U, specifically comprising the following steps:
Figure BDA0003140947040000068
wherein
Figure BDA0003140947040000069
Represents
Figure BDA00031409470400000610
Multiplied by the corresponding spatial position in U and propagated along the channel direction.
Figure BDA00031409470400000611
The output of the overall attention mechanism is shown.
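Putting steps 2.1 to 2.4 together, a parameter-free module could look like the following sketch; the grouping layout and all names are assumptions consistent with the earlier snippets:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionModule(nn.Module):
        """Spatial attention per steps 2.1-2.4: channel-mean pooling, long- and
        short-distance grouped self-attention, sigmoid gating, recalibration."""
        def __init__(self, p=4):
            super().__init__()
            self.p = p

        @staticmethod
        def _self_attn(cols):  # cols: (N, V, L, 1)
            a = F.softmax(cols @ cols.transpose(-2, -1), dim=-1)
            return a @ cols

        def forward(self, u):  # u: (N, C, H, W)
            n, c, h, w = u.shape
            q = h * w // self.p
            x = u.mean(dim=1).reshape(n, q, self.p)                # steps 2.1 + grouping
            z = self._self_attn(x.transpose(1, 2).unsqueeze(-1))   # step 2.2: (N, P, Q, 1)
            y = self._self_attn(z.squeeze(-1).transpose(1, 2).unsqueeze(-1))  # step 2.3
            gate = torch.sigmoid(y.squeeze(-1)).reshape(n, 1, h, w)  # step 2.4: sigmoid gate
            return u * gate                      # broadcast along the channel dimension

    # usage: u_hat = AttentionModule(p=4)(torch.randn(2, 64, 16, 16))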
In step (2), the structure of the attention-based VGG convolutional neural network is shown in fig. 2. The preprocessed training set is taken as the input of the attention-based VGG convolutional neural network and passes sequentially through a first, second, third, fourth, fifth and sixth processing stage, of which the first to fifth are feature extraction stages. The preprocessed images have a resolution of 32 × 32 and 3 channels (R, G, B).
First processing stage: after two 3 × 3 convolution operation modules (the convolution combination in fig. 2; each convolution operation module comprises a 3 × 3 convolutional layer, a ReLU activation function layer and a BN batch normalization layer connected in sequence) and the attention mechanism, the number of output feature channels is 64 and the output is (32, 32, 64); after a 2 × 2 max pooling layer the output is (16, 16, 64).
The second stage has the same processing flow as the first: after two 3 × 3 convolution operation modules and the attention mechanism, the number of output feature channels is 128 and the output is (16, 16, 128); after a 2 × 2 max pooling layer the output is (8, 8, 128).
Third stage: after three 3 × 3 convolution operation modules and the attention mechanism, the number of output feature channels is 256 and the output is (8, 8, 256); after a 2 × 2 max pooling layer the output is (4, 4, 256).
Fourth stage: after three 3 × 3 convolution operation modules, the number of output feature channels is 512 and the output is (4, 4, 512); after a 2 × 2 max pooling layer the output is (2, 2, 512).
The fifth stage has the same processing flow as the fourth: after three 3 × 3 convolution operation modules, the number of output feature channels is 1024 and the output is (2, 2, 1024); after a 2 × 2 max pooling layer the output is (1, 1, 1024).
Sixth stage: the fully connected layer outputs the features (1, 1, 10), which are classified by a Softmax classifier.
The CIFAR-10 data set used in this embodiment contains 60K color pictures in 10 classes, each with a resolution of 32 × 32; 50K pictures are used as the training set and 10K pictures as the test set.
in the embodiment, during training, a model of TOP-1 precision performance is selected from a test set. In training, parameters are adjusted by a random gradient descent method, wherein the momentum is 0.8, and the weight attenuation is 5 e-4. The batch size is 64. The initial learning rate is 0.1, and every 50 traversal times (epochs) is reduced to 0.1 of the original learning rate for a total of 150 traversal times (epochs). All experiments used a GeForce RTX 2080Ti GPU.
In this embodiment, the preprocessed pictures are input to the attention-based VGG convolutional neural network model for training. After each update of the model parameters, the classification accuracy is tested on the test set, until the model has been trained for 150 epochs; training is then complete and the model parameters are kept fixed. The test-set accuracy serves as the basis for evaluating the classification quality of the neural network (a measurement sketch follows table 1). Table 1 compares the classification accuracy of the VGG convolutional neural network with the embedded attention mechanism against the original VGG convolutional neural network:
TABLE 1

Classification network | Parameters (M) | Accuracy (%)
VGG neural network | 31.256 | 92.16
Attention-based VGG neural network | 31.257 | 92.53
As can be seen from table 1, the attention-based VGG convolutional neural network model of this embodiment improves classification accuracy while introducing almost no additional parameters.
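For completeness, the TOP-1 test accuracy used as the evaluation basis above could be measured with a sketch like this; the test loader is assumed analogous to the training one, and all names are assumptions:

    import torch

    @torch.no_grad()
    def top1_accuracy(net, loader, device="cuda"):
        net.eval()
        correct = total = 0
        for images, labels in loader:
            preds = net(images.to(device)).argmax(dim=1)   # TOP-1 class prediction
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
        return 100.0 * correct / total                     # percent, as in table 1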
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims (4)

1. An image classification method based on an improved VGG convolutional neural network model is characterized by comprising the following steps:
Step 1: establish an attention mechanism module;
Step 2: add the attention mechanism to the VGG convolutional neural network model to obtain an attention-based VGG convolutional neural network model;
Step 3: preset a training set and a test set and preprocess their images; train the attention-based VGG convolutional neural network model with the preprocessed training set and test its classification results with the preprocessed test set, thereby adjusting the parameters of the model; when the number of training iterations reaches the preset maximum or the model converges, stop training to obtain the finally trained attention-based VGG convolutional neural network model;
Step 4: classify images with the attention-based VGG convolutional neural network model trained in step 3.
2. The image classification method based on the improved VGG convolutional neural network model as claimed in claim 1, characterized in that the attention mechanism module in step 1 comprises an average pooling layer, a first dimension permutation module, a first self-attention module, a second dimension permutation module, a second self-attention module, a normalization layer and a calibration module;
the average pooling layer spatially average-pools the feature $U \in \mathbb{R}^{C \times H \times W}$ input to the attention mechanism to obtain $X$, where $C$ is the number of channels of the input feature, $H$ denotes its spatial height and $W$ denotes its spatial width;
the first dimension permutation module permutes the dimensions of $X$, specifically: $X$ is divided spatially and equally into $Q$ feature groups with $P$ elements per group, where $P$ and $Q$ are both hyperparameters and $P \times Q = H \times W$; the $t$-th elements of the $Q$ feature groups form the $t$-th column vector $X^L_t \in \mathbb{R}^{Q \times 1}$;
the number of first self-attention modules is $P$; $X^L_t$ serves as the input of the $t$-th first self-attention module, which produces the output
$Z^L_t = \mathrm{Softmax}\left(X^L_t \otimes (X^L_t)^{\mathrm{T}}\right) \otimes X^L_t, \qquad Z^L = [Z^L_1, \ldots, Z^L_t, \ldots, Z^L_P],$
where $\otimes$ denotes the inner product, $\mathrm{Softmax}$ is the probability-distribution function and $(\cdot)^{\mathrm{T}}$ is the transposition symbol;
the second dimension permutation module permutes the dimensions of $Z^L$, specifically: $Z^L$ is divided spatially and equally into $P$ feature groups with $Q$ elements per group; the $k$-th elements of the $P$ feature groups form the $k$-th column vector $Z^S_k \in \mathbb{R}^{P \times 1}$;
the number of second self-attention modules is $Q$; $Z^S_k$ is input to the $k$-th second self-attention module, which produces the output
$Y^S_k = \mathrm{Softmax}\left(Z^S_k \otimes (Z^S_k)^{\mathrm{T}}\right) \otimes Z^S_k, \qquad Y^S = [Y^S_1, \ldots, Y^S_k, \ldots, Y^S_Q];$
the normalization layer normalizes $Y^S$ with a sigmoid function to obtain $\tilde{Y} = \sigma(Y^S)$;
the calibration module calibrates the input feature $U$ according to the following formula to finally obtain the output of the attention mechanism:
$\hat{U} = \tilde{Y} \odot U,$
where $\odot$ denotes multiplying $\tilde{Y}$ with the corresponding spatial positions in $U$, propagated along the channel direction.
3. The image classification method based on the improved VGG convolutional neural network model as claimed in claim 1, characterized in that the attention-based VGG convolutional neural network model in step 2 comprises a first feature extraction module, a second feature extraction module, a third feature extraction module, a fourth feature extraction module, a fifth feature extraction module, a fully connected layer and a Softmax classifier connected in sequence; the first and second feature extraction modules have the same structure, each comprising a first convolution operation module, a second convolution operation module, a first attention mechanism and a first max pooling layer connected in sequence; the third feature extraction module comprises a third convolution operation module, a fourth convolution operation module, a fifth convolution operation module, a second attention mechanism and a second max pooling layer connected in sequence; the fourth and fifth feature extraction modules have the same structure, each comprising a sixth convolution operation module, a seventh convolution operation module, an eighth convolution operation module and a third max pooling layer connected in sequence; the first, second, third, fourth, fifth and sixth convolution operation modules have the same structure, each comprising a convolutional layer, a ReLU activation function and a batch normalization layer connected in sequence; the first attention mechanism and the second attention mechanism are both the attention mechanism of step 1.
4. The image classification method based on the improved VGG convolutional neural network model as claimed in claim 1, wherein the preprocessing in step 3 is to sequentially perform horizontal flipping, mirroring, cropping and standard normalization on each image in the training set and the test set.
Application CN202110734218.XA — priority date 2021-06-30, filing date 2021-06-30 — Image classification method based on improved VGG convolutional neural network model — status: Pending — published as CN113469198A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110734218.XA | 2021-06-30 | 2021-06-30 | Image classification method based on improved VGG convolutional neural network model

Publications (1)

Publication Number | Publication Date
CN113469198A | 2021-10-01

Family ID: 77874250

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110734218.XA (Pending) | Image classification method based on improved VGG convolutional neural network model | 2021-06-30 | 2021-06-30

Country Status (1)

Country | Link
CN | CN113469198A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382676A (en) * 2020-02-25 2020-07-07 南京大学 Sand image classification method based on attention mechanism
CN111461190A (en) * 2020-03-24 2020-07-28 华南理工大学 Deep convolutional neural network-based non-equilibrium ship classification method
CN111695494A (en) * 2020-06-10 2020-09-22 上海理工大学 Three-dimensional point cloud data classification method based on multi-view convolution pooling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
程起上 (Cheng Qishang), "Research on the design theory and methods of convolutional neural network substructures for image classification", China Doctoral Dissertations Full-text Database, No. 04, 15 April 2021 (2021-04-15), pages 51-53 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114118142A (en) * 2021-11-05 2022-03-01 西安晟昕科技发展有限公司 Method for identifying radar intra-pulse modulation type
CN114092819A (en) * 2022-01-19 2022-02-25 成都四方伟业软件股份有限公司 Image classification method and device
CN116058852A (en) * 2023-03-09 2023-05-05 同心智医科技(北京)有限公司 Classification system, method, electronic device and storage medium for MI-EEG signals
CN116058852B (en) * 2023-03-09 2023-12-22 同心智医科技(北京)有限公司 Classification system, method, electronic device and storage medium for MI-EEG signals

Similar Documents

Publication Publication Date Title
CN113469198A (en) Image classification method based on improved VGG convolutional neural network model
CN112241766B (en) Liver CT image multi-lesion classification method based on sample generation and transfer learning
CN108537192B (en) Remote sensing image earth surface coverage classification method based on full convolution network
CN109191382B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN110659727A (en) Sketch-based image generation method
CN106951911A (en) A kind of quick multi-tag picture retrieval system and implementation method
CN112966667B (en) Method for identifying one-dimensional distance image noise reduction convolution neural network of sea surface target
Singh et al. Steganalysis of digital images using deep fractal network
CN113642445B (en) Hyperspectral image classification method based on full convolution neural network
KR102567128B1 (en) Enhanced adversarial attention networks system and image generation method using the same
CN116188836A (en) Remote sensing image classification method and device based on space and channel feature extraction
CN116363423A (en) Knowledge distillation method, device and storage medium for small sample learning
CN114373104A (en) Three-dimensional point cloud semantic segmentation method and system based on dynamic aggregation
CN115512368A (en) Cross-modal semantic image generation model and method
CN112183602A (en) Multi-layer feature fusion fine-grained image classification method with parallel rolling blocks
CN115272766A (en) Hyperspectral image classification method based on hybrid Fourier operator Transformer network
CN115222754A (en) Mirror image segmentation method based on knowledge distillation and antagonistic learning
CN113344146B (en) Image classification method and system based on double attention mechanism and electronic equipment
CN110851627A (en) Method for describing sun black subgroup in full-sun image
CN114723784A (en) Pedestrian motion trajectory prediction method based on domain adaptation technology
CN116977747B (en) Small sample hyperspectral classification method based on multipath multi-scale feature twin network
CN117372777A (en) Compact shelf channel foreign matter detection method based on DER incremental learning
CN112232129A (en) Electromagnetic information leakage signal simulation system and method based on generation countermeasure network
CN114998725B (en) Hyperspectral image classification method based on self-adaptive spatial spectrum attention kernel generation network
CN115984949A (en) Low-quality face image recognition method and device with attention mechanism

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination