CN113743484A - Image classification method and system based on space and channel attention mechanism - Google Patents

Image classification method and system based on space and channel attention mechanism

Info

Publication number
CN113743484A
CN113743484A (application CN202110961232.3)
Authority
CN
China
Prior art keywords
attention mechanism
image
image classification
data set
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110961232.3A
Other languages
Chinese (zh)
Inventor
Yang Jun (杨军)
Liu Mengxin (刘孟鑫)
Ma Liya (马利亚)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningxia University
Original Assignee
Ningxia University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningxia University filed Critical Ningxia University
Priority to CN202110961232.3A priority Critical patent/CN113743484A/en
Publication of CN113743484A publication Critical patent/CN113743484A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks


Abstract

The invention discloses an image classification method and system based on a spatial and channel attention mechanism, comprising the following steps: S1, acquiring a sample data set; S2, extracting a number of images from the sample data set and generating fake samples with a deep convolutional generative adversarial network (DCGAN) to obtain an extended data set; S3, processing the extended data set, including dimensionality reduction, denoising and data enhancement; S4, dividing the extended data set proportionally into a training set and a test set; S5, inputting the images in the training set into the constructed image classification network model for parameter-tuning training, during which image features are extracted, and finally saving the trained image classification network model; and S6, loading the images in the test set into the trained image classification network model, whose output is the classification result. The invention achieves accurate classification of images of the target type.

Description

Image classification method and system based on space and channel attention mechanism
Technical Field
The invention relates to the technical field of image classification, in particular to an image classification method and system based on a space and channel attention mechanism.
Background
Image classification is an image processing task in which features are extracted from an input image and the image is assigned to a category. The main stages are image preprocessing, feature extraction and classifier design. Feature extraction is the most critical part of the image classification task: traditional image classification is built on hand-crafted feature coding, while modern image classification is built on deep learning.
In recent years, image classification has received widespread attention and plays an important role in many fields; it is now widely applied in agriculture, environmental monitoring, medicine and elsewhere. Image classification technology has made great progress, and deep convolutional neural networks (CNNs) in particular have been highly successful: through a series of methods they reduce the dimensionality of image recognition problems with huge data volumes, and they are currently the best approach to image feature extraction. The CNN was first proposed by Yann LeCun as LeNet-5 and applied to handwriting recognition, followed by AlexNet, VGG and GoogLeNet, up to the now widely used ResNet and DenseNet. Convolutional neural networks play an important role in computer vision.
Whether modern or traditional methods are used, feature extraction is the difficult point of the whole classification pipeline; once good features are found, classification becomes easy. Modern image classification extracts higher-dimensional, more abstract features than traditional methods, and these features are closely tied to the classifier. However, when faced with huge volumes of image data, image interference and other degraded data, existing feature extraction cannot meet practical requirements, and the needed classification accuracy cannot be reached. A method of extracting higher-dimensional image features is therefore proposed.
Disclosure of Invention
The first purpose of the present invention is to overcome the disadvantages and shortcomings of the prior art, and to provide an image classification method based on a spatial and channel attention mechanism, which can realize accurate classification of target type images.
It is a second object of the present invention to provide an image classification system based on spatial and channel attention mechanisms.
The first purpose of the invention is realized by the following technical scheme: the image classification method based on the spatial and channel attention mechanism comprises the following steps:
S1, obtaining image samples of the target type to be distinguished and classified, and constructing a corresponding sample data set;
S2, extracting a certain number of images from the acquired sample data set and generating fake samples with a deep convolutional generative adversarial network (DCGAN), thereby expanding the acquired sample data set into an extended data set;
S3, processing the extended data set, including dimensionality reduction, denoising and data enhancement;
S4, dividing the extended data set processed in step S3 proportionally, with most of the data forming the training set and a small part forming the test set;
S5, inputting the images in the training set into the constructed image classification network model for parameter-tuning training, during which image features are extracted, and finally saving the trained image classification network model; the constructed image classification network model consists of a DenseNet, a spatial attention mechanism, a channel attention mechanism SE-Net and a classification submodule, wherein the DenseNet is the backbone network of the model and is used for extracting global image features and reusing them; the spatial attention mechanism is embedded into the DenseNet, applies a corresponding spatial transformation to the spatial-domain information of the picture, attends only to the positions of interest, and extracts key information; the channel attention mechanism SE-Net models the importance of each feature channel and suppresses or enhances different channels according to the importance of their features; the classification submodule uses Softmax as its core to classify the various images accurately; the image classification network model obtains the regions of interest through the spatial attention mechanism, obtains feature weights with the channel attention mechanism SE-Net so as to emphasize useful information and suppress useless information, and feeds the result to the classification submodule;
and S6, loading the images in the test set into the trained image classification network model for discrimination; the result output by the model is the classification result.
Further, in step S2, a certain number of images are extracted from the acquired sample data set to generate fake samples using an unsupervised deep convolutional generative adversarial network (DCGAN); the generator G and the discriminator D of the DCGAN play a game against each other until Nash equilibrium is finally reached, at which point the samples produced by the generator G can deceive the discriminator D and are judged to be real; the specific process of generating the fake samples is as follows:
S21, the generator G generates synthetic data from given noise, which follows a uniform or normal distribution, and converts this high-level representation into a pixel image through upsampling and deconvolution operations; no fully connected layers or pooling layers are used in the whole process;
S22, the discriminator D judges whether the output of the generator G is real data; the generator attempts to produce data ever closer to the real data while, correspondingly, the discriminator attempts to distinguish real data from generated data ever more accurately; generator G and discriminator D thus improve through confrontation and, as they keep competing, the data produced by the generator G approaches the real data more and more closely, so that the desired images can be generated; the optimization objective function is as follows:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
where V(D, G) is the value function of the minimax game between generator G and discriminator D; x denotes a real picture; x~p_data(x) denotes that real pictures follow the data distribution; z~p_z(z) denotes that the noise follows a uniform or normal distribution; D(x) is the probability with which the discriminator D judges a real picture to be real; G(z) is the picture generated by the generator G; and D(G(z)) is the probability with which the discriminator D judges the picture generated by the generator G to be real.
Further, in step S3, Principal Component Analysis (PCA) is used for dimensionality reduction and denoising of the extended data set, so as to reduce data redundancy; it comprises the following steps:
a. converting each image into a matrix;
b. removing the mean from all features;
c. computing the covariance matrix;
d. computing the eigenvalues of the covariance matrix and the corresponding eigenvectors;
e. sorting the eigenvalues;
f. keeping the eigenvectors corresponding to the N largest eigenvalues;
g. projecting the original features into the new space spanned by the N retained eigenvectors, thereby achieving dimensionality reduction.
Further, in step S3, the images in the extended data set are rotated, flipped, cropped and translated, and their brightness and contrast are adjusted, to enhance the image data.
Further, in step S5, the convolution layer in the DenseNet consists of BN, ReLU and 1×1 Conv; the 1×1 Conv not only reduces the dimensionality of the image but also reduces the number of output feature maps. The Dense Block module, an important component of the DenseNet, consists of BN, ReLU, 1×1 Conv and 3×3 Conv and improves inter-layer information flow; a Bottleneck design is adopted inside the Dense Block to reduce the amount of computation, adding a 1×1 Conv to the original structure. The Transition layer in the DenseNet lies between two Dense Blocks and is used to change the size of the feature maps; it consists of BN, ReLU, 1×1 Conv and 2×2 average pooling.
Further, in step S5, the spatial attention mechanism first reduces the dimensionality along the channel axis, obtaining a max-pooling result and a mean-pooling result, splices them into a feature map, and then learns from it with a convolution layer.
Further, in step S5, the channel attention mechanism SE-Net consists of five parts, namely global average pooling, a fully connected layer, ReLU, a second fully connected layer and Sigmoid. Squeeze is realized by global average pooling, which generates channel statistics by turning each two-dimensional feature channel into a single real number; this real number has, to some extent, a global receptive field, and the number of output values matches the number of input feature channels. Excitation multiplies the squeezed result by the first fully connected layer, passes it through a ReLU layer without changing its dimension, multiplies it by the second fully connected layer, and then applies a Sigmoid function; excitation helps capture channel-wise dependencies while greatly reducing parameters and computation.
Further, in step S5, the classification submodule consists of global average pooling, a fully connected layer, BN and a Softmax function, and is used to reduce the number of parameters and output the image category.
The second purpose of the invention is realized by the following technical scheme: an image classification system based on spatial and channel attention mechanisms, comprising:
the image acquisition module is used for acquiring image samples of the target types to be distinguished and classified and constructing corresponding sample data sets;
the data set processing module is used for extracting a certain number of images from the acquired sample data set to generate fake samples, thereby expanding the sample data set into an extended data set; then performing dimensionality reduction, denoising and data enhancement on the extended data set; and finally dividing the processed extended data set proportionally, with most of the data forming the training set and a small part forming the test set;
the model training module is used for first constructing an image classification network model and then performing parameter-tuning training of the constructed model with the training set; the constructed image classification network model consists of a DenseNet, a spatial attention mechanism, a channel attention mechanism SE-Net and a classification submodule, wherein the DenseNet is the backbone network of the model and is used for extracting global image features and reusing them; the spatial attention mechanism is embedded into the DenseNet, applies a corresponding spatial transformation to the spatial-domain information of the picture, attends only to the positions of interest, and extracts key information; the channel attention mechanism SE-Net models the importance of each feature channel and suppresses or enhances different channels according to the importance of their features; the classification submodule uses Softmax as its core to classify the various images accurately; the model obtains the regions of interest through the spatial attention mechanism, obtains feature weights with the channel attention mechanism SE-Net so as to emphasize useful information and suppress useless information, and feeds the result to the classification submodule;
and the image classification module is used for inputting the test set into the trained image classification network model to obtain the final classification result of the image.
Compared with the prior art, the invention has the following advantages and beneficial effects:
in order to improve the accuracy of image classification, the invention particularly designs an image classification network model combining a space and channel attention mechanism, adds the space attention mechanism, performs corresponding space transformation on the space domain information of the picture, only concerns the interested position, extracts the key information, simultaneously adds the channel attention mechanism Se-Net, can inhibit the uninteresting part aiming at the importance of the characteristic by modeling the importance degree of each characteristic channel, and enhances the interested part; in addition, a data set is expanded by utilizing a DCGAN generation technology aiming at image samples which are difficult to obtain, so that an image classification network model is trained better, the stability and robustness of the model performance are improved finally, the generalization capability of the model is improved, the overfitting phenomenon is relieved, the high accuracy of image classification is realized, and the method is applicable to various image classifications, has practical application value and is worthy of popularization.
Drawings
Fig. 1 is a sample diagram of the classified images in this embodiment.
Fig. 2 is a structural diagram of an image classification network model.
Fig. 3 is an architecture diagram of the system of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
The embodiment discloses a brain tumor image classification method based on a space and channel attention mechanism, which specifically comprises the following steps:
1) A total of 3540 samples of the target type are obtained and form the corresponding sample data set, as shown in fig. 1; the samples include gliomas, meningiomas, pituitary tumors and angioreticular tumors.
2) 120 images are extracted from the acquired sample data set, and fake samples are generated with an unsupervised deep convolutional generative adversarial network (DCGAN), thereby expanding the acquired sample data set into an extended data set; the generator G and the discriminator D of the DCGAN play a game against each other until Nash equilibrium is finally reached, at which point the samples produced by the generator G deceive the discriminator D and are judged to be real. The specific process of generating a fake sample is as follows:
21) the generator G generates synthetic data from given noise (generally following a uniform or normal distribution) and converts this high-level representation into a pixel image through a series of upsampling and deconvolution operations; no fully connected layers or pooling layers are used in the whole process;
22) the discriminator D judges whether the output of the generator G is real data; the generator attempts to produce data ever closer to the real data while, correspondingly, the discriminator attempts to distinguish real data from generated data ever more accurately; generator G and discriminator D thus improve through confrontation and, as they keep competing, the data produced by the generator G approaches the real data more and more closely, so that the desired images can be generated; the optimization objective function is as follows:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
where V(D, G) is the value function of the minimax game between generator G and discriminator D; x denotes a real picture; x~p_data(x) denotes that real pictures follow the data distribution; z~p_z(z) denotes that the noise follows a uniform or normal distribution; D(x) is the probability with which the discriminator D judges a real picture to be real; G(z) is the picture generated by the generator G; and D(G(z)) is the probability with which the discriminator D judges the picture generated by the generator G to be real.
Random noise is fed into the generator G, passes through a projection-and-reshape step (equivalent to a fully connected layer), and is then turned into a fake sample by a series of upsampling and deconvolution operations; each upsampling halves the number of channels of the feature map and doubles its spatial size.
The input of the discriminator D is an image, which is processed by downsampling and fully connected layers and then fed into a Sigmoid function that outputs the probability of the image being real: 1 for real and 0 for fake.
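As an illustration of the objective function above, the value V(D, G) can be estimated from discriminator outputs on a batch of real and generated images. The following numpy sketch is ours, not part of the patent:

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities on real images, in (0, 1).
    d_fake: discriminator probabilities on generated images, in (0, 1).
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# At Nash equilibrium the discriminator outputs 0.5 everywhere,
# giving the well-known optimum V = -2 log 2.
v_eq = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
```

The discriminator maximizes this quantity while the generator minimizes it; training alternates gradient steps between the two networks.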
3) And processing the extended data set, including dimension reduction, denoising and data enhancement.
Principal Component Analysis (PCA) is adopted for dimensionality reduction and denoising, reducing data redundancy; it comprises the following steps:
a. converting each image into a matrix;
b. removing the mean from all features;
c. computing the covariance matrix;
d. computing the eigenvalues of the covariance matrix and the corresponding eigenvectors;
e. sorting the eigenvalues;
f. keeping the eigenvectors corresponding to the N largest eigenvalues;
g. projecting the original features into the new space spanned by the N retained eigenvectors, thereby achieving dimensionality reduction.
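Steps a to g can be sketched with numpy as follows; flattening each image to a row vector and the toy data are our illustrative assumptions, not part of the patent:

```python
import numpy as np

def pca_reduce(images, n_components):
    """PCA dimensionality reduction following steps a-g above."""
    X = images.reshape(len(images), -1).astype(float)  # a. each image -> row of a matrix
    X = X - X.mean(axis=0)                             # b. remove the mean of each feature
    cov = np.cov(X, rowvar=False)                      # c. covariance matrix
    vals, vecs = np.linalg.eigh(cov)                   # d. eigenvalues and eigenvectors
    order = np.argsort(vals)[::-1]                     # e. sort eigenvalues (descending)
    W = vecs[:, order[:n_components]]                  # f. keep the top-N eigenvectors
    return X @ W                                       # g. project into the new space

rng = np.random.default_rng(0)
imgs = rng.normal(size=(50, 8, 8))                     # 50 toy 8x8 "images"
reduced = pca_reduce(imgs, n_components=10)
```

The projected components are mutually uncorrelated, which is what removes the redundancy mentioned above.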
Data enhancement is realized by rotating, flipping, cropping and translating the images and by adjusting their brightness and contrast.
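A minimal array-level sketch of these augmentations; a production pipeline would typically use a library such as torchvision or Albumentations, and the margin sizes here are illustrative:

```python
import numpy as np

def augment(img):
    """Return simple augmented views of a 2-D image array."""
    h, w = img.shape
    m = h // 8                                   # illustrative crop/shift margin
    return [
        np.rot90(img),                           # rotation by 90 degrees
        np.fliplr(img),                          # horizontal flip
        img[m:h - m, m:w - m],                   # center crop
        np.roll(img, shift=m, axis=1),           # horizontal translation
        np.clip(1.2 * img + 0.05, 0.0, 1.0),     # brightness/contrast adjustment
    ]

img = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
views = augment(img)
```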
4) The processed extended data set is divided in a 7:3 ratio into a training set and a test set.
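The 7:3 split can be sketched as follows; the shuffling and the fixed seed are illustrative choices not specified in the embodiment:

```python
import numpy as np

def split_7_3(samples, seed=0):
    """Shuffle a sample list and split it roughly 70% / 30%."""
    idx = np.random.default_rng(seed).permutation(len(samples))
    cut = int(round(0.7 * len(samples)))
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]

train, test = split_7_3(list(range(3540)))   # 3540 samples as in step 1)
```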
5) The images in the training set are input into the constructed image classification network model for parameter-tuning training, during which image features are extracted, and the trained image classification network model is finally saved; referring to fig. 2, the constructed image classification network model consists of a DenseNet, a spatial attention mechanism, a channel attention mechanism SE-Net and a classification submodule, wherein the DenseNet is the backbone network of the model and is used for extracting global image features and reusing them; the spatial attention mechanism is embedded into the DenseNet, applies a corresponding spatial transformation to the spatial-domain information of the picture, attends only to the positions of interest, and extracts key information; the channel attention mechanism SE-Net models the importance of each feature channel and suppresses or enhances different channels according to the importance of their features; the classification submodule uses Softmax as its core to classify the various images accurately; the model obtains the regions of interest through the spatial attention mechanism, obtains feature weights with the channel attention mechanism SE-Net so as to emphasize useful information and suppress useless information, and feeds the result to the classification submodule.
The details of the image classification network model are as follows:
First, the image is convolved by the convolution layer in the DenseNet; using a 1×1 convolution kernel reduces both the dimensionality of the image and the number of output feature maps.
The convolved result is fed into a Dense Block; because the input to the later layers is very large, a Bottleneck is adopted inside the Dense Block to reduce the amount of computation, adding a 1×1 Conv to the original structure. The operations BN, ReLU, 1×1 Conv and 3×3 Conv are performed in the Dense Block to improve inter-layer information flow.
The output of the Dense Block is fed into the Transition layer located between two Dense Blocks, which consists of BN, ReLU, 1×1 Conv and 2×2 average pooling and is used to change the size of the feature maps.
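The channel bookkeeping of Dense Blocks and Transition layers can be sketched as follows; the growth rate of 32, the compression factor of 0.5 and the block sizes are illustrative DenseNet-121-style assumptions, not values given in the patent:

```python
def dense_block_channels(c_in, n_layers, growth_rate=32):
    """Each BN-ReLU-1x1Conv (Bottleneck) + BN-ReLU-3x3Conv layer adds
    `growth_rate` feature maps, concatenated onto everything before it."""
    return c_in + n_layers * growth_rate

def transition_channels(c_in, compression=0.5):
    """The Transition layer's 1x1 Conv reduces the channels (its 2x2 average
    pooling halves the spatial size); compression 0.5 is the DenseNet-BC
    convention, assumed here."""
    return int(c_in * compression)

c = 64                           # channels after the initial convolution (assumed)
for n_layers in (6, 12, 24):     # illustrative block sizes
    c = transition_channels(dense_block_channels(c, n_layers))
```

This concatenate-then-compress pattern is what keeps the channel count bounded while still reusing all earlier features.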
The spatial attention mechanism then reduces the dimensionality along the channel axis, obtains a max-pooling result and a mean-pooling result, splices them into a feature map, and learns from it with a convolution layer.
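A dependency-free numpy sketch of this spatial attention step; the learned convolution is replaced here by a fixed 1×1 kernel for illustration (CBAM-style implementations use a learned 7×7 convolution):

```python
import numpy as np

def spatial_attention(feat):
    """feat: (C, H, W). Channel-wise max pooling and mean pooling are
    spliced into a 2-channel map; a fixed 1x1 convolution (weights 0.5/0.5,
    a stand-in for the learned layer) plus a sigmoid yields the (H, W)
    spatial attention map."""
    mx = feat.max(axis=0)                               # max pooling over channels
    mn = feat.mean(axis=0)                              # mean pooling over channels
    stacked = np.stack([mx, mn])                        # spliced (2, H, W) map
    logits = np.tensordot([0.5, 0.5], stacked, axes=1)  # 1x1 "convolution"
    return 1.0 / (1.0 + np.exp(-logits))                # sigmoid -> weights in (0, 1)

feat = np.zeros((8, 4, 4))
feat[:, 1, 2] = 5.0                                     # one salient spatial position
attn = spatial_attention(feat)
```

Multiplying the feature map by `attn` would emphasize the salient position while damping the rest.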
The channel attention mechanism consists of five parts, namely global average pooling, a fully connected layer, ReLU, a second fully connected layer and Sigmoid, and comprises the following two steps:
Squeeze compresses the features. The compression is realized by global average pooling, which generates channel statistics by turning each two-dimensional feature channel into a single real number; this real number has, to some extent, a global receptive field, and the number of output values matches the number of input feature channels. The expression is as follows:
Z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
where Z_c is the result of compressing u_c; F_sq denotes the squeeze operation; c indexes the channels; u_c is the c-th feature map of the input U of spatial dimensions H×W×C (height × width × channels); and u_c(i, j) is the element at position (i, j) of the c-th H×W two-dimensional matrix in the three-dimensional tensor U, with i indexing the height H and j indexing the width W.
Excitation multiplies the squeezed result by the first fully connected layer, passes it through a ReLU layer without changing its dimension, multiplies it by the second fully connected layer, and then applies a Sigmoid function. Excitation thus captures channel-wise dependencies while greatly reducing parameters and computation. The expression is as follows:
S = F_ex(Z, W) = σ(W_2 δ(W_1 Z))
where S describes the weights of the C feature maps in U; the parameters W generate a weight for each feature channel and explicitly model the correlation between feature channels; F_ex(Z, W) denotes the excitation of the squeezed result Z with the parameters W; W_1, the weight of the first fully connected layer, has dimensions (C/r) × C; W_2, the weight of the second fully connected layer, has dimensions C × (C/r); r is a scaling (reduction) ratio and C is the number of channels; W_1 Z is the output of the first fully connected layer and δ(·) denotes ReLU; the multiplication by W_2 is the second fully connected layer and σ(·) denotes the Sigmoid function.
Finally, the input channels are multiplied by their respective weights.
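The squeeze, excitation and reweighting steps above can be sketched end to end with numpy; the random weights stand in for the learned fully connected layers and are not from the patent:

```python
import numpy as np

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map.

    Squeeze: global average pooling gives z in R^C.
    Excitation: s = sigmoid(w2 @ relu(w1 @ z)), with w1 of shape (C/r, C)
    and w2 of shape (C, C/r), r being the reduction ratio.
    Scale: every channel is multiplied by its weight s_c.
    """
    z = feat.mean(axis=(1, 2))                                   # squeeze
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))    # excitation
    return feat * s[:, None, None]                               # channel reweighting

C, r = 8, 4
rng = np.random.default_rng(1)
w1 = rng.normal(size=(C // r, C))        # stand-in for the first FC layer
w2 = rng.normal(size=(C, C // r))        # stand-in for the second FC layer
feat = rng.normal(size=(C, 5, 5))
out = se_block(feat, w1, w2)
```

Because the Sigmoid output lies in (0, 1), every channel is attenuated by its weight, suppressing unimportant channels and (relatively) enhancing important ones.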
The classification submodule mainly consists of global average pooling, a fully connected layer, BN and a Softmax function; the Softmax function outputs the class of each brain tumor image, so the submodule both reduces the number of parameters and accurately distinguishes the brain tumor classes.
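A numpy sketch of this classification submodule (BatchNorm omitted for brevity; the random weights and the 16-channel input are illustrative assumptions):

```python
import numpy as np

def classify(feat, w, b, labels):
    """Global average pooling over a (C, H, W) feature map, a fully
    connected layer, then a numerically stable Softmax."""
    pooled = feat.mean(axis=(1, 2))           # global average pooling: (C,)
    logits = w @ pooled + b                   # fully connected layer
    e = np.exp(logits - logits.max())         # stable softmax
    probs = e / e.sum()
    return labels[int(np.argmax(probs))], probs

labels = ["glioma", "meningioma", "pituitary tumor", "angioreticular tumor"]
rng = np.random.default_rng(2)
w, b = rng.normal(size=(4, 16)), np.zeros(4)
pred, probs = classify(rng.normal(size=(16, 3, 3)), w, b, labels)
```

Replacing the final flatten-plus-large-FC head with global average pooling is what reduces the parameter count mentioned above.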
6) The images in the test set are loaded into the trained image classification network model for discrimination; the result output by the model is the classification result.
Referring to fig. 3, the present embodiment also provides a brain tumor image classification system based on spatial and channel attention mechanism, including:
an image acquisition module: the method is used for acquiring MRI brain tumor images of target types, including 4 types of images of glioma, meningioma, pituitary tumor and angioreticular cell tumor, and corresponding sample data sets are formed.
A data set processing module: extracting a certain number of images from the acquired sample data set to generate false samples, so as to expand the sample data set to obtain an expanded data set; then, performing dimension reduction, denoising and data enhancement processing on the expansion data set; and finally, dividing the processed expansion data set according to a ratio of 7:3 to obtain a training set and a test set.
A model training module: firstly, an image classification network model is constructed, and then a training set is used for parameter adjustment training of the constructed image classification network model; the constructed image classification network model consists of a DenseNet, a space attention mechanism, a channel attention mechanism SE-Net and classification submodules, wherein the DenseNet is a main network of the model and is used for extracting global features of images and multiplexing the features; embedding a space attention mechanism into the DenseNet, performing corresponding space transformation on the space domain information of the picture, only concerning interested positions, and extracting key information; the channel attention mechanism SE-Net inhibits or enhances different channels aiming at the importance of the characteristics by modeling the importance degree of each characteristic channel; the classification submodule uses Softmax as a core to accurately classify various images; the image classification network model acquires an interested part through a space attention mechanism, acquires the weight of the characteristics by utilizing a channel attention mechanism SE-Net, emphasizes useful information to inhibit useless information and inputs the useful information to a classification submodule;
an image classification module: and inputting the test set into the trained image classification network model to obtain the final classification result of the MRI brain tumor image.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited thereto; any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent and is intended to be included within the scope of the present invention.

Claims (9)

1. The image classification method based on the spatial and channel attention mechanism is characterized by comprising the following steps of:
s1, obtaining an image sample of the target type to be distinguished and classified, and constructing a corresponding sample data set;
s2, extracting a certain number of images from the acquired sample data set, and generating a countermeasure network DCGAN by utilizing depth convolution to generate false samples, so as to expand the acquired sample data set to obtain an expanded data set;
s3, processing the expansion data set, including dimension reduction, denoising and data enhancement;
s4, dividing the expansion data set processed in the step S3 according to the proportion, dividing most of data into training sets and dividing a small part of data into test sets;
S5, inputting the images in the training set into the constructed image classification network model for parameter-tuning training, extracting the features of the images, and finally saving the trained image classification network model; the constructed image classification network model consists of a DenseNet, a spatial attention mechanism, a channel attention mechanism SE-Net, and a classification submodule, wherein the DenseNet is the backbone network of the model and is used for extracting global image features and reusing them; the spatial attention mechanism is embedded into the DenseNet and performs corresponding spatial transformations on the spatial-domain information of the image, attending only to positions of interest and extracting key information; the channel attention mechanism SE-Net models the importance of each feature channel and suppresses or enhances different channels according to the importance of their features; the classification submodule uses Softmax as its core to accurately classify the images; the image classification network model obtains the regions of interest through the spatial attention mechanism, obtains feature weights through the channel attention mechanism SE-Net, emphasizes useful information while suppressing useless information, and feeds the result into the classification submodule;
S6, loading the images in the test set into the trained image classification network model for judgment; the output of the model is the classification result.
2. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S2 a certain number of images are extracted from the acquired sample data set and false samples are generated with an unsupervised deep convolutional generative adversarial network (DCGAN); through the mutual game between the generator G and the discriminator D of the DCGAN, a Nash equilibrium is finally reached, so that samples produced by the generator G can deceive the discriminator D and are ultimately judged as real; the specific process of generating false samples is as follows:
S21, the generator G generates synthetic data from given noise and finally converts the high-level representation into a low-resolution pixel image through up-sampling and deconvolution operations; no fully connected layer or pooling layer is used in the whole process, and the given noise follows a uniform or normal distribution;
S22, the discriminator D judges whether the output of the generator G is real data; the generator tries to produce data that is closer to the real data, and the discriminator correspondingly tries to better distinguish real data from generated data; thus the generator G and the discriminator D improve through this adversarial process and keep competing after each improvement, so the data produced by the generator G becomes ever more realistic and approaches the real data, allowing the desired images to be generated; the optimization objective function is as follows:
min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]
where V(D, G) is the value function of the generator G and the discriminator D; x denotes a real image; x ∼ p_data(x) indicates that the real images follow the data distribution; z ∼ p_z(z) indicates that the noise follows a uniform or normal distribution; D(x) is the probability that the discriminator D judges a real image to be real; G(z) is the image generated by the generator G; and D(G(z)) is the probability that the discriminator D judges the image generated by G to be real.
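As an illustration of the objective above, the value function can be estimated from sample batches. The sketch below is a minimal NumPy example with a hypothetical fixed logistic discriminator (the weights `w` and `b` are assumptions, not part of the patent); it only shows how the two expectation terms of V(D, G) combine.

```python
import numpy as np

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
# Hypothetical toy discriminator: a fixed logistic model over 1-D samples.

def discriminator(x, w=2.0, b=-1.0):
    """Toy D(x): probability that sample x is real (weights are illustrative)."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

def gan_value(real_samples, fake_samples):
    """Monte-Carlo estimate of the GAN value function from sample batches."""
    d_real = discriminator(real_samples)
    d_fake = discriminator(fake_samples)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

rng = np.random.default_rng(0)
real = rng.normal(loc=1.0, scale=0.2, size=1000)   # stand-in for real data
fake = rng.normal(loc=-1.0, scale=0.2, size=1000)  # stand-in for generator output
v = gan_value(real, fake)
```

When the generator's samples move toward the real distribution, the second term shrinks and the value function decreases, which is what drives the min-max game.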
3. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S3 Principal Component Analysis (PCA) is used to perform dimensionality reduction and denoising on the expanded data set so as to reduce data redundancy, with the following steps:
a. converting the image into a matrix;
b. removing the mean from all features;
c. computing the covariance matrix;
d. computing the eigenvalues of the covariance matrix and the corresponding eigenvectors;
e. sorting the eigenvalues;
f. keeping the eigenvectors corresponding to the N largest eigenvalues;
g. projecting the original features into the new space constructed from the N retained eigenvectors, finally achieving dimensionality reduction.
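Steps a–g above can be sketched in a few lines of NumPy; the function name `pca_reduce` and the toy data are illustrative, not part of the claim.

```python
import numpy as np

def pca_reduce(images, n_components):
    """PCA dimensionality reduction following steps a-g.
    images: (num_samples, num_features) matrix, each row a flattened image (step a)."""
    # b. remove the mean from every feature
    mean = images.mean(axis=0)
    centered = images - mean
    # c. covariance matrix of the features
    cov = np.cov(centered, rowvar=False)
    # d. eigenvalues and eigenvectors of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    # e./f. sort eigenvalues in descending order, keep the top-N eigenvectors
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    # g. project the centered data into the new N-dimensional space
    return centered @ components

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 64))      # 100 flattened 8x8 "images"
reduced = pca_reduce(data, 16)         # 64 features -> 16 components
```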
4. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S3 the images are rotated, flipped, cropped, and translated, and their brightness and contrast are adjusted, so as to enhance the image data.
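A minimal NumPy sketch of the augmentations this claim lists; the probability threshold, translation range, and brightness/contrast ranges are assumed values for illustration, and cropping is omitted to keep the output shape fixed.

```python
import numpy as np

def augment(image, rng):
    """Randomly rotate, flip, translate, and adjust brightness/contrast.
    image: 2-D array with values in [0, 1]."""
    out = np.rot90(image, k=rng.integers(0, 4))            # rotate 0/90/180/270 degrees
    if rng.random() < 0.5:
        out = np.flip(out, axis=1)                         # horizontal flip
    out = np.roll(out, shift=rng.integers(-2, 3), axis=0)  # small vertical translation
    alpha = rng.uniform(0.8, 1.2)                          # contrast factor (assumed range)
    beta = rng.uniform(-0.1, 0.1)                          # brightness offset (assumed range)
    return np.clip(alpha * out + beta, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
aug = augment(img, rng)
```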
5. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S5 the convolution layer in the DenseNet consists of BN, ReLU, and 1×1 Conv, and the 1×1 Conv can be used both to reduce the dimensionality of the image and to reduce the number of feature-map outputs; the Dense Block module is an important component of the DenseNet, consists of BN, ReLU, 1×1 Conv, and 3×3 Conv, and is used to improve the flow of information between layers, wherein a Bottleneck is adopted in the Dense Block to reduce the amount of computation by adding a 1×1 Conv to the original structure; the Transition layer in the DenseNet is located between two Dense Blocks and is used to change the size of the feature maps, consisting of BN, ReLU, 1×1 Conv, and 2×2 average pooling.
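The dense connectivity this claim describes can be illustrated with a toy NumPy model: each layer concatenates its new feature maps onto everything that came before it, so the channel count grows by the growth rate per layer. The 3×3 convolution is simplified here to a second 1×1 map, and the random weights are purely illustrative.

```python
import numpy as np

def conv1x1(x, out_channels, rng):
    """Toy 1x1 convolution: a per-pixel linear map over the channel axis."""
    w = rng.normal(size=(x.shape[0], out_channels)) * 0.1
    return np.einsum('chw,co->ohw', x, w)

def dense_block(x, num_layers, growth_rate, rng):
    """Dense connectivity: every layer sees the concatenation of all preceding
    feature maps and contributes growth_rate new channels."""
    for _ in range(num_layers):
        # Bottleneck: 1x1 conv to 4*growth_rate channels, then ReLU
        bottleneck = np.maximum(conv1x1(x, 4 * growth_rate, rng), 0)
        # Second conv (simplified from 3x3 to 1x1) producing the new features
        new_features = np.maximum(conv1x1(bottleneck, growth_rate, rng), 0)
        x = np.concatenate([x, new_features], axis=0)  # channel-wise concat
    return x

rng = np.random.default_rng(0)
feat = rng.normal(size=(16, 8, 8))  # (channels, height, width)
out = dense_block(feat, num_layers=4, growth_rate=12, rng=rng)
```

Starting from 16 channels, four layers with growth rate 12 yield 16 + 4×12 = 64 output channels, which is exactly the feature reuse the Transition layer then compresses.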
6. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S5 the spatial attention mechanism first reduces the dimensionality along the channel axis, obtaining a max-pooling result and a mean-pooling result respectively, then concatenates them into a feature map, and finally learns from it with a convolutional layer.
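A minimal NumPy sketch of this spatial attention step; a weighted sum of the two pooled maps stands in for the convolutional layer the claim describes, and the weights `w_max`, `w_mean`, `bias` are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(x, w_max=1.0, w_mean=1.0, bias=0.0):
    """Reduce over the channel axis with max- and mean-pooling, combine the two
    maps (standing in for the learned conv), and gate the input with a sigmoid mask."""
    max_map = x.max(axis=0)              # (H, W) channel-wise max pooling
    mean_map = x.mean(axis=0)            # (H, W) channel-wise mean pooling
    attn = sigmoid(w_max * max_map + w_mean * mean_map + bias)  # values in (0, 1)
    return x * attn[None, :, :]          # broadcast the spatial mask over channels

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 16, 16))      # (channels, height, width)
out = spatial_attention(feat)
```

Because the mask lies in (0, 1), every spatial position is attenuated in proportion to how little attention it receives, leaving the positions of interest dominant.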
7. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S5 the channel attention mechanism SE-Net consists of five parts: global average pooling, a fully connected layer, ReLU, a second fully connected layer, and Sigmoid; the squeeze is implemented by global average pooling, which generates channel statistics by turning each two-dimensional feature channel into a single real number that has, to some extent, a global receptive field, with the output dimension matching the number of input feature channels; the excitation multiplies the result of the squeeze by the first fully connected layer, passes it through a ReLU layer without changing the dimension, multiplies it by the second fully connected layer, and then applies a Sigmoid function; the excitation helps capture channel-wise dependencies while greatly reducing the number of parameters and the amount of computation.
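The squeeze-and-excitation pipeline of this claim (global average pooling → FC → ReLU → FC → Sigmoid → channel reweighting) can be sketched in NumPy; the weight matrices and the reduction ratio r = 4 are illustrative assumptions.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation: squeeze with global average pooling, excite with
    FC -> ReLU -> FC -> Sigmoid, then rescale each channel of the input."""
    z = x.mean(axis=(1, 2))                      # squeeze: (C,) channel descriptor
    s = np.maximum(z @ w1, 0.0)                  # first FC (dimension reduction) + ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ w2)))          # second FC (expansion) + Sigmoid
    return x * s[:, None, None]                  # reweight channels by their importance

rng = np.random.default_rng(0)
C, r = 16, 4                                     # channels and reduction ratio (assumed)
feat = rng.normal(size=(C, 8, 8))
w1 = rng.normal(size=(C, C // r)) * 0.1          # reduction FC weights
w2 = rng.normal(size=(C // r, C)) * 0.1          # expansion FC weights
out = se_block(feat, w1, w2)
```

The bottleneck of C/r units between the two FC layers is what keeps the parameter count low while still modelling cross-channel dependencies.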
8. The image classification method based on spatial and channel attention mechanisms according to claim 1, wherein in step S5 the classification submodule consists of global average pooling, a fully connected layer, BN, and a Softmax function, and is used to reduce the number of parameters and to classify the images.
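A minimal NumPy sketch of this classification head (global average pooling followed by a fully connected layer and Softmax); BN is omitted for brevity and the class count is an assumed example.

```python
import numpy as np

def classify(x, w, b):
    """Classification head: global average pooling -> fully connected -> Softmax."""
    pooled = x.mean(axis=(1, 2))                 # GAP collapses (C, H, W) to (C,)
    logits = pooled @ w + b                      # fully connected layer
    exp = np.exp(logits - logits.max())          # numerically stable Softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
num_classes = 4                                  # e.g. tumor categories (assumed)
feat = rng.normal(size=(16, 8, 8))
probs = classify(feat, rng.normal(size=(16, num_classes)), np.zeros(num_classes))
```

Pooling before the fully connected layer is what keeps the head small: the FC operates on a C-dimensional vector instead of the full C×H×W feature map.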
9. An image classification system based on spatial and channel attention mechanisms, characterized by comprising:
an image acquisition module, used to acquire image samples of the target types to be distinguished and classified and to construct a corresponding sample data set;
a data set processing module, used to extract a certain number of images from the acquired sample data set to generate false samples, thereby expanding the sample data set into an expanded data set; then to perform dimensionality reduction, denoising, and data enhancement on the expanded data set; and finally to divide the processed expanded data set by proportion, assigning most of the data to a training set and a small part to a test set;
a model training module, used to first construct an image classification network model and then perform parameter-tuning training on it with the training set; the constructed image classification network model consists of a DenseNet, a spatial attention mechanism, a channel attention mechanism SE-Net, and a classification submodule, wherein the DenseNet is the backbone network of the model and is used for extracting global image features and reusing them; the spatial attention mechanism is embedded into the DenseNet and performs corresponding spatial transformations on the spatial-domain information of the image, attending only to positions of interest and extracting key information; the channel attention mechanism SE-Net models the importance of each feature channel and suppresses or enhances different channels according to the importance of their features; the classification submodule uses Softmax as its core to accurately classify the images; the image classification network model obtains the regions of interest through the spatial attention mechanism, obtains feature weights through the channel attention mechanism SE-Net, emphasizes useful information while suppressing useless information, and feeds the result into the classification submodule;
an image classification module, used to input the test set into the trained image classification network model to obtain the final classification result of the images.
CN202110961232.3A 2021-08-20 2021-08-20 Image classification method and system based on space and channel attention mechanism Pending CN113743484A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110961232.3A CN113743484A (en) 2021-08-20 2021-08-20 Image classification method and system based on space and channel attention mechanism

Publications (1)

Publication Number Publication Date
CN113743484A true CN113743484A (en) 2021-12-03

Family

ID=78732000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110961232.3A Pending CN113743484A (en) 2021-08-20 2021-08-20 Image classification method and system based on space and channel attention mechanism

Country Status (1)

Country Link
CN (1) CN113743484A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633513A (en) * 2017-09-18 2018-01-26 天津大学 The measure of 3D rendering quality based on deep learning
CN110414601A (en) * 2019-07-30 2019-11-05 南京工业大学 Photovoltaic module fault diagnosis method, system and equipment based on deep convolution countermeasure network
CN111709265A (en) * 2019-12-11 2020-09-25 深学科技(杭州)有限公司 Camera monitoring state classification method based on attention mechanism residual error network
WO2020244108A1 (en) * 2019-06-05 2020-12-10 Boe Technology Group Co., Ltd. Methods and apparatuses for semantically segmenting input image, and computer-program product
CN112308830A (en) * 2020-10-27 2021-02-02 苏州大学 Attention mechanism and deep supervision strategy-based automatic division identification method for retinopathy of prematurity
CN113269799A (en) * 2021-05-18 2021-08-17 哈尔滨理工大学 Cervical cell segmentation method based on deep learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dang Jisheng, Yang Jun: "Recognition and Segmentation of 3D Models with Multi-Feature Fusion", Journal of Xidian University, vol. 47, no. 4, page 149 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114298234A (en) * 2021-12-31 2022-04-08 深圳市铱硙医疗科技有限公司 Brain medical image classification method and device, computer equipment and storage medium
CN114298234B (en) * 2021-12-31 2022-10-04 深圳市铱硙医疗科技有限公司 Brain medical image classification method and device, computer equipment and storage medium
CN114463587A (en) * 2022-01-30 2022-05-10 中国农业银行股份有限公司 Abnormal data detection method, device, equipment and storage medium
CN115439702A (en) * 2022-11-08 2022-12-06 武昌理工学院 Weak noise image classification method based on frequency domain processing
CN115439702B (en) * 2022-11-08 2023-03-24 武昌理工学院 Weak noise image classification method based on frequency domain processing
CN115601821A (en) * 2022-12-05 2023-01-13 中国汽车技术研究中心有限公司(Cn) Interaction method based on expression recognition
CN115601821B (en) * 2022-12-05 2023-04-07 中国汽车技术研究中心有限公司 Interaction method based on expression recognition
CN116612339A (en) * 2023-07-21 2023-08-18 中国科学院宁波材料技术与工程研究所 Construction device and grading device of nuclear cataract image grading model
CN116612339B (en) * 2023-07-21 2023-11-14 中国科学院宁波材料技术与工程研究所 Construction device and grading device of nuclear cataract image grading model
CN117315377A (en) * 2023-11-29 2023-12-29 山东理工职业学院 Image processing method and device based on machine vision and electronic equipment
CN117315377B (en) * 2023-11-29 2024-02-27 山东理工职业学院 Image processing method and device based on machine vision and electronic equipment
CN117893763A (en) * 2024-01-22 2024-04-16 内蒙古工业大学 ResCo-UNet-based buckwheat grain image segmentation method

Similar Documents

Publication Publication Date Title
CN113743484A (en) Image classification method and system based on space and channel attention mechanism
CN112084362B (en) Image hash retrieval method based on hierarchical feature complementation
CN106599883B (en) CNN-based multilayer image semantic face recognition method
CN111415316A (en) Defect data synthesis algorithm based on generation of countermeasure network
CN110363215A (en) The method that SAR image based on production confrontation network is converted into optical imagery
CN112580590A (en) Finger vein identification method based on multi-semantic feature fusion network
CN110211127B (en) Image partition method based on bicoherence network
CN110490265B (en) Image steganalysis method based on double-path convolution and feature fusion
CN111860124B (en) Remote sensing image classification method based on space spectrum capsule generation countermeasure network
CN115222998B (en) Image classification method
CN113870157A (en) SAR image synthesis method based on cycleGAN
CN114155371A (en) Semantic segmentation method based on channel attention and pyramid convolution fusion
CN111709313A (en) Pedestrian re-identification method based on local and channel combination characteristics
Xu et al. LMO-YOLO: A ship detection model for low-resolution optical satellite imagery
CN116453199B (en) GAN (generic object model) generation face detection method based on fake trace of complex texture region
CN117037004B (en) Unmanned aerial vehicle image detection method based on multi-scale feature fusion and context enhancement
CN111639697B (en) Hyperspectral image classification method based on non-repeated sampling and prototype network
CN112580480A (en) Hyperspectral remote sensing image classification method and device
CN109740552A (en) A kind of method for tracking target based on Parallel Signature pyramid neural network
CN113850182B (en) DAMR _ DNet-based action recognition method
Gao et al. Adaptive random down-sampling data augmentation and area attention pooling for low resolution face recognition
Fan et al. Attention-modulated triplet network for face sketch recognition
Li et al. A new algorithm of vehicle license plate location based on convolutional neural network
CN113343953B (en) FGR-AM method and system for remote sensing scene recognition
CN114005002B (en) Image recognition method of kernel fully-connected neural network based on kernel operation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination