CN110766063B - Image classification method based on compressed excitation and tightly connected convolutional neural network - Google Patents
Image classification method based on compressed excitation and tightly connected convolutional neural network
- Publication number
- CN110766063B CN110766063B CN201910987689.4A CN201910987689A CN110766063B CN 110766063 B CN110766063 B CN 110766063B CN 201910987689 A CN201910987689 A CN 201910987689A CN 110766063 B CN110766063 B CN 110766063B
- Authority
- CN
- China
- Prior art keywords
- convolutional neural
- neural network
- tensor
- picture
- label
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000013527 convolutional neural network Methods 0.000 title claims abstract description 68
- 230000005284 excitation Effects 0.000 title claims abstract description 31
- 238000000034 method Methods 0.000 title claims abstract description 24
- 238000012549 training Methods 0.000 claims abstract description 28
- 238000012360 testing method Methods 0.000 claims abstract description 26
- 230000006835 compression Effects 0.000 claims abstract description 19
- 238000007906 compression Methods 0.000 claims abstract description 19
- 230000000694 effects Effects 0.000 claims abstract description 4
- 230000006870 function Effects 0.000 claims description 28
- 238000007781 pre-processing Methods 0.000 claims description 7
- 230000008569 process Effects 0.000 claims description 7
- 238000009826 distribution Methods 0.000 claims description 6
- 238000011478 gradient descent method Methods 0.000 claims description 6
- 230000004044 response Effects 0.000 claims description 5
- 230000007246 mechanism Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000011176 pooling Methods 0.000 claims description 3
- 238000004364 calculation method Methods 0.000 abstract description 3
- 238000000605 extraction Methods 0.000 description 5
- 238000011161 development Methods 0.000 description 3
- 238000013473 artificial intelligence Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 1
- 238000013145 classification model Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image classification method based on compression excitation and a tightly-connected convolutional neural network, which combines a lightweight tightly-connected convolutional neural network (DenseNet) with a high-performance squeeze-and-excitation (SE) module: the convolutional neural network is trained, a loss function is calculated, and the network is updated by gradient descent; the convolutional neural network is then tested and its classification accuracy calculated; these steps are repeated, and the highest accuracy together with the corresponding convolutional neural network model parameters is saved, yielding the convolutional neural network model with the best effect. The squeeze-and-excitation module explicitly models the interdependencies between channels at a small computational cost; compared with traditional convolutional neural network image classification methods, the method obtains high-accuracy image classification results with a small number of parameters and little computation.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to an image classification method based on compressed excitation and a tightly connected convolutional neural network.
Background
In 2016, AlphaGo played a closely watched man-machine Go match against the world champion and professional nine-dan player Lee Sedol, winning four games to one. Terms such as artificial intelligence and deep learning have since entered the public's field of view, and we have entered an AI era for everyone. A simple photo of a face often contains a great deal of information, such as age, gender, race and appearance, which can be obtained from the picture and classified using related techniques in the field of artificial intelligence.
Image classification refers to the process of automatically assigning images to a set of predefined categories according to certain classification rules. The basic process of image classification is divided into two parts: training and testing. Training comprises three stages: (1) data preprocessing; (2) feature extraction and representation; (3) classifier design and learning. Testing is likewise divided into three stages, the first two being the same as in training: (1) data preprocessing; (2) feature extraction and representation; (3) classification decision. The performance of image classification is closely related to the feature extraction and classification methods used.
Image feature extraction is the basis of image classification. Traditional methods rely on manually designed features, and whether such features are reasonable brings great uncertainty to classification performance; the emergence of the convolutional neural network solved this problem. In 2012, Krizhevsky et al. proposed AlexNet, which took first place in the ImageNet competition with performance far exceeding the runner-up; from then on, convolutional neural networks and deep learning have received extensive attention and development. New models have emerged at a rapid pace, such as ZF-Net in 2013, GoogLeNet and VGG in 2014, ResNet in 2015 and DenseNet in 2016. Convolutional neural networks do not require manual feature extraction and can automatically learn complex and useful features at various levels directly from large image datasets; for example, the lowest-level features may be basic line and edge features, while the highest-level features describe contours and object parts. The convolutional neural network is also an end-to-end image classification method: an end-to-end model comprises a plurality of modules, each designed for a specific task with its own input and output; one end is the original image, the other end is the final output, and the modules together complete the final task.
Convolutional neural networks have remarkably improved image classification performance and greatly promoted the development of computer vision. However, as convolutional neural networks have developed, network depth has been continuously increased to improve model accuracy, and models have grown ever larger; this greatly increases the computational cost and requires more computing resources and image data to train the network. There is therefore a need for an image classification method that reduces computational cost while improving convolutional neural network performance.
Disclosure of Invention
The invention aims to: aiming at the defects of the prior art, the invention provides an image classification method based on compressed excitation and tightly connected convolutional neural network, which can improve the performance of the convolutional neural network.
The technical scheme is as follows: the invention discloses an image classification method based on a compression excitation module and a tightly connected convolutional neural network, which comprises the following steps:
(1) Preprocessing the collected pictures containing the category labels, converting the pictures into tensors, and forming a training set and a testing set;
(2) Training a convolutional neural network, prescribing training times, inputting a picture tensor of a training set into the tightly-connected convolutional neural network combined with a compression excitation module, inputting an output result into a softmax function, calculating the probability that the picture belongs to each category, and marking the probability as a prediction label;
(3) Comparing the prediction label obtained in the step (2) with a category label contained in the picture, calculating the deviation between the prediction label and an actual label through a loss function, calculating the gradient of the convolutional neural network parameter according to the loss function, and updating the network parameter by using a gradient descent method;
(4) Testing the convolutional neural network, inputting the picture tensor of the test set into the updated convolutional neural network to obtain a prediction label of each test picture, comparing the prediction label with the category label contained in the picture, calculating and recording the prediction accuracy of the convolutional neural network, and storing the model parameters of the convolutional neural network;
(5) Repeating the step (2), the step (3) and the step (4), obtaining the prediction accuracy of the updated convolutional neural network on the test set, comparing it with the previous prediction accuracy, and storing the higher accuracy and the corresponding convolutional neural network model parameters;
(6) After the specified training times are reached, stopping training and testing, outputting the highest accuracy and storing the corresponding convolutional neural network parameters, and obtaining the convolutional neural network model with the best effect.
Further, the preprocessing of the picture in the step (1) is realized by the following formulas:

x_0 = (X − μ)/σ

x_1 = (x_0 − min(x_0))/(max(x_0) − min(x_0))

wherein μ is the mean of the pictures, X represents the picture tensor, σ represents the standard deviation, max represents the maximum value of the picture tensor, min represents the minimum value of the picture tensor, x_0 represents the standardized picture tensor, and x_1 represents the picture tensor normalized to [0, 1].
Further, the ratio of the training set to the test set in the step (1) is 5:1.
Further, the step (2) includes the steps of:
(21) Each convolution layer contains a series of nonlinear transforms F_l(·) comprising normalization (BN), modified linear units (ReLU) and convolution operations (Conv), l representing the layer number:

y = Σ_{i=1}^{D} w_i * x_i

wherein x = [x_1, x_2, …, x_D] is a tensor input with D channels, w_i is the weight of the convolution kernel on the corresponding i-th channel, and * denotes convolution; the output tensor size of the convolution layer satisfies the following formula:

O = (I − K + 2P)/S + 1

wherein O is the size of the output tensor, I is the size of the input tensor, K is the size of the convolution kernel, P is the zero-filling number, S is the moving step length, and the number of channels of the output tensor is equal to the number of convolution kernels;
(22) Each layer of the DenseNet is directly connected with all previous layers; the input of the l-th layer is the concatenation of all previous outputs, and the output of the l-th layer can be expressed as:

y_l = F_l(x_l) = F_l([x_0, y_1, …, y_{l−1}])

wherein x_l = [x_0, y_1, …, y_{l−1}] is the input of the l-th layer and y_l is its output; the prerequisite for the concatenation operation is that the tensor sizes of x_0 and of y_1 through y_{l−1} are unchanged, which is ensured by a 3x3 convolution kernel with stride 1 and zero padding of size 1, i.e. K=3, P=1, S=1;
(23) Denote y_l = [y_1, y_2, …, y_C], where C is the number of channels of y_l and is equal to the number of convolution kernels; y_l is then input into the squeeze-and-excitation (SE) module. First the squeeze operation generates a channel descriptor z = [z_1, z_2, …, z_C] through global average pooling:

z_c = (1/(H·W)) Σ_{i=1}^{H} Σ_{j=1}^{W} y_c(i, j)

wherein H·W is the spatial size of tensor y_l; the channel descriptor z obtained by squeezing the spatial features contains global spatial information. The excitation operation is then performed, using a gating mechanism containing a sigmoid function to fully capture the channel dependencies, as shown in the following formula:

s = σ(g(z, W)) = σ(W_2 δ(W_1 z))

wherein σ denotes the sigmoid function, δ denotes the ReLU function, and the two linear layers (FC) with parameters W_1 and W_2 form a bottleneck; each channel of the tensor is called a feature map, and s is multiplied element-wise with the feature maps along the channel dimension, giving each feature map a weight that represents its importance within the tensor in the global receptive field, according to the formula:

y_c := s_c · y_c

The excitation operation thus recalibrates the feature responses of the output y_l = [y_1, y_2, …, y_C] on each channel; the output y_l = [y_1, y_2, …, y_C] is transmitted to the next layer and the above process is repeated; finally, the output of the tightly connected network combined with the squeeze-and-excitation module is input into a softmax function, the probability that the picture belongs to each category is calculated, and the result is recorded as the prediction label Ŷ of the picture.
Further, the step (3) includes the steps of:
(31) Calculating the deviation between the predicted label and the actual label through a cross-entropy loss function; given two probability distributions p and q, the cross-entropy loss of p expressed through q is:

L(p, q) = −Σ_x p(x) log q(x)

wherein p represents the true label Y of the picture and q represents the predicted label Ŷ; the smaller the cross entropy, the closer the two probability distributions, namely the closer the predicted label is to the real label;
(32) Calculating the gradients of the convolutional neural network parameters θ_i from the cross-entropy loss function and updating the parameters of the network using the gradient descent method:

θ_i := θ_i − α ∂L(θ_i)/∂θ_i

wherein L(θ_i) represents the loss function with θ_i as parameter, and α represents the learning rate, which controls the speed of gradient descent.
The beneficial effects are that: compared with the prior art, the invention has the following beneficial effects: the tightly connected convolutional neural network takes the outputs of all layers before the current layer as input, realizing feature reuse and improving parameter efficiency, so that the model can obtain good performance with only a small number of parameters; the squeeze-and-excitation module explicitly models the interdependencies between channels and adaptively recalibrates the channel-wise feature responses, realizing feature selection that selectively emphasizes informative features and suppresses useless ones; their combination not only reduces the scale and parameter count of the model but also greatly improves the performance of the convolutional neural network.
Drawings
FIG. 1 is a structural flow diagram of a tightly-coupled convolutional neural network based on a combined compressive excitation module;
fig. 2 is a network architecture diagram based on a tightly-coupled convolutional neural network incorporating a compressed excitation module.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention discloses a lightweight image classification method based on a tightly-connected convolutional neural network combined with a squeeze-and-excitation module, aimed at the problem that current convolutional neural network image classification models have grown excessively large in pursuit of performance. The tightly connected convolutional neural network (DenseNet) takes the outputs of all layers before the current layer as input, realizing feature reuse and improving parameter efficiency, so that the model can obtain good performance with only a small number of parameters. The squeeze-and-excitation (SE) module explicitly models the interdependencies between channels and adaptively recalibrates channel-wise feature responses, realizing feature selection that selectively emphasizes informative features and suppresses useless ones. By combining the two, the performance of the convolutional neural network is greatly improved; this embodiment greatly reduces the parameter count of the model on the premise of ensuring high image classification accuracy, and the structure of the model is shown in fig. 2.
As shown in fig. 1, the present invention specifically includes the following steps:
1. image preprocessing to form training set and test set
(1) The dataset contains 60,000 pictures, each with a label Y. The dataset is first divided into a training set of 50,000 pictures and a test set of 10,000 pictures.
(2) All pictures are cropped to a fixed 32x32 shape, then randomly horizontally flipped to expand the training dataset, and finally converted into tensors X. The tensors are standardized using the channel mean and standard deviation, and the picture tensors are then normalized to between 0 and 1. The core formulas of this process are:

x_0 = (X − μ)/σ

x_1 = (x_0 − min(x_0))/(max(x_0) − min(x_0))

wherein μ is the mean of the pictures, X represents the picture tensor, σ represents the standard deviation, max represents the maximum value of the picture tensor, min represents the minimum value of the picture tensor, x_0 represents the standardized picture tensor, and x_1 represents the picture tensor normalized to [0, 1].
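The two-step normalization above can be sketched with NumPy as follows; the function name and the toy picture values are illustrative assumptions, not part of the patent.

```python
import numpy as np

def preprocess(X, mu, sigma):
    """Standardize a picture tensor with mean/std, then rescale to [0, 1],
    following the two-step formula described in the text."""
    x0 = (X - mu) / sigma                          # x0: standardized tensor
    x1 = (x0 - x0.min()) / (x0.max() - x0.min())   # x1: scaled to [0, 1]
    return x0, x1

# Toy 2x2 single-channel "picture" with pixel values in [0, 255]
X = np.array([[0.0, 64.0], [128.0, 255.0]])
x0, x1 = preprocess(X, mu=X.mean(), sigma=X.std())
```

After this step, x0 has zero mean and unit variance, and x1 lies entirely in [0, 1], ready to be fed to the network.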
2. Training convolutional neural networks
(1) The number of training iterations is specified, the picture tensors of the training set obtained in step 1 are input into the tightly-connected convolutional neural network combined with the squeeze-and-excitation module, and the input picture tensor enters the tightly connected block after passing through a convolution layer. Each convolution layer contains a series of nonlinear transforms F_l(·) comprising normalization (BN), modified linear units (ReLU) and convolution operations (Conv), l representing the layer number. The core formula of this process is:

y = Σ_{i=1}^{D} w_i * x_i

wherein x = [x_1, x_2, …, x_D] is a tensor input with D channels, w_i is the weight of the convolution kernel on the corresponding i-th channel, and * denotes convolution. The output tensor size of the convolution layer satisfies the following formula:

O = (I − K + 2P)/S + 1

wherein O is the size of the output tensor, I is the size of the input tensor, K is the size of the convolution kernel, P is the zero-padding number, S is the moving step length, and the number of channels of the output tensor is equal to the number of convolution kernels.
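The output-size relation can be checked with a small helper function (an illustrative sketch, not part of the patent):

```python
def conv_output_size(I, K, P, S):
    """O = (I - K + 2P) / S + 1: spatial size of a conv layer's output
    for input size I, kernel size K, zero padding P and stride S."""
    return (I - K + 2 * P) // S + 1

# The 3x3, stride-1, padding-1 kernels used in the dense blocks
# keep the spatial size of a 32x32 input unchanged:
print(conv_output_size(32, 3, 1, 1))  # 32
```

This size-preserving configuration (K=3, P=1, S=1) is exactly the prerequisite for the concatenation operation in the dense blocks below.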
(2) Each layer of the DenseNet is directly connected with all previous layers, so the input of the l-th layer is the concatenation of all previous outputs, and the output of the l-th layer can be expressed as:

y_l = F_l(x_l) = F_l([x_0, y_1, …, y_{l−1}])

wherein x_l = [x_0, y_1, …, y_{l−1}] is the input of the l-th layer and y_l is its output. The prerequisite for the concatenation operation is that the tensor sizes of x_0 and of y_1 through y_{l−1} are unchanged; we use a 3x3 convolution kernel with stride 1 and zero padding of size 1, i.e. K=3, P=1, S=1.
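The dense connectivity just described, where each layer consumes the concatenation of all earlier outputs, can be sketched with NumPy; the toy layer standing in for F_l (a constant map with growth rate k = 2) is an assumption for illustration only.

```python
import numpy as np

def dense_block(x0, layers):
    """Each layer receives the channel-wise concatenation of all
    previous outputs (DenseNet connectivity), here with toy F_l."""
    features = [x0]
    for F in layers:
        x_l = np.concatenate(features, axis=0)  # [x0, y1, ..., y_{l-1}]
        features.append(F(x_l))
    return features

# Toy F_l: maps any input to k = 2 new "feature maps" of the same
# spatial size, so the channel count grows by k per layer.
k = 2
toy_layer = lambda x: np.ones((k,) + x.shape[1:]) * x.mean()
feats = dense_block(np.zeros((3, 4, 4)), [toy_layer] * 3)
# Channel counts seen by successive layers: 3, 5, 7
```

Because spatial sizes never change inside the block, only the channel dimension grows, which is what makes the concatenation well defined.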
(3) Denote y_l = [y_1, y_2, …, y_C], where C is the number of channels of y_l and is equal to the number of convolution kernels; y_l is then input into the squeeze-and-excitation (SE) module. First the squeeze operation generates a channel descriptor z = [z_1, z_2, …, z_C] through global average pooling; the core formula is:

z_c = (1/(H·W)) Σ_{i=1}^{H} Σ_{j=1}^{W} y_c(i, j)

wherein H·W is the spatial size of tensor y_l. The channel descriptor z obtained by squeezing the spatial features contains global spatial information. The excitation operation is then performed, using a gating mechanism containing a sigmoid function to fully capture the channel dependencies, as shown in the following formula:

s = σ(g(z, W)) = σ(W_2 δ(W_1 z))

wherein σ represents the sigmoid function, δ represents the ReLU function, and the two linear layers (FC) with parameters W_1 and W_2 form a bottleneck, which reduces the number of parameters and fits more complex nonlinear relationships. Each channel of the tensor is called a feature map; s is multiplied element-wise with the feature maps along the channel dimension, giving each feature map a weight that represents its importance within the tensor in the global receptive field, according to the formula:

y_c := s_c · y_c

The excitation operation thus recalibrates the feature responses of the output y_l = [y_1, y_2, …, y_C] on each channel. The output y_l = [y_1, y_2, …, y_C] is then transmitted to the next layer and the above process is repeated; finally, the output of the tightly connected network combined with the squeeze-and-excitation module is input into a softmax function, the probability that the picture belongs to each category is calculated, and the result is recorded as the prediction label Ŷ of the picture.
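A minimal NumPy sketch of the squeeze-and-excitation computation described above; the random weights W_1 and W_2 stand in for learned bottleneck parameters, and the reduction ratio r = 2 is an assumption for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_module(y, W1, W2):
    """Squeeze-and-excitation on a (C, H, W) tensor: squeeze by global
    average pooling, excite with a two-layer bottleneck (ReLU then
    sigmoid), then rescale each feature map by its weight s_c."""
    z = y.mean(axis=(1, 2))                    # squeeze: channel descriptor z
    s = sigmoid(W2 @ np.maximum(W1 @ z, 0))    # excitation weights s in (0, 1)
    return y * s[:, None, None]                # recalibrate: y_c := s_c * y_c

rng = np.random.default_rng(0)
C, r = 8, 2                                    # r: bottleneck reduction ratio
W1 = rng.normal(size=(C // r, C))
W2 = rng.normal(size=(C, C // r))
y = rng.normal(size=(C, 4, 4))
out = se_module(y, W1, W2)
```

Since every s_c lies in (0, 1), the module can only attenuate feature maps relative to one another, which is how it emphasizes informative channels and suppresses useless ones.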
3. Calculating a loss function, updating the network according to gradient descent
(1) The prediction label Ŷ from step 2 is compared with the class label Y carried by the picture, and the deviation between the predicted label and the actual label is calculated through a cross-entropy loss function. Given two probability distributions p and q, the cross-entropy loss of p expressed through q is:

L(p, q) = −Σ_x p(x) log q(x)

wherein p represents the true label Y of the picture and q represents the predicted label Ŷ; the smaller the cross entropy, the closer the two probability distributions, namely the closer the predicted label is to the real label. Suppose the pictures fall into three classes, the class label of a certain picture is Y = (1, 0, 0), and the model outputs the prediction label Ŷ = (0.5, 0.4, 0.1) after softmax regression; then the cross entropy is:

L((1,0,0),(0.5,0.4,0.1)) = −(1×log 0.5 + 0×log 0.4 + 0×log 0.1) ≈ 0.3
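The worked example can be reproduced directly; note the base-10 logarithm, which matches the ≈0.3 result in the text (with the natural logarithm the value would be ≈0.69).

```python
import math

def cross_entropy(p, q):
    """L(p, q) = -sum_x p(x) * log q(x), using the base-10 log implied
    by the worked example; terms with p(x) = 0 contribute nothing."""
    return -sum(pi * math.log10(qi) for pi, qi in zip(p, q) if pi > 0)

loss = cross_entropy((1, 0, 0), (0.5, 0.4, 0.1))
print(round(loss, 2))  # 0.3
```

As the predicted distribution concentrates more mass on the true class, this loss decreases toward zero.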
(2) The gradients of the convolutional neural network parameters θ_i are calculated from the cross-entropy loss function, and the parameters of the network are then updated using the gradient descent method, shown in the following formula:

θ_i := θ_i − α ∂L(θ_i)/∂θ_i

wherein L(θ_i) represents the loss function with θ_i as parameter, and α represents the learning rate, which controls the speed of gradient descent.
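The update rule can be sketched as a plain Python step; the quadratic toy loss and learning rate below are illustrative assumptions, not values from the patent.

```python
def gradient_descent_step(theta, grad, alpha):
    """theta_i := theta_i - alpha * dL/dtheta_i for each parameter."""
    return [t - alpha * g for t, g in zip(theta, grad)]

# One update on the toy loss L(theta) = theta^2 (gradient 2*theta),
# with learning rate alpha = 0.1:
theta = [1.0]
theta = gradient_descent_step(theta, [2 * theta[0]], alpha=0.1)
print(theta)  # [0.8]
```

Repeating this step drives the toy parameter toward the minimum at zero; in the actual method the gradients come from backpropagating the cross-entropy loss.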
4. Testing convolutional neural network, calculating classification accuracy
(1) The picture tensors of the test set are input into the updated convolutional neural network to obtain the probability that each test picture belongs to each category, and the category with the highest probability is recorded as the picture's prediction label. Suppose the model outputs the prediction p = (0.7, 0.2, 0.1) after softmax regression; the predicted label is then denoted Ŷ = (1, 0, 0).
(2) The prediction labels Ŷ are compared with the real labels Y of the pictures, and the number of matches over the test set is counted, from which the prediction accuracy of the convolutional neural network is calculated; the model accuracy and the model parameters of the convolutional neural network are recorded.
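The accuracy computation amounts to counting matching labels over the test set; a small sketch with made-up label lists:

```python
def accuracy(pred_labels, true_labels):
    """Fraction of test pictures whose predicted class index
    matches the true class index."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)

# Hypothetical class indices for four test pictures
acc = accuracy([0, 2, 1, 1], [0, 2, 2, 1])
print(acc)  # 0.75
```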
5. Steps 2, 3 and 4 are repeated: the parameters of the convolutional neural network are updated, the prediction accuracy of the convolutional neural network on the test set is calculated and compared with the previous prediction accuracy, and the higher accuracy together with the corresponding convolutional neural network model is saved.
6. After the specified number of training iterations is reached (here set to 300), training and testing are stopped, the highest accuracy is output, and the corresponding convolutional neural network parameters and model are saved, yielding the convolutional neural network model with the best effect.
Claims (3)
1. An image classification method based on compressed excitation and tightly-connected convolutional neural network, which is characterized by comprising the following steps:
(1) Preprocessing the collected pictures containing the category labels, converting the pictures into tensors, and forming a training set and a testing set;
(2) Training a convolutional neural network, prescribing training times, inputting a picture tensor of a training set into the tightly-connected convolutional neural network combined with a compression excitation module, inputting an output result into a softmax function, calculating the probability that the picture belongs to each category, and marking the probability as a prediction label;
(3) Comparing the prediction label obtained in the step (2) with a category label contained in the picture, calculating the deviation between the prediction label and an actual label through a loss function, calculating the gradient of the convolutional neural network parameter according to the loss function, and updating the network parameter by using a gradient descent method;
(4) Testing the convolutional neural network, inputting the picture tensor of the test set into the updated convolutional neural network to obtain a prediction label of each test picture, comparing the prediction label with the category label contained in the picture, calculating and recording the prediction accuracy of the convolutional neural network, and storing the model parameters of the convolutional neural network;
(5) Repeating the step (2), the step (3) and the step (4), obtaining the prediction accuracy of the updated convolutional neural network on the test set, comparing it with the previous prediction accuracy, and storing the higher accuracy and the corresponding convolutional neural network model parameters;
(6) Stopping training and testing after the specified training times are reached, outputting the highest accuracy and storing the corresponding convolutional neural network parameters, and obtaining the convolutional neural network model with the best effect;
the step (2) comprises the following steps:
(21) Each convolution layer contains a series of nonlinear transforms F_l comprising normalization (BN), modified linear units (ReLU) and convolution operations (Conv), l representing the number of layers:

y = Σ_{i=1}^{D} w_i * x_i

wherein x = [x_1, x_2, …, x_D] is tensor input with D channels, w_i is the weight on the corresponding i-th channel of the convolution kernel, and the output tensor size of the convolution layer satisfies the following formula:

O = (I − K + 2P)/S + 1

wherein O is the size of the output tensor, I is the size of the input tensor, K is the size of the convolution kernel, P is the zero-filling number, S is the moving step length, and the number of channels of the output tensor is equal to the number of convolution kernels;
(22) Each layer of the DenseNet is directly connected with all previous layers, the input of the l-th layer is the concatenation of all previous outputs, and the output of the l-th layer can be expressed as:

y_l = F_l(x_l) = F_l([x_0, y_1, …, y_{l−1}])

wherein x_l = [x_0, y_1, …, y_{l−1}] is the input of the l-th layer, y_l is the output of the l-th layer, and the prerequisite for the concatenation operation is that the tensor sizes of x_0 and of y_1 through y_{l−1} are unchanged, using a 3x3 convolution kernel with stride 1 and zero padding of size 1, i.e. K=3, P=1, S=1;
(23) Denote y_l = [y_1, y_2, …, y_C], C being the number of channels of y_l, equal to the number of convolution kernels; y_l is input into the compression excitation module, where first a compression operation is performed, generating a channel descriptor z = [z_1, z_2, …, z_C] through global average pooling:

z_c = (1/(H·W)) Σ_{i=1}^{H} Σ_{j=1}^{W} y_c(i, j)

wherein c indexes the channels and H·W is the spatial size of tensor y_l; the channel descriptor z obtained by compressing the spatial features contains global spatial information; an excitation operation is then performed to fully capture the channel dependency relationship using a gating mechanism containing a sigmoid function, as shown in the following formula:

s = σ(g(z, W)) = σ(W_2 δ(W_1 z))

wherein σ represents a sigmoid function, δ represents a ReLU function, and the two linear layers with parameters W_1 and W_2 form a bottleneck layer; each channel of the tensor is called a feature map, and s is multiplied element-wise with the feature maps along the channel dimension to obtain the weight of each feature map, representing the importance of each feature map within the tensor in the global receptive field, according to the formula:

y_c := s_c · y_c

the excitation operation recalibrates the feature response of the output y_l = [y_1, y_2, …, y_C] on each channel; the output y_l = [y_1, y_2, …, y_C] is transmitted to the next layer and the above process is repeated; finally, the output of the tightly connected network combined with the compression excitation module is input into a softmax function, the probability that the picture belongs to each category is calculated and recorded as the prediction label Ŷ of the picture;
the step (3) comprises the following steps:
(31) Calculating the deviation between the predicted label and the actual label through a cross-entropy loss function; given two probability distributions p and q, the cross-entropy loss of p expressed through q is:

L(p, q) = −Σ_x p(x) log q(x)

wherein p represents the label Y of the picture, q represents the predicted value Ŷ, and the smaller the cross entropy, the closer the two probability distributions, namely the closer the predicted label is to the real label;
(32) Calculating the gradients of the convolutional neural network parameters θ_i from the cross-entropy loss function and updating the parameters of the network using a gradient descent method:

θ_i := θ_i − α ∂L(θ_i)/∂θ_i

wherein L(θ_i) represents the loss function with θ_i as parameter, and α represents a learning rate for controlling the gradient descent speed.
2. The image classification method based on compressed excitation and tightly-connected convolutional neural network according to claim 1, wherein the picture preprocessing of step (1) is implemented by the following formula:
K_1 = (X − min) / (max − min),  x_0 = (K_1 − μ) / σ

wherein μ is the mean of the picture, X represents the picture tensor, σ represents the standard deviation, max represents the maximum value of the picture tensor, min represents the minimum value of the picture tensor, K_1 represents the min-max normalized picture tensor, and x_0 represents the standardized picture tensor.
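A sketch of this preprocessing in NumPy; note that composing the two steps in this order (min-max scaling to produce K_1, then standardization to produce x_0) is an assumption, since the claim only names the two normalized tensors:

```python
import numpy as np

def preprocess(x):
    """Min-max scale to [0, 1], then standardize to zero mean / unit variance.

    Assumption: K1 (min-max normalized) feeds x0 (standardized); the claim
    defines both tensors but the composition order is inferred.
    """
    k1 = (x - x.min()) / (x.max() - x.min())   # K_1: min-max normalization
    mu, sigma = k1.mean(), k1.std()
    x0 = (k1 - mu) / sigma                     # x_0: zero mean, unit variance
    return k1, x0

img = np.array([[0.0, 64.0], [128.0, 255.0]])  # toy 2x2 grayscale picture
k1, x0 = preprocess(img)
print(k1.min(), k1.max())        # 0.0 1.0
print(abs(x0.mean()) < 1e-9)     # True
```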
3. The image classification method based on compressed excitation and tightly connected convolutional neural network according to claim 1, wherein the ratio of the training set to the test set in step (1) is 5:1.
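The 5:1 ratio of claim 3 can be realized with a simple shuffled index split; the shuffling and the fixed seed are illustrative assumptions, as the claim does not specify how samples are assigned:

```python
import numpy as np

def split_5_to_1(indices, seed=0):
    """Shuffle sample indices and split them into training/test sets at 5:1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(indices)
    cut = len(idx) * 5 // 6        # 5 parts training, 1 part test
    return idx[:cut], idx[cut:]

train, test = split_5_to_1(np.arange(600))
print(len(train), len(test))  # 500 100
```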
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910987689.4A CN110766063B (en) | 2019-10-17 | 2019-10-17 | Image classification method based on compressed excitation and tightly connected convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910987689.4A CN110766063B (en) | 2019-10-17 | 2019-10-17 | Image classification method based on compressed excitation and tightly connected convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110766063A CN110766063A (en) | 2020-02-07 |
CN110766063B true CN110766063B (en) | 2023-04-28 |
Family
ID=69332111
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910987689.4A Active CN110766063B (en) | 2019-10-17 | 2019-10-17 | Image classification method based on compressed excitation and tightly connected convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110766063B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598126A (en) * | 2020-04-08 | 2020-08-28 | 天津大学 | Lightweight traditional Chinese medicinal material identification method |
CN111523483B (en) * | 2020-04-24 | 2023-10-03 | 北京邮电大学 | Chinese meal dish image recognition method and device |
CN111709446B (en) * | 2020-05-14 | 2022-07-26 | 天津大学 | X-ray chest radiography classification device based on improved dense connection network |
CN111783558A (en) * | 2020-06-11 | 2020-10-16 | 上海交通大学 | Satellite navigation interference signal type intelligent identification method and system |
CN111832577A (en) * | 2020-07-19 | 2020-10-27 | 武汉悟空游人工智能应用软件有限公司 | Sensitivity prediction method based on dense connection |
CN112183468A (en) * | 2020-10-27 | 2021-01-05 | 南京信息工程大学 | Pedestrian re-identification method based on multi-attention combined multi-level features |
CN112464732B (en) * | 2020-11-04 | 2022-05-03 | 北京理工大学重庆创新中心 | Optical remote sensing image ground feature classification method based on double-path sparse hierarchical network |
CN112488003A (en) * | 2020-12-03 | 2021-03-12 | 深圳市捷顺科技实业股份有限公司 | Face detection method, model creation method, device, equipment and medium |
CN113222124B (en) * | 2021-06-28 | 2023-04-18 | 重庆理工大学 | SAUNet + + network for image semantic segmentation and image semantic segmentation method |
CN113642231A (en) * | 2021-07-09 | 2021-11-12 | 西北大学 | CNN-GRU landslide displacement prediction method based on compression excitation network and application |
CN115240006B (en) * | 2022-07-29 | 2023-09-19 | 南京航空航天大学 | Convolutional neural network optimization method and device for target detection and network structure |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108647775A (en) * | 2018-04-25 | 2018-10-12 | 陕西师范大学 | Super-resolution image reconstruction method based on full convolutional neural networks single image |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108647775A (en) * | 2018-04-25 | 2018-10-12 | 陕西师范大学 | Super-resolution image reconstruction method based on full convolutional neural networks single image |
Non-Patent Citations (3)
Title |
---|
SESR: Single Image Super Resolution with Recursive Squeeze and Excitation Networks; Xi Cheng et al.; IEEE; 2018-12-31; full text *
Image classification method based on an improved convolutional neural network; Hu Maonan et al.; Communication Technology (通信技术); 2018-11-10 (No. 11); full text *
Research on a potato deformity detection method based on deep learning; Wang Chenglong et al.; Journal of Huizhou University (惠州学院学报); 2018-06-28 (No. 03); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110766063A (en) | 2020-02-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110766063B (en) | Image classification method based on compressed excitation and tightly connected convolutional neural network | |
CN108647742B (en) | Rapid target detection method based on lightweight neural network | |
CN109389037B (en) | Emotion classification method based on deep forest and transfer learning | |
CN110084281A (en) | Image generating method, the compression method of neural network and relevant apparatus, equipment | |
CN109544524A (en) | A kind of more attribute image aesthetic evaluation systems based on attention mechanism | |
CN106803069A (en) | Crowd's level of happiness recognition methods based on deep learning | |
CN111582397B (en) | CNN-RNN image emotion analysis method based on attention mechanism | |
CN110175628A (en) | A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation | |
Jiang et al. | Cascaded subpatch networks for effective CNNs | |
CN111696101A (en) | Light-weight solanaceae disease identification method based on SE-Inception | |
CN113516133B (en) | Multi-modal image classification method and system | |
CN111832546A (en) | Lightweight natural scene text recognition method | |
CN112101364B (en) | Semantic segmentation method based on parameter importance increment learning | |
CN115223082A (en) | Aerial video classification method based on space-time multi-scale transform | |
CN112861659B (en) | Image model training method and device, electronic equipment and storage medium | |
CN113420651B (en) | Light weight method, system and target detection method for deep convolutional neural network | |
CN113051399A (en) | Small sample fine-grained entity classification method based on relational graph convolutional network | |
CN113269224A (en) | Scene image classification method, system and storage medium | |
US20230222768A1 (en) | Multiscale point cloud classification method and system | |
CN114168795B (en) | Building three-dimensional model mapping and storing method and device, electronic equipment and medium | |
CN113378812A (en) | Digital dial plate identification method based on Mask R-CNN and CRNN | |
CN114492634B (en) | Fine granularity equipment picture classification and identification method and system | |
CN114170657A (en) | Facial emotion recognition method integrating attention mechanism and high-order feature representation | |
CN113408418A (en) | Calligraphy font and character content synchronous identification method and system | |
CN117033609A (en) | Text visual question-answering method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||