CN114565792A - Image classification method and device based on lightweight convolutional neural network - Google Patents

Image classification method and device based on lightweight convolutional neural network

Info

Publication number
CN114565792A
Authority
CN
China
Prior art keywords
layer
sampling
neural network
feature map
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210189921.1A
Other languages
Chinese (zh)
Inventor
王天江
张量奇
沈海波
罗逸豪
曹翔
潘蕾西兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202210189921.1A
Publication of CN114565792A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image classification method and device based on a lightweight convolutional neural network, belonging to the field of image classification in deep learning. The method comprises the following steps. S1: constructing a lightweight convolutional neural network model comprising, connected in sequence, a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, where each sampling concatenation unit comprises, connected in sequence, a downsampling layer, a plurality of generic layers and a splicing layer; S2: inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result. By first constructing a lightweight convolutional neural network model with few parameters, low computation cost and high inference speed, and then using it for image classification, the method greatly reduces the parameter count and markedly improves the classification speed of the model at a classification accuracy similar to that of existing lightweight convolutional neural network models.

Description

Image classification method and device based on lightweight convolutional neural network
Technical Field
The invention belongs to the field of deep learning image classification, and particularly relates to an image classification method and device based on a lightweight convolutional neural network.
Background
In recent years, Convolutional Neural Networks (CNNs) have been widely used in computer vision tasks such as image classification. To improve classification accuracy, the depth and width of CNN models have grown rapidly, and with them the number of parameters and the amount of computation, which hinders the deployment of CNNs on devices with weak computing power.
For applications on mobile or embedded devices, lightweight models are an important approach. At present, methods that build models from a depthwise convolution layer followed by a pointwise convolution layer, based on Depthwise Separable Convolution (DSC), and that search for an optimal architecture with a Neural Architecture Search (NAS) algorithm have achieved remarkable success, e.g. the MobileNet and EfficientNet series. These models replace standard convolution with depthwise separable convolution, whose parameter count and computation are several times lower than those of standard convolution; after the backbone of the architecture is fixed, a NAS algorithm searches for the optimal model configuration, such as the width of each layer, so that the classification accuracy of the model is maintained while the number of parameters and the amount of computation are reduced.
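To make the scale of the saving concrete, the parameter counts of a standard convolution and a depthwise separable convolution can be compared directly. The helper functions and the 128-channel example below are illustrative, not taken from any model discussed in this document.

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    """Depthwise separable convolution: a depthwise k x k convolution
    (one kernel per input channel) followed by a pointwise 1 x 1
    convolution mixing channels."""
    return c_in * k * k + c_in * c_out

# Illustrative layer: 128 input channels, 128 output channels, 3 x 3 kernels.
std = conv_params(128, 128, 3)   # 147456 parameters
dsc = dsc_params(128, 128, 3)    # 17536 parameters, roughly 8.4x fewer
```

The ratio grows with the number of output channels and the kernel size, which is why the saving is described as "several times" for typical layer widths.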
These methods greatly reduce the number of parameters and the amount of computation, but their inference speed on a GPU is not improved, and is even lower than that of classical networks such as ResNet. The reason is that depthwise separable convolution cannot make good use of GPU resources, and at the same accuracy these networks use more network layers than classical networks, including more nonlinear activation layers and batch normalization layers, all of which slow down model inference.
Disclosure of Invention
In view of the above drawbacks of or needs for improvement in the prior art, the present invention provides an image classification method and apparatus based on a lightweight convolutional neural network. The aim is to design a lightweight neural network model comprising, connected in sequence, a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, where each sampling concatenation unit comprises, connected in sequence, a downsampling layer, a plurality of generic layers and a splicing layer; the picture to be classified is input into the lightweight convolutional neural network model to obtain a classification result. The inference speed of the model is thereby improved while the number of parameters of the convolutional neural network model is reduced.
To achieve the above object, according to one aspect of the present invention, there is provided an image classification method based on a lightweight convolutional neural network, comprising:
S1: constructing a lightweight convolutional neural network model, wherein the lightweight neural network model comprises, connected in sequence: a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, and each sampling concatenation unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers and a splicing layer;
S2: inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, which comprises the following steps:
S21: expanding the channels of the picture to be classified to a specified number of channels using the standard convolution layer, so as to obtain an original feature map;
S22: downsampling the original feature map with the downsampling layer of the first sampling concatenation unit to obtain two groups of first feature maps; performing feature extraction on the two groups of first feature maps with the plurality of generic layers to obtain corresponding second feature maps; splicing the two groups of second feature maps with the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling concatenation unit, which downsamples it, extracts features and splices them to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling concatenation unit, and so on, until the last sampling concatenation unit outputs a final target feature map;
S23: inputting the final target feature map output by the last sampling concatenation unit into the global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
In one embodiment, the downsampling layer comprises, connected in sequence: a Gaussian downsampling layer, a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer, and it outputs two groups of output feature maps:
the input feature map passes through the Gaussian downsampling layer alone to give one group of output feature maps;
the input feature map passes in sequence through the Gaussian downsampling layer, the pointwise convolution layer, the nonlinear activation layer and the batch normalization layer to give the other group of output feature maps.
In one embodiment, the Gaussian downsampling layer in the downsampling layer performs a convolution operation on the feature map output by the previous layer, and the resolution of its output feature map is half that of the input feature map.
In one embodiment, the pointwise convolution layer in the downsampling layer expands or contracts the channels of the input feature map according to the numbers of input and output feature channels.
In one embodiment, ReLU is used as the activation function in the nonlinear activation layer of the downsampling layer.
In one embodiment, the generic layer comprises, connected in sequence: a depthwise convolution layer, a splicing layer, a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer;
each generic layer receives a group-a set of feature maps and a group-b set of feature maps; the group-b feature maps are input into the depthwise convolution layer and convolved to obtain group-c feature maps; the group-a and group-c feature maps are passed through the splicing layer, the pointwise convolution layer, the nonlinear activation layer and the batch normalization layer of the generic layer to obtain a new group-b set of feature maps;
the input group-b feature maps serve as the new group-a feature maps and, together with the new group-b feature maps, form the input of the next adjacent generic layer.
According to another aspect of the present invention, there is provided an image classification apparatus based on a lightweight convolutional neural network, including:
the building module is used for building a lightweight convolutional neural network model, wherein the lightweight neural network model comprises, connected in sequence: a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, and each sampling concatenation unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers and a splicing layer;
the classification module is used for inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, and specifically for:
expanding the channels of the picture to be classified to a specified number of channels using the standard convolution layer, so as to obtain an original feature map;
downsampling the original feature map with the downsampling layer of the first sampling concatenation unit to obtain two groups of first feature maps; performing feature extraction on the two groups of first feature maps with the plurality of generic layers to obtain corresponding second feature maps; splicing the two groups of second feature maps with the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling concatenation unit, which downsamples it, extracts features and splices them to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling concatenation unit, and so on, until the last sampling concatenation unit outputs a final target feature map;
inputting the final target feature map output by the last sampling concatenation unit into the global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
according to the method, a lightweight convolutional neural network model with low parameter, low calculation amount and high inference speed is constructed by modifying a network unit structure, and then the lightweight convolutional neural network model is used for carrying out picture classification; the image classification method has the advantages that the parameter number and the calculated amount of the model are low, the inference speed on the GPU is higher, and the image classification efficiency can be improved by utilizing the model to classify the images.
Drawings
FIG. 1 is a visualization of some of the convolution kernels of a MobileNetV2 downsampling layer (kernel size 1 × 3 × 3) in an embodiment of the present invention;
FIG. 2 is a visualization of some of the convolution kernels of an EfficientNet-B0 non-downsampling layer (kernel size 1 × 5 × 5) in an embodiment of the present invention;
FIG. 3 is a visualization of partially similar feature maps output by adjacent layers 5 and 7 of RegNetX-400MF in an embodiment of the invention;
FIG. 4 is a schematic diagram of a downsampling layer constructed in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a generic layer constructed in one embodiment of the invention;
fig. 6 is a flowchart of an image classification method based on a lightweight convolutional neural network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The design criteria for the lightweight convolutional neural network architecture are derived by visualizing feature maps and convolution kernels. It should be noted that the visualized network may be any depthwise separable convolutional neural network with trained parameters; the feature maps and convolution kernels are normalized before visualization, and if the central value of a convolution kernel is negative, the whole kernel is multiplied by -1.
Criterion 1: as shown in fig. 1, most convolution kernels in a downsampling layer (a depthwise convolution layer with stride 2) approximate a Gaussian blur kernel, and blurring before downsampling conforms to the sampling theorem; therefore, in the network constructed by this method, Gaussian blur kernels replace the convolution kernels of the downsampling layer. It should be noted that fig. 1 consists of 3 × 3 convolution kernels (each smallest square represents one weight, 9 weights per kernel), where darker values are larger. Many of these 3 × 3 kernels resemble Gaussian kernels of different variances, so Gaussian convolution kernels can be used in their place.
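A Gaussian blur kernel of the kind proposed as a replacement can be generated directly. The patent does not specify the variance, so the `sigma` value below is an illustrative assumption, as is the helper function itself.

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    """Normalized 2-D Gaussian kernel; sigma is an assumed value."""
    ax = np.arange(size, dtype=float) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

k = gaussian_kernel(3, 1.0)
# Like the visualized downsampling kernels in Fig. 1, the kernel is
# symmetric, sums to 1, and peaks at the centre.
```

Because such a kernel is fixed rather than learned, substituting it for the downsampling kernels removes those weights from the trainable parameter count entirely.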
Criterion 2: as shown in fig. 2, most convolution kernels in the depthwise convolution layers other than the downsampling layers approximate an identity kernel (a kernel whose only non-zero value is at the center), so identity kernels replace part of the depthwise convolution kernels in the constructed network, which is equivalent to directly removing those kernels. To preserve accuracy and computational efficiency, no nonlinear activation layer or batch normalization layer is used after the depthwise convolution layer. It should be noted that fig. 2 likewise contains a number of 3 × 3 convolution kernels; many resemble the 3 × 3 identity kernel (1 in the center, 0 elsewhere), and such kernels can be removed outright, reducing the amount of computation.
Criterion 3: as shown in fig. 3, a large proportion of the feature maps output by adjacent layers are similar, repeated feature maps; therefore some feature maps are reused across adjacent layers via identity mapping and do not participate in the next depthwise convolution operation, which keeps the network width unchanged while halving the number of parameters and the amount of computation.
The invention provides an image classification method based on a lightweight convolutional neural network, which comprises the following steps:
S1: constructing a lightweight convolutional neural network model, wherein the lightweight neural network model comprises, connected in sequence: a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, and each sampling concatenation unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers and a splicing layer;
S2: inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, which comprises the following steps:
S21: expanding the channels of the picture to be classified to a specified number of channels using the standard convolution layer, so as to obtain an original feature map;
S22: downsampling the original feature map with the downsampling layer of the first sampling concatenation unit to obtain two groups of first feature maps; performing feature extraction on the two groups of first feature maps with the plurality of generic layers to obtain corresponding second feature maps; splicing the two groups of second feature maps with the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling concatenation unit, which downsamples it, extracts features and splices them to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling concatenation unit, and so on, until the last sampling concatenation unit outputs a final target feature map;
S23: inputting the final target feature map output by the last sampling concatenation unit into the global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
In one embodiment, the downsampling layer comprises, connected in sequence: a Gaussian downsampling layer, a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer, and it outputs two groups of output feature maps:
the input feature map passes through the Gaussian downsampling layer alone to give one group of output feature maps;
the input feature map passes in sequence through the Gaussian downsampling layer, the pointwise convolution layer, the nonlinear activation layer and the batch normalization layer to give the other group of output feature maps.
As shown in fig. 4, the downsampling layer provided by the present application consists of 4 layers: a Gaussian downsampling layer (a depthwise convolution layer with stride 2 whose kernels are replaced by Gaussian convolution kernels), a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer. The Gaussian downsampling layer convolves the feature map output by the previous layer, and the resolution of its output feature map is half that of its input. The pointwise convolution layer expands or contracts the channels of the input feature map according to the numbers of input and output feature channels. The nonlinear activation layer and the batch normalization layer apply nonlinear activation and normalization, respectively, to the output of the previous layer. Finally, the output feature map of the Gaussian downsampling layer and the output feature map of the batch normalization layer together form the output of the unit. ReLU is used as the activation function in the nonlinear activation layer.
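The four-layer structure just described admits a minimal PyTorch sketch. This is a reading of Fig. 4, not the patented configuration: the class name follows the DownBlock label used later, while the 3 × 3 kernel size, sigma = 1.0 and the channel counts in the usage note are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DownBlock(nn.Module):
    """Sketch of the downsampling layer: a fixed (non-learned) Gaussian
    depthwise convolution with stride 2, followed on one branch by a
    pointwise convolution, ReLU and batch normalization. Both branch
    outputs are returned, as the unit outputs two groups of maps."""

    def __init__(self, in_ch, out_ch, sigma=1.0):
        super().__init__()
        # Build one 3x3 Gaussian kernel and replicate it per channel.
        ax = torch.arange(3, dtype=torch.float32) - 1.0
        xx, yy = torch.meshgrid(ax, ax, indexing="ij")
        g = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
        g = (g / g.sum()).expand(in_ch, 1, 3, 3).clone()
        self.register_buffer("gauss", g)  # fixed weights, not trained
        self.pw = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        # Gaussian downsampling: depthwise conv, stride 2 halves resolution.
        down = F.conv2d(x, self.gauss, stride=2, padding=1,
                        groups=self.gauss.shape[0])
        # Second branch: pointwise conv -> ReLU -> batch normalization.
        branch = self.bn(F.relu(self.pw(down)))
        return down, branch
```

For an 8-channel 32 × 32 input and out_ch = 16, the two outputs would be 8 × 16 × 16 and 16 × 16 × 16 feature maps.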
In one embodiment, the Gaussian downsampling layer in the downsampling layer performs a convolution operation on the feature map output by the previous layer, and the resolution of its output feature map is half that of the input feature map.
In one embodiment, the pointwise convolution layer in the downsampling layer expands or contracts the channels of the input feature map according to the numbers of input and output feature channels.
In one embodiment, ReLU is used as the activation function in the nonlinear activation layer of the downsampling layer.
In one embodiment, the generic layer comprises, connected in sequence: a depthwise convolution layer, a splicing layer, a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer;
each generic layer receives a group-a set of feature maps and a group-b set of feature maps; the group-b feature maps are input into the depthwise convolution layer and convolved to obtain group-c feature maps; the group-a and group-c feature maps are passed through the splicing layer, the pointwise convolution layer, the nonlinear activation layer and the batch normalization layer of the generic layer to obtain a new group-b set of feature maps;
the input group-b feature maps serve as the new group-a feature maps and, together with the new group-b feature maps, form the input of the next adjacent generic layer.
As shown in fig. 5, the generic layer consists of 5 layers in total: a depthwise convolution layer, a splicing layer, a pointwise convolution layer, a nonlinear activation layer and a batch normalization layer. According to criterion 2 in step one, half of the convolution kernels in the depthwise convolution layer are removed, i.e. the depthwise convolution layer convolves only the 2nd group of input feature maps. The splicing layer concatenates the 1st group of feature maps input to the unit with the feature maps output by the depthwise convolution layer into 1 group. According to criterion 3 in step one, to reuse feature maps and reduce the number of parameters, the output of the splicing layer is processed by the pointwise convolution layer, which halves the number of channels, and then by the nonlinear activation layer and the batch normalization layer. Finally, the unit outputs the input 2nd group of feature maps as its 1st output group and the result of the pointwise convolution (after activation and normalization) as its 2nd output group.
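Read from Fig. 5, the generic layer admits a compact PyTorch sketch. The assumption that both input groups have the same channel count, and the 3 × 3 depthwise kernel size, are illustrative; the class name follows the HalfConvBlock label used later.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HalfConvBlock(nn.Module):
    """Sketch of the generic layer: only group b passes through the
    depthwise convolution (criterion 2: no activation or normalization
    directly after it); group a is reused by identity mapping
    (criterion 3), roughly halving parameters and computation."""

    def __init__(self, ch):
        super().__init__()
        self.dw = nn.Conv2d(ch, ch, 3, padding=1, groups=ch, bias=False)
        # Pointwise convolution halves the 2*ch concatenated channels.
        self.pw = nn.Conv2d(2 * ch, ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(ch)

    def forward(self, a, b):
        c = self.dw(b)  # group c: depthwise conv of group b only
        new_b = self.bn(F.relu(self.pw(torch.cat([a, c], dim=1))))
        # The input group b becomes the next layer's group a.
        return b, new_b
```

Because the depthwise convolution sees only half of the incoming maps and the identity branch carries no weights, stacking these blocks keeps the network width while spending far fewer parameters per layer.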
In the present invention, a lightweight neural network is constructed using the downsampling layer (DownBlock) and the generic layer (HalfConvBlock), as shown in fig. 6:
1. expand the channels of the input picture to a specified number of channels using standard convolution;
2. downsample and expand the number of channels using a DownBlock, outputting 2 groups of feature maps;
3. repeatedly apply HalfConvBlock for feature extraction, with 2 groups of feature maps as input and output;
4. splice the 2 groups of feature maps into 1 group;
5. repeat operations 2, 3 and 4 three more times;
6. output the final classification result using the fully connected layer.
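The construction steps above can be assembled end to end. The sketch below shows only the data flow of Fig. 6: the unit internals are deliberately simplified stand-ins (plain strided and pointwise convolutions rather than the Gaussian DownBlock and HalfConvBlock of the patent), and the widths, depths and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SampleConcatUnit(nn.Module):
    """Stand-in for one sampling concatenation unit:
    downsample -> two groups -> generic layers -> splice."""

    def __init__(self, in_ch, w, n_generic=2):
        super().__init__()
        self.down = nn.Conv2d(in_ch, w, 3, stride=2, padding=1, bias=False)
        self.pw = nn.Conv2d(w, w, 1, bias=False)
        self.generic = nn.ModuleList(
            nn.Conv2d(2 * w, w, 1, bias=False) for _ in range(n_generic))

    def forward(self, x):
        a = self.down(x)            # group 1: downsampled maps
        b = F.relu(self.pw(a))      # group 2: pointwise branch
        for g in self.generic:      # generic layers pass 2 groups along
            a, b = b, F.relu(g(torch.cat([a, b], dim=1)))
        return torch.cat([a, b], dim=1)   # splicing layer: 2*w channels

class TinyNet(nn.Module):
    """Stem conv -> repeated units -> global pooling -> classifier."""

    def __init__(self, num_classes=10, widths=(16, 32, 64)):
        super().__init__()
        self.stem = nn.Conv2d(3, widths[0], 3, padding=1, bias=False)
        units, in_ch = [], widths[0]
        for w in widths:
            units.append(SampleConcatUnit(in_ch, w))
            in_ch = 2 * w
        self.units = nn.Sequential(*units)
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        x = self.units(self.stem(x))
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)  # global pooling
        return self.fc(x)
```

Each unit halves the spatial resolution and doubles the working width via the final splice, so a 32 × 32 input reaches the classifier as a 4 × 4 map after three units.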
It should be noted that the final model may set the number of layers at each stage according to specific requirements (such as the number of parameters or the amount of computation); for example, the configuration of a network with 1.5M parameters is shown in Table 1 (the splicing layer after each generic layer is omitted from the table).
TABLE 1 (network configuration; reproduced as an image in the original publication)
When the parameter count is limited to 1.5M or 2.5M and accuracy comparable to other models is reached, the model of this patent leads by a large margin in both parameter count and inference speed, as shown in Table 2 (accuracy on ImageNet with 1.5M parameters; GPU inference speed measured on an RTX 6000) and Table 3. The embodiments with 1.5M and 2.5M parameters are provided only to illustrate the technical solution of the present invention; those skilled in the art will understand that modifications or equivalent substitutions, in particular changes only to the number of network layers and channels, do not depart from the spirit and scope of the technical solution.
TABLE 2 (reproduced as an image in the original publication)
TABLE 3 (reproduced as an image in the original publication)
According to another aspect of the present invention, there is provided an image classification apparatus based on a lightweight convolutional neural network, including:
the building module is used for building a lightweight convolutional neural network model, wherein the lightweight neural network model comprises, connected in sequence: a standard convolution layer, a plurality of sampling concatenation units, a global pooling layer and a fully connected layer, and each sampling concatenation unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers and a splicing layer;
the classification module is used for inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, and specifically for:
expanding the channels of the picture to be classified to a specified number of channels using the standard convolution layer, so as to obtain an original feature map;
downsampling the original feature map with the downsampling layer of the first sampling concatenation unit to obtain two groups of first feature maps; performing feature extraction on the two groups of first feature maps with the plurality of generic layers to obtain corresponding second feature maps; splicing the two groups of second feature maps with the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling concatenation unit, which downsamples it, extracts features and splices them to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling concatenation unit, and so on, until the last sampling concatenation unit outputs a final target feature map;
inputting the final target feature map output by the last sampling concatenation unit into the global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. An image classification method based on a lightweight convolutional neural network, comprising:
S1: constructing a lightweight convolutional neural network model, wherein the lightweight convolutional neural network model comprises, connected in sequence: a standard convolutional layer, a plurality of sampling-splicing units, and a fully connected layer; each sampling-splicing unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers, and a splicing layer;
S2: inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, which specifically comprises:
S21: expanding the channels of the picture to be classified to a specified number of channels by using the standard convolutional layer, so as to obtain an original feature map;
S22: downsampling the original feature map by using the downsampling layer in the first sampling-splicing unit to obtain two groups of first feature maps; then performing feature extraction on the two groups of first feature maps respectively by using the plurality of generic layers to obtain corresponding second feature maps; and splicing the two groups of second feature maps by using the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling-splicing unit, where it is downsampled, feature-extracted, and spliced to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling-splicing unit; and so on, until the last sampling-splicing unit outputs a final target feature map;
S23: inputting the final target feature map output by the last sampling-splicing unit into a global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
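Step S21's channel expansion with the standard convolutional layer can be sketched as follows. This is a hedged NumPy illustration; the 3×3 kernel size, the 3→16 channel expansion, and the function name `expand_channels` are assumptions, since the claim does not fix these values:

```python
import numpy as np

def expand_channels(img, kernels):
    """Standard 3x3 convolution (stride 1, zero padding) used to expand
    an input picture's channels to a specified number (step S21).

    img: (C_in, H, W); kernels: (C_out, C_in, 3, 3). Shapes are illustrative.
    """
    c_out = kernels.shape[0]
    _, H, W = img.shape
    xp = np.pad(img, ((0, 0), (1, 1), (1, 1)))      # zero-pad H and W by 1
    out = np.zeros((c_out, H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + 3, j:j + 3]          # (C_in, 3, 3)
            out[:, i, j] = (kernels * patch).sum(axis=(1, 2, 3))
    return out

rng = np.random.default_rng(1)
img = rng.standard_normal((3, 8, 8))    # RGB picture to be classified
k = rng.standard_normal((16, 3, 3, 3))  # expand 3 channels -> 16 channels
fm = expand_channels(img, k)            # original feature map
```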
2. The method for image classification based on a lightweight convolutional neural network as claimed in claim 1,
the downsampling layer comprises, connected in sequence: a Gaussian downsampling layer, a pointwise convolutional layer, a nonlinear activation layer, and a batch normalization layer; and the downsampling layer outputs two groups of output feature maps:
one group of output feature maps is obtained by passing the input feature map through the Gaussian downsampling layer;
and the other group of output feature maps is obtained by passing the input feature map sequentially through the Gaussian downsampling layer, the pointwise convolutional layer, the nonlinear activation layer, and the batch normalization layer of the downsampling layer.
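The two output groups of the downsampling layer described in claim 2 can be sketched as follows. A minimal NumPy illustration; the 3×3 Gaussian kernel, the per-feature-map normalization used as a stand-in for batch normalization, and all tensor shapes are illustrative assumptions:

```python
import numpy as np

def gaussian_downsample(x):
    """3x3 Gaussian-kernel convolution with stride 2: halves H and W.

    x: (C, H, W) with even H, W. The exact kernel is not specified in the
    patent; a standard 3x3 Gaussian approximation is assumed here.
    """
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros((C, H // 2, W // 2))
    for i in range(H // 2):
        for j in range(W // 2):
            out[:, i, j] = (xp[:, 2*i:2*i + 3, 2*j:2*j + 3] * k).sum(axis=(1, 2))
    return out

def downsampling_layer(x, pw):
    """Returns two output groups: the Gaussian-downsampled map itself, and
    that map passed through pointwise conv -> ReLU -> normalization.

    pw: (C_out, C_in) pointwise (1x1) convolution weights.
    """
    g = gaussian_downsample(x)                  # first output group
    y = np.tensordot(pw, g, axes=([1], [0]))    # pointwise convolution
    y = np.maximum(y, 0.0)                      # ReLU activation
    mu = y.mean(axis=(1, 2), keepdims=True)
    sd = y.std(axis=(1, 2), keepdims=True) + 1e-5
    return g, (y - mu) / sd                     # second output group

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 8, 8))
pw = rng.standard_normal((6, 4))
g1, g2 = downsampling_layer(x, pw)
```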
3. The method for image classification based on a lightweight convolutional neural network as claimed in claim 2,
the Gaussian downsampling layer in the downsampling layer performs a convolution operation on the feature map output by the previous layer, and the resolution of the output feature map is half that of the input feature map.
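Since each Gaussian downsampling halves the resolution as claim 3 states, the resolution of the final target feature map follows directly from the input size and the number of sampling-splicing units. A small sketch (the 224×224 input size and the count of 5 units are illustrative assumptions, not from the patent):

```python
def final_resolution(h, w, num_units):
    # Each sampling-splicing unit's downsampling layer halves H and W once
    for _ in range(num_units):
        h, w = h // 2, w // 2
    return h, w

# e.g. a 224x224 input passed through 5 sampling-splicing units
print(final_resolution(224, 224, 5))  # -> (7, 7)
```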
4. The method for image classification based on a lightweight convolutional neural network according to claim 2,
the pointwise convolutional layer in the downsampling layer expands or contracts the channels of the input feature map according to the numbers of input and output feature channels.
5. The method for image classification based on a lightweight convolutional neural network as claimed in claim 2,
ReLU is used as the activation function in the nonlinear activation layer of the downsampling layer.
6. The method for image classification based on a lightweight convolutional neural network as claimed in claim 1,
the generic layer comprises, connected in sequence: a depthwise convolutional layer, a splicing layer, a pointwise convolutional layer, a nonlinear activation layer, and a batch normalization layer;
each generic layer takes a group-a set of feature maps and a group-b set of feature maps as input; the group-b feature maps are input into the depthwise convolutional layer and convolved to obtain group-c feature maps; the group-a and group-c feature maps are then input into the splicing layer, the pointwise convolutional layer, the nonlinear activation layer, and the batch normalization layer of the generic layer to obtain new group-b feature maps;
and the new group-b feature maps serve as the input of the next adjacent generic layer, acting as the group-a feature maps and the input group-b feature maps of the next round.
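The group-a/b/c data flow of the generic layer in claim 6 can be sketched as follows. A minimal NumPy illustration under stated assumptions: 3×3 depthwise kernels, per-map normalization as a stand-in for batch normalization, and illustrative shapes throughout:

```python
import numpy as np

def depthwise_conv3x3(x, k):
    """Per-channel 3x3 convolution (stride 1, zero padding).

    x: (C, H, W); k: (C, 3, 3), one kernel per channel.
    """
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[:, i, j] = (xp[:, i:i + 3, j:j + 3] * k).sum(axis=(1, 2))
    return out

def generic_layer(a, b, dw_k, pw_w):
    """Group b -> depthwise conv -> group c; splice groups a and c, then
    pointwise conv -> ReLU -> normalization -> new group b.

    pw_w: (C_b, C_a + C_b) pointwise convolution weights (illustrative).
    """
    c = depthwise_conv3x3(b, dw_k)                    # group-c feature maps
    spliced = np.concatenate([a, c], axis=0)          # splicing layer
    y = np.tensordot(pw_w, spliced, axes=([1], [0]))  # pointwise convolution
    y = np.maximum(y, 0.0)                            # ReLU activation
    mu = y.mean(axis=(1, 2), keepdims=True)
    sd = y.std(axis=(1, 2), keepdims=True) + 1e-5
    return (y - mu) / sd                              # new group-b maps

rng = np.random.default_rng(3)
a = rng.standard_normal((4, 8, 8))
b = rng.standard_normal((4, 8, 8))
new_b = generic_layer(a, b, rng.standard_normal((4, 3, 3)),
                      rng.standard_normal((4, 8)))
```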
7. An image classification device based on a lightweight convolutional neural network, configured to perform the method of any one of claims 1-6, the image classification device comprising:
the building module is used for building a lightweight convolutional neural network model, wherein the lightweight convolutional neural network model comprises, connected in sequence: a standard convolutional layer, a plurality of sampling-splicing units, and a fully connected layer; each sampling-splicing unit comprises, connected in sequence: a downsampling layer, a plurality of generic layers, and a splicing layer;
the classification module is used for inputting the picture to be classified into the lightweight convolutional neural network model to obtain a classification result, which specifically comprises:
expanding the channels of the picture to be classified to a specified number of channels by using the standard convolutional layer, so as to obtain an original feature map;
downsampling the original feature map by using the downsampling layer in the first sampling-splicing unit to obtain two groups of first feature maps; then performing feature extraction on the two groups of first feature maps respectively by using the plurality of generic layers to obtain corresponding second feature maps; and splicing the two groups of second feature maps by using the splicing layer to obtain a first target feature map; inputting the first target feature map into the adjacent second sampling-splicing unit, where it is downsampled, feature-extracted, and spliced to obtain a second target feature map; inputting the second target feature map into the adjacent third sampling-splicing unit; and so on, until the last sampling-splicing unit outputs a final target feature map;
and inputting the final target feature map output by the last sampling-splicing unit into a global pooling layer to reduce its dimensionality, and then into the fully connected layer, so that the fully connected layer outputs the classification result corresponding to the picture to be classified.
CN202210189921.1A 2022-02-28 2022-02-28 Image classification method and device based on lightweight convolutional neural network Pending CN114565792A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210189921.1A CN114565792A (en) 2022-02-28 2022-02-28 Image classification method and device based on lightweight convolutional neural network

Publications (1)

Publication Number Publication Date
CN114565792A true CN114565792A (en) 2022-05-31

Family

ID=81715751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210189921.1A Pending CN114565792A (en) 2022-02-28 2022-02-28 Image classification method and device based on lightweight convolutional neural network

Country Status (1)

Country Link
CN (1) CN114565792A (en)

Similar Documents

Publication Publication Date Title
CN109964250B (en) Method and system for analyzing images in convolutional neural networks
CN110084274B (en) Real-time image semantic segmentation method and system, readable storage medium and terminal
CN111209910A (en) Systems, methods, and non-transitory computer-readable media for semantic segmentation
CN112990219B (en) Method and device for image semantic segmentation
CN110598788A (en) Target detection method and device, electronic equipment and storage medium
CN115358932B (en) Multi-scale feature fusion face super-resolution reconstruction method and system
CN112183295A (en) Pedestrian re-identification method and device, computer equipment and storage medium
CN112598110B (en) Neural network construction method, device, equipment and medium
CN112419152A (en) Image super-resolution method and device, terminal equipment and storage medium
CN111709415B (en) Target detection method, device, computer equipment and storage medium
CN112419191A (en) Image motion blur removing method based on convolution neural network
CN111882053B (en) Neural network model compression method based on splicing convolution
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN115082928A (en) Method for asymmetric double-branch real-time semantic segmentation of network for complex scene
Hua et al. Dynamic scene deblurring with continuous cross-layer attention transmission
CN111414823B (en) Human body characteristic point detection method and device, electronic equipment and storage medium
CN114494006A (en) Training method and device for image reconstruction model, electronic equipment and storage medium
CN110490876B (en) Image segmentation method based on lightweight neural network
CN111882028A (en) Convolution operation device for convolution neural network
CN111967478A (en) Feature map reconstruction method and system based on weight inversion, storage medium and terminal
CN116029905A (en) Face super-resolution reconstruction method and system based on progressive difference complementation
CN114565792A (en) Image classification method and device based on lightweight convolutional neural network
CN115578561A (en) Real-time semantic segmentation method and device based on multi-scale context aggregation network
CN113688783B (en) Face feature extraction method, low-resolution face recognition method and equipment
CN112529064B (en) Efficient real-time semantic segmentation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination