CN113065653A - Design method of lightweight convolutional neural network for mobile terminal image classification - Google Patents

Design method of lightweight convolutional neural network for mobile terminal image classification

Info

Publication number
CN113065653A
Authority
CN
China
Prior art keywords
network
channels
mainnet
module
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110462584.4A
Other languages
Chinese (zh)
Inventor
袁海英 (Yuan Haiying)
成君鹏 (Cheng Junpeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202110462584.4A
Publication of CN113065653A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention provides a method for designing a lightweight convolutional neural network for mobile terminal image classification, comprising the following steps: design a lightweight convolutional neural network in which a MainNet and an AuxiliaryNet process input features at different resolutions; train and test the network model; apply structured pruning to the trained network, determining the clipping thresholds in MainNet and AuxiliaryNet with a cumulative-ratio method and a k-means method, respectively; reconstruct the lightweight network according to the clipping results, adjusting the channel count of each layer to balance classification accuracy against model complexity; and reconstruct, retrain and re-prune the model multiple times to obtain a final model with excellent performance. The network model finally obtained by this method has high classification accuracy, its parameter count and computational cost are far smaller than those of other mainstream networks, and it effectively eases the pressure of deploying convolutional neural networks on mobile terminals.

Description

Design method of lightweight convolutional neural network for mobile terminal image classification
Technical Field
The invention relates to the technical field of deep learning and image processing, in particular to a design method of a lightweight convolutional neural network for mobile terminal image classification.
Background
Deep learning shows huge potential in the computer vision field (image processing, target detection, video analysis and the like), and with the rapid growth of demand for embedded systems and mobile terminal devices in industry, security, traffic, the Internet and other fields, convolutional neural network models for real-time image classification tasks face new technical challenges. Most convolutional neural network models must run on a PC or a server because they involve a huge amount of computation and parameters, and such large-scale models cannot be deployed on resource-limited mobile terminal devices. Therefore, for the problem of real-time image classification in mobile terminal applications, research on general and efficient lightweight convolutional neural network structure design and model compression has broad application prospects and important engineering value.
Disclosure of Invention
The purpose of the invention is as follows: when a mobile terminal device executes a real-time image classification task, it is limited by hardware resources and application scenarios, and a large-scale convolutional neural network model is often difficult to deploy. The invention improves the existing lightweight convolutional neural network models in a targeted manner, greatly reducing the computation and storage cost of the model while preserving its performance, so that the model is easy to deploy on mobile terminal devices. The computational cost of a convolutional neural network depends largely on the input image size: since it scales with the product of the input height and width, halving both dimensions reduces the computation to a quarter, i.e. the model saves 75% of the computation when the input size is halved. The invention therefore designs a lightweight convolutional neural network for mobile terminal image classification consisting mainly of a MainNet part and an AuxiliaryNet part, where the AuxiliaryNet extracts feature information from the input image, the MainNet extracts feature information from the downsampled input image, and the output features of the two parts are concatenated. Different pruning strategies are adopted for MainNet and AuxiliaryNet, reducing the computation and parameter count of the model while preserving its classification performance; and the network structure is reconstructed according to the pruning results, balancing classification performance against computational complexity and further reducing the model's computation and parameters.
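As a quick check of that 75% figure, a minimal Python sketch (the layer sizes here are arbitrary): the multiply-accumulate count of a convolution layer scales with the product of the feature map height and width, so halving the input side length quarters the computation.

    def conv_flops(h, w, c_in, c_out, k=3):
        # Multiply-accumulate count of a stride-1, 'same'-padded convolution layer.
        return h * w * c_in * c_out * k * k

    full = conv_flops(224, 224, 32, 32)
    half = conv_flops(112, 112, 32, 32)
    print(1 - half / full)  # 0.75, i.e. a 75% saving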
The technical scheme is as follows: to achieve this purpose, the invention provides a method for designing a lightweight convolutional neural network for mobile terminal image classification, comprising the following steps:
Step 1: design a lightweight convolutional neural network with depthwise separable convolution as its main structure. The designed network consists of 3 to 4 modules: 3 modules are used when the resolution of the original input image is below 224 × 224, and 4 modules when it is 224 × 224 or above. Each module consists of a MainNet part and an AuxiliaryNet part. MainNet is the main body of the lightweight network, and the channel count of its output feature map equals that of its input feature map. AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling; its output channel count is controlled by a coefficient α, whose default initial value is 1. The outputs of the two networks are concatenated by branch fusion and passed to the next module, with the output channel count controlled by a coefficient β: if MainNet and AuxiliaryNet output a and b channels respectively, the channel count after branch fusion is c = β(a + b), where β has a default initial value of 1.
Step 2: train the network of step 1 on the training set for 160 epochs; after training, test it on the test set and record the model's accuracy and computational cost.
Step 3: perform channel pruning on the trained network. Channel pruning is a form of structured pruning that simplifies the model by removing unimportant channels. During pruning, the clipping threshold is determined from the γ factors of the BN layers, which are added to the loss function as a regularization term.
The clipping thresholds in the MainNet and AuxiliaryNet networks are determined with a cumulative-ratio method and a k-means method, respectively. For MainNet, the threshold is determined from the ratio of a running sum to the total: let the f factors, sorted from small to large, be γ1, γ2, ..., γk, ..., γf, with total sum Zsum, and accumulate them in sequence, denoted
Zk = γ1 + γ2 + ... + γk.
During accumulation, when Zk/Zsum first reaches a preset ratio, the index k is recorded and the threshold is th = (γk-1 + γk)/2. For AuxiliaryNet, k-means clustering with two clusters is adopted: the γ factors in the neighborhood of 0 form one class, the remaining factors form the other, and the minimum value of the latter class is taken as the threshold.
Step 4: reconstruct the model according to the clipping results of step 3. Clipping rates p and q are defined to reflect the clipping of each convolution layer in MainNet and AuxiliaryNet respectively, where n is the number of bottleneck repetitions in MainNet, X is the number of channels to be clipped in a layer, and Y is the total number of channels in that layer (for MainNet, Y is the post-expansion channel count):
p = (X1/Y1 + X2/Y2 + ... + Xn/Yn) / n
q = X/Y
(The formula for p is given as an equation image in the original publication; it is rendered here, from the definitions above, as the mean clipping ratio over the n repeated bottleneck layers.)
A layer with a high clipping rate is less important, and its channel count is reduced; a layer with a low clipping rate is more important, and its channel count is increased. Model reconstruction is realized by reassigning the channel-control coefficients α and β as α′ and β′ according to the clipping rates p and q. α controls the channel count of AuxiliaryNet and is adjusted in each module according to that module's AuxiliaryNet clipping rate q. β controls the number of output channels after branch fusion: β in the first module is adjusted according to the MainNet clipping rate p of the second module, β in the second module according to p of the third module, and so on, while β in the last module is left unchanged. α and β are reassigned as α′ and β′:
(The reassignment formulas for α′ and β′ are given as equation images in the original publication; a high clipping rate decreases the corresponding coefficient and a low clipping rate increases it.)
Step 5: train the reconstructed network model on the training set for 160 epochs; after training, prune it according to the method of step 3, test the pruned model on the test set, and record its accuracy and computational cost.
Step 6: taking the model accuracy and computational cost obtained in step 2 as the reference, judge whether the computational cost of the model obtained in step 5 has decreased while the accuracy is maintained (an accuracy drop of less than 1%). If the accuracy is maintained and the cost has decreased, repeat steps 4 and 5; if the accuracy drops by more than 1% or the cost no longer decreases, output the current model as the final model.
Optionally, the MainNet in step 1 is specifically as follows: MainNet is the main body of the designed lightweight network and processes the feature map information after downsampling; if the resolution of the input image is K × K, the resolution of the downsampled image is (K/2) × (K/2). The basic unit of MainNet is the bottleneck, repeated n times within a module. The bottleneck comprises the following operations: (1) a pointwise convolution expands the channels: with Cin input channels, the expanded channel count is Cin · t, where t is the expansion coefficient controlling the degree of expansion; (2) a depthwise convolution processes the data, extracting precise image feature information in the higher spatial dimension; (3) a pointwise convolution reduces the channel count back to the input dimension, giving an output channel count Cout = Cin; that is, the channel count of each MainNet bottleneck is unchanged before the dimension raising and after the reduction. Furthermore, the bottleneck adopts a residual structure, linearly adding the input to the output.
Further, the expansion coefficient t decreases as the resolution of the feature maps fed into successive modules decreases: when there are 3 modules, the value of t in each module is 6, 4 and 2, respectively; when there are 4 modules, the value of t is 6, 4, 2 and 1, respectively.
Optionally, when there are 3 modules, the value of n in each module is 3, 4 and 1, respectively; when there are 4 modules, the value of n is 2, 3, 2 and 1, respectively.
Optionally, the AuxiliaryNet in step 1 is specifically as follows: AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling. First, AuxiliaryNet controls the number of channels of the input image through a pointwise convolution; unlike in MainNet, the pointwise convolution here reduces the number of input channels, with the degree of reduction controlled by a coefficient α whose default initial value is 1. When the input feature map has Cin channels, the feature map after the pointwise convolution has Cin · α channels. Second, AuxiliaryNet performs a 3 × 3 depthwise convolution with stride 2, ensuring that the output feature map size is consistent with that of MainNet. Finally, a pointwise convolution fuses the channel information, and the output feature map has Cout = Cin · α channels.
The advantages and beneficial effects of the technical scheme adopted by the invention are as follows:
The invention provides a method for designing a lightweight convolutional neural network for mobile terminal image classification. Through the lightweight design of the network structure, the network scale is greatly reduced without sacrificing classification performance, effectively cutting storage cost and computation. A targeted pruning strategy is adopted, the lightweight network is reconstructed according to the clipping results, and the channel count of each layer is adjusted to balance classification accuracy against model complexity; the model is reconstructed, retrained and re-pruned multiple times to obtain a final model with excellent performance. The resulting network model has good classification performance, its parameter count and computation are far smaller than those of mainstream networks, and it can meet the deployment requirements of mobile terminal devices with limited hardware resources.
Drawings
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a schematic diagram of the structure of branch fusion;
FIG. 3 is a schematic structural diagram of a designed MainNet;
FIG. 4 is a schematic structural diagram of a designed AuxiliaryNet;
Detailed Description
This embodiment relates to a method for designing a lightweight convolutional neural network for mobile terminal image classification; the specific flow is shown in fig. 1 and comprises the following 6 steps:
Step 1: design a lightweight convolutional neural network with depthwise separable convolution as its main structure. The designed network consists of 3 to 4 modules: 3 modules are used when the resolution of the original input image is below 224 × 224, and 4 modules when it is 224 × 224 or above. Each module consists of a MainNet part and an AuxiliaryNet part. MainNet is the main body of the lightweight network, and the channel count of its output feature map equals that of its input feature map. AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling; its output channel count is controlled by a coefficient α, whose default initial value is 1. The outputs of the two networks are concatenated by branch fusion (see fig. 2) and passed to the next module, with the output channel count controlled by a coefficient β: if MainNet and AuxiliaryNet output a and b channels respectively, the channel count after branch fusion is c = β(a + b), where β has a default initial value of 1.
The MainNet in step 1 is shown in fig. 3 and is specifically as follows: MainNet is the main body of the designed lightweight network and processes the feature map information after downsampling; if the resolution of the input image is K × K, the resolution of the downsampled image is (K/2) × (K/2). The basic unit of MainNet is the bottleneck, repeated n times within a module. The bottleneck comprises the following operations: (1) a pointwise convolution expands the channels: with Cin input channels, the expanded channel count is Cin · t, where t is the expansion coefficient controlling the degree of expansion; (2) a depthwise convolution processes the data, extracting precise image feature information in the higher spatial dimension; (3) a pointwise convolution reduces the channel count back to the input dimension, giving an output channel count Cout = Cin; that is, the channel count of each MainNet bottleneck is unchanged before the dimension raising and after the reduction. Furthermore, the bottleneck adopts a residual structure, linearly adding the input to the output.
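For concreteness, the bottleneck just described could be sketched in PyTorch as below; the patent gives no source code, so the BN/ReLU placement and bias settings are assumptions following common practice:

    import torch
    import torch.nn as nn

    class Bottleneck(nn.Module):
        """Sketch of a MainNet bottleneck: pointwise expansion by t,
        3x3 depthwise convolution, pointwise projection back to C_in,
        with a residual connection."""
        def __init__(self, channels: int, t: int):
            super().__init__()
            hidden = channels * t  # expanded channel count C_in * t
            self.block = nn.Sequential(
                nn.Conv2d(channels, hidden, 1, bias=False),  # pointwise expand
                nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
                nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden,
                          bias=False),                        # depthwise 3x3
                nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
                nn.Conv2d(hidden, channels, 1, bias=False),  # pointwise project: C_out = C_in
                nn.BatchNorm2d(channels),
            )

        def forward(self, x):
            return x + self.block(x)  # residual: input linearly added to output

    y = Bottleneck(16, t=6)(torch.randn(1, 16, 32, 32))  # shape preserved: (1, 16, 32, 32)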
The expansion coefficient t decreases as the resolution of the feature maps fed into successive modules decreases: when there are 3 modules, the value of t in each module is 6, 4 and 2, respectively; when there are 4 modules, the value of t is 6, 4, 2 and 1, respectively.
When there are 3 modules, the value of n in each module is 3, 4 and 1, respectively; when there are 4 modules, the value of n is 2, 3, 2 and 1, respectively.
Referring to fig. 4, the AuxiliaryNet in step 1 is specifically as follows: AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling. First, AuxiliaryNet controls the number of channels of the input image through a pointwise convolution; unlike in MainNet, the pointwise convolution here reduces the number of input channels, with the degree of reduction controlled by a coefficient α whose default initial value is 1. When the input feature map has Cin channels, the feature map after the pointwise convolution has Cin · α channels. Second, AuxiliaryNet performs a 3 × 3 depthwise convolution with stride 2, ensuring that the output feature map size is consistent with that of MainNet. Finally, a pointwise convolution fuses the channel information, and the output feature map has Cout = Cin · α channels.
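Likewise, a minimal sketch of an AuxiliaryNet block and the branch fusion, under the same assumptions (the stand-in tensor below replaces a real MainNet branch):

    import torch
    import torch.nn as nn

    class AuxiliaryBlock(nn.Module):
        """Sketch of an AuxiliaryNet block: pointwise reduction controlled
        by alpha, stride-2 3x3 depthwise convolution (matching MainNet's
        downsampled feature size), then pointwise channel fusion."""
        def __init__(self, in_ch: int, alpha: float = 1.0):
            super().__init__()
            mid = max(1, round(in_ch * alpha))  # C_in * alpha channels
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, mid, 1, bias=False),  # pointwise: reduce channels
                nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                nn.Conv2d(mid, mid, 3, stride=2, padding=1, groups=mid,
                          bias=False),                  # depthwise, stride 2
                nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                nn.Conv2d(mid, mid, 1, bias=False),    # pointwise: fuse channel information
                nn.BatchNorm2d(mid),
            )

        def forward(self, x):
            return self.block(x)

    x = torch.randn(1, 32, 64, 64)
    aux_out = AuxiliaryBlock(32)(x)            # b = 32 channels at 32x32
    main_out = torch.randn(1, 32, 32, 32)      # stand-in for the MainNet output (a = 32)
    fused = torch.cat([main_out, aux_out], 1)  # a + b channels; a following pointwise
                                               # layer would set c = beta * (a + b)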
In this example, the input image size is 32 × 32 × 3 and 3 modules are used; the network structure is shown in Table 1.
Table 1 lightweight convolutional neural network architecture designed by the present invention
(The layer-by-layer structure of Table 1 is reproduced as images in the original publication.)
Step 2: train the network of step 1 on the training set for 160 epochs. During training, the γ factors of the BN layers are added to the loss function with an L1 regularization constraint, so that the network weights W and the factors γ are trained jointly. The loss function is:
L = Σ(x,y) l(f(x, W), y) + λ · Σγ∈Γ g(γ)
the first item of the Loss function is a Loss function of the network, and a cross entropy function is adopted, wherein x is data input by training, y is a label, and W is network weight. The second term is the L1 regular constraint term of the BN layer gamma factor. Wherein gamma is a scaling factor of the BN layer, each layer of channel has corresponding gamma, and the gamma value of the channel with lower importance is smaller; lambda is a hyper-parameter with a value of 0.0001 for balancing the first and second terms; a constraint function g () is applied to the γ factor of the BN layer, g (γ) ═ γ |.
After training is finished, testing is performed on the test set, and the accuracy and computational cost of the model are recorded.
Step 3: perform channel pruning on the trained network. Channel pruning is a form of structured pruning that simplifies the model by removing unimportant channels. During pruning, the clipping threshold is determined from the γ factors of the BN layers, which were constrained through the loss function above.
The clipping thresholds in the MainNet and AuxiliaryNet networks are determined with a cumulative-ratio method and a k-means method, respectively. For MainNet, the threshold is determined from the ratio of a running sum to the total: let the f factors, sorted from small to large, be γ1, γ2, ..., γk, ..., γf, with total sum Zsum, and accumulate them in sequence, denoted
Zk = γ1 + γ2 + ... + γk.
During accumulation, when Zk/Zsum first reaches a preset ratio, the index k is recorded and the threshold is th = (γk-1 + γk)/2. For AuxiliaryNet, k-means clustering with two clusters is adopted: the γ factors in the neighborhood of 0 form one class, the remaining factors form the other, and the minimum value of the latter class is taken as the threshold.
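The two threshold rules might be sketched as follows (NumPy and scikit-learn; since the preset cumulative ratio appears only as an equation image in the original, it is left as a parameter here):

    import numpy as np
    from sklearn.cluster import KMeans

    def mainnet_threshold(gammas: np.ndarray, ratio: float) -> float:
        # Sort gammas ascending, find the first k where the running sum
        # reaches `ratio` of the total, and return th = (g[k-1] + g[k]) / 2.
        g = np.sort(gammas)
        k = int(np.searchsorted(np.cumsum(g) / g.sum(), ratio))
        return float((g[k - 1] + g[k]) / 2) if k > 0 else float(g[0]) / 2

    def auxiliarynet_threshold(gammas: np.ndarray) -> float:
        # k-means with 2 clusters: a near-zero class and the rest;
        # the minimum of the larger-centre class is the threshold.
        km = KMeans(n_clusters=2, n_init=10).fit(gammas.reshape(-1, 1))
        big = int(np.argmax(km.cluster_centers_.ravel()))
        return float(gammas[km.labels_ == big].min())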
Step 4: reconstruct the model according to the clipping results of step 3. Clipping rates p and q are defined to reflect the clipping of each convolution layer in MainNet and AuxiliaryNet respectively, where n is the number of bottleneck repetitions in MainNet, X is the number of channels to be clipped in a layer, and Y is the total number of channels in that layer (for MainNet, Y is the post-expansion channel count):
p = (X1/Y1 + X2/Y2 + ... + Xn/Yn) / n
q = X/Y
A layer with a high clipping rate is less important, and its channel count is reduced; a layer with a low clipping rate is more important, and its channel count is increased. Model reconstruction is realized by reassigning the channel-control coefficients α and β as α′ and β′ according to the clipping rates p and q. α controls the channel count of AuxiliaryNet and is adjusted in each module according to that module's AuxiliaryNet clipping rate q. β controls the number of output channels after branch fusion: β in the first module is adjusted according to the MainNet clipping rate p of the second module, β in the second module according to p of the third module, and so on, while β in the last module is left unchanged. α and β are reassigned as α′ and β′:
(The reassignment formulas for α′ and β′ are given as equation images in the original publication; a high clipping rate decreases the corresponding coefficient and a low clipping rate increases it.)
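Given binary keep/clip masks produced by the pruning step, p and q could be computed as below; treating p as the mean per-layer clipping ratio over the n repetitions is an assumption, since the exact formula appears as an equation image:

    import torch

    def clipping_rates(mainnet_masks, aux_mask):
        # mainnet_masks: one 0/1 tensor per repeated bottleneck layer (0 = clipped);
        # aux_mask: 0/1 tensor for an AuxiliaryNet layer.
        p = sum((m == 0).sum() / m.numel() for m in mainnet_masks) / len(mainnet_masks)
        q = (aux_mask == 0).sum() / aux_mask.numel()
        return float(p), float(q)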
Step 5: train the reconstructed network model on the training set for 160 epochs; after training, prune it according to the method of step 3, test the pruned model on the test set, and record its accuracy and computational cost.
Step 6: taking the model accuracy and computational cost obtained in step 2 as the reference, judge whether the computational cost of the model obtained in step 5 has decreased while the accuracy is maintained (an accuracy drop of less than 1%). If the accuracy is maintained and the cost has decreased, repeat steps 4 and 5; if the accuracy drops by more than 1% or the cost no longer decreases, output the current model as the final model.
To verify the effectiveness of the model, classical lightweight convolutional neural networks such as SqueezeNet, MobileNet V1/V2 and ShuffleNet V1/V2 were selected for comparison. All experiments were completed in the same environment on the CIFAR100 dataset under the PyTorch deep learning framework. CIFAR100 contains 100 classes of 600 images each (500 training images and 100 test images) at a size of 32 × 32 × 3. Stochastic gradient descent was used as the training algorithm, with momentum set to 0.9 and weight decay set to 0.0001. The initial learning rate was 0.2 and was reduced by cosine annealing; the number of training epochs was 160, the batch size was 128, and the loss function was cross entropy. The experimental results are shown in Table 2:
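This training recipe corresponds to a standard PyTorch setup, sketched below with a placeholder model and dummy data in place of the designed network and the CIFAR100 loaders:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))  # placeholder model
    train_loader = [(torch.randn(128, 3, 32, 32),
                     torch.randint(0, 100, (128,)))]                  # dummy CIFAR100-shaped batch

    optimizer = torch.optim.SGD(model.parameters(), lr=0.2,
                                momentum=0.9, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=160)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(160):              # 160 training epochs
        for x, y in train_loader:         # batch size 128
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()                  # cosine learning-rate decay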
TABLE 2 comparison of the inventive model with other models (Params and Flops do not include fully connected layers)
(Table 2 is reproduced as an image in the original publication.)
The experimental results show that the image classification accuracy of the designed model is superior to that of the other lightweight networks, while the parameter count and computational complexity are greatly reduced: the final network model has only 0.22M parameters and 13.15M FLOPs, and achieves a classification accuracy of 70.82% on the CIFAR100 dataset.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit its scope; any changes made by applying the principles of the present invention without inventive effort shall fall within the scope of the present invention.

Claims (5)

1. A method for designing a lightweight convolutional neural network for mobile terminal image classification, characterized by comprising the following steps:
Step 1: design a lightweight convolutional neural network with depthwise separable convolution as its main structure. The designed network consists of 3 to 4 modules: 3 modules are used when the resolution of the original input image is below 224 × 224, and 4 modules when it is 224 × 224 or above. Each module consists of a MainNet part and an AuxiliaryNet part. MainNet is the main body of the lightweight network, and the channel count of its output feature map equals that of its input feature map. AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling; its output channel count is controlled by a coefficient α, whose default initial value is 1. The outputs of the two networks are concatenated by branch fusion and passed to the next module, with the output channel count controlled by a coefficient β: if MainNet and AuxiliaryNet output a and b channels respectively, the channel count after branch fusion is c = β(a + b), where β has a default initial value of 1;
Step 2: train the network of step 1 on the training set for 160 epochs; after training, test it on the test set and record the model's accuracy and computational cost;
Step 3: perform channel pruning on the trained network. Channel pruning is a form of structured pruning that simplifies the model by removing unimportant channels. During pruning, the clipping threshold is determined from the γ factors of the BN layers, which are added to the loss function as a regularization term;
the clipping thresholds in the MainNet and AuxiliaryNet networks are determined with a cumulative-ratio method and a k-means method, respectively. For MainNet, the threshold is determined from the ratio of a running sum to the total: let the f factors, sorted from small to large, be γ1, γ2, ..., γk, ..., γf, with total sum Zsum, and accumulate them in sequence, denoted
Zk = γ1 + γ2 + ... + γk.
During accumulation, when Zk/Zsum first reaches a preset ratio, the index k is recorded and the threshold is th = (γk-1 + γk)/2. For AuxiliaryNet, k-means clustering with two clusters is adopted: the γ factors in the neighborhood of 0 form one class, the remaining factors form the other, and the minimum value of the latter class is taken as the threshold;
Step 4: reconstruct the model according to the clipping results of step 3. Clipping rates p and q are defined to reflect the clipping of each convolution layer in MainNet and AuxiliaryNet respectively, where n is the number of bottleneck repetitions in MainNet, X is the number of channels to be clipped in a layer, and Y is the total number of channels in that layer (for MainNet, Y is the post-expansion channel count):
p = (X1/Y1 + X2/Y2 + ... + Xn/Yn) / n
q = X/Y
a layer with a high clipping rate is less important, and its channel count is reduced; a layer with a low clipping rate is more important, and its channel count is increased. Model reconstruction is realized by reassigning the channel-control coefficients α and β as α′ and β′ according to the clipping rates p and q. α controls the channel count of AuxiliaryNet and is adjusted in each module according to that module's AuxiliaryNet clipping rate q. β controls the number of output channels after branch fusion: β in the first module is adjusted according to the MainNet clipping rate p of the second module, β in the second module according to p of the third module, and so on, while β in the last module is left unchanged. α and β are reassigned as α′ and β′:
(The reassignment formulas for α′ and β′ are given as equation images in the original publication; a high clipping rate decreases the corresponding coefficient and a low clipping rate increases it.)
Step 5: train the reconstructed network model on the training set for 160 epochs; after training, prune it according to the method of step 3, test the pruned model on the test set, and record its accuracy and computational cost;
Step 6: taking the model accuracy and computational cost obtained in step 2 as the reference, judge whether the computational cost of the model obtained in step 5 has decreased while the accuracy is maintained (an accuracy drop of less than 1%); if the accuracy is maintained and the cost has decreased, repeat steps 4 and 5; if the accuracy drops by more than 1% or the cost no longer decreases, output the current model as the final model.
2. The method for designing a lightweight convolutional neural network for mobile terminal image classification according to claim 1, wherein the MainNet in step 1 is specifically as follows:
the MainNet is a main body of the designed lightweight network and is used for processing the characteristic diagram information after down sampling; if the resolution of the input image is K x K, the resolution of the down-sampled image is (K/2) x (K/2); the basic module of the MainNet is bottleeck, and the repetition frequency of the bottleeck in the module is n; the bottleeck includes the following operations: (1) performing pointwise convolution to expand channels, wherein the number of input channels is CinThe number of channels after expansion is CinT, where t is the expansion coefficient that controls the degree of expansion; (2) performing depthwise convolution operation to realize data processing, and extracting accurate image characteristic information in a higher spatial dimension; (3) reducing the channel number to input dimension by adopting pointwise convolution operation, and outputting the channel number Cout=CinThat is, the number of channels of each bottleeck of the MainNet before and after the liter dimension is kept unchanged; furthermore, bottleeck adopts a residual structure and linearly adds the input and output results.
3. The method for designing a lightweight convolutional neural network for mobile terminal image classification according to claim 2, wherein the expansion coefficient t decreases as the resolution of the feature maps fed into successive modules decreases; when there are 3 modules, the value of t in each module is 6, 4 and 2, respectively; when there are 4 modules, the value of t is 6, 4, 2 and 1, respectively.
4. The method for designing a lightweight convolutional neural network for mobile terminal image classification according to claim 1, wherein, for the number of repetitions n, when there are 3 modules, the value of n in each module is 3, 4 and 1, respectively; when there are 4 modules, the value of n is 2, 3, 2 and 1, respectively.
5. The method for designing a lightweight convolutional neural network for mobile terminal image classification according to claim 1, wherein the AuxiliaryNet in step 1 is specifically as follows:
AuxiliaryNet is a supplementary network to MainNet that acquires the feature map information before downsampling. First, AuxiliaryNet controls the number of channels of the input image through a pointwise convolution; unlike in MainNet, the pointwise convolution here reduces the number of input channels, with the degree of reduction controlled by a coefficient α whose default initial value is 1. When the input feature map has Cin channels, the feature map after the pointwise convolution has Cin · α channels. Second, AuxiliaryNet performs a 3 × 3 depthwise convolution with stride 2, ensuring that the output feature map size is consistent with that of MainNet. Finally, a pointwise convolution fuses the channel information, and the output feature map has Cout = Cin · α channels.
CN202110462584.4A 2021-04-27 2021-04-27 Design method of lightweight convolutional neural network for mobile terminal image classification Pending CN113065653A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110462584.4A CN113065653A (en) 2021-04-27 2021-04-27 Design method of lightweight convolutional neural network for mobile terminal image classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110462584.4A CN113065653A (en) 2021-04-27 2021-04-27 Design method of lightweight convolutional neural network for mobile terminal image classification

Publications (1)

Publication Number Publication Date
CN113065653A true CN113065653A (en) 2021-07-02

Family

ID=76568097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110462584.4A Pending CN113065653A (en) 2021-04-27 2021-04-27 Design method of lightweight convolutional neural network for mobile terminal image classification

Country Status (1)

Country Link
CN (1) CN113065653A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554084A (en) * 2021-07-16 2021-10-26 华侨大学 Vehicle re-identification model compression method and system based on pruning and light-weight convolution
CN113743591A (en) * 2021-09-14 2021-12-03 北京邮电大学 Method and system for automatically pruning convolutional neural network
CN114677545A (en) * 2022-03-29 2022-06-28 电子科技大学 Lightweight image classification method based on similarity pruning and efficient module

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2018102037A4 (en) * 2018-12-09 2019-01-17 Ge, Jiahao Mr A method of recognition of vehicle type based on deep learning
CN111882040A (en) * 2020-07-30 2020-11-03 中原工学院 Convolutional neural network compression method based on channel number search
CN111967305A (en) * 2020-07-01 2020-11-20 华南理工大学 Real-time multi-scale target detection method based on lightweight convolutional neural network
CN112418396A (en) * 2020-11-20 2021-02-26 北京工业大学 Sparse activation perception type neural network accelerator based on FPGA
CN112528830A (en) * 2020-12-07 2021-03-19 南京航空航天大学 Lightweight CNN mask face pose classification method combined with transfer learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2018102037A4 (en) * 2018-12-09 2019-01-17 Ge, Jiahao Mr A method of recognition of vehicle type based on deep learning
CN111967305A (en) * 2020-07-01 2020-11-20 华南理工大学 Real-time multi-scale target detection method based on lightweight convolutional neural network
CN111882040A (en) * 2020-07-30 2020-11-03 中原工学院 Convolutional neural network compression method based on channel number search
CN112418396A (en) * 2020-11-20 2021-02-26 北京工业大学 Sparse activation perception type neural network accelerator based on FPGA
CN112528830A (en) * 2020-12-07 2021-03-19 南京航空航天大学 Lightweight CNN mask face pose classification method combined with transfer learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
白士磊; 殷柯欣; 朱建启: "Traffic sign detection algorithm based on lightweight YOLOv3" (轻量级YOLOv3的交通标志检测算法), Computer and Modernization (计算机与现代化), no. 09, 15 September 2020 (2020-09-15) *
邵伟平; 王兴; 曹昭睿; 白帆: "Design of lightweight convolutional neural network based on MobileNet and YOLOv3" (基于MobileNet与YOLOv3的轻量化卷积神经网络设计), Journal of Computer Applications (计算机应用), no. 1, 10 July 2020 (2020-07-10) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554084A (en) * 2021-07-16 2021-10-26 华侨大学 Vehicle re-identification model compression method and system based on pruning and light-weight convolution
CN113554084B (en) * 2021-07-16 2024-03-01 华侨大学 Vehicle re-identification model compression method and system based on pruning and light convolution
CN113743591A (en) * 2021-09-14 2021-12-03 北京邮电大学 Method and system for automatically pruning convolutional neural network
CN113743591B (en) * 2021-09-14 2023-12-26 北京邮电大学 Automatic pruning convolutional neural network method and system
CN114677545A (en) * 2022-03-29 2022-06-28 电子科技大学 Lightweight image classification method based on similarity pruning and efficient module
CN114677545B (en) * 2022-03-29 2023-05-23 电子科技大学 Lightweight image classification method based on similarity pruning and efficient module

Similar Documents

Publication Publication Date Title
CN113065653A (en) Design method of lightweight convolutional neural network for mobile terminal image classification
CN108846445B (en) Image processing method
CN109165660B (en) Significant object detection method based on convolutional neural network
CN110084221B (en) Serialized human face key point detection method with relay supervision based on deep learning
CN109740731B (en) Design method of self-adaptive convolution layer hardware accelerator
CN107506822B (en) Deep neural network method based on space fusion pooling
WO2018068421A1 (en) Method and device for optimizing neural network
CN108847223B (en) Voice recognition method based on deep residual error neural network
CN110969250A (en) Neural network training method and device
CN112215755B (en) Image super-resolution reconstruction method based on back projection attention network
JP2020119518A (en) Method and device for transforming cnn layers to optimize cnn parameter quantization to be used for mobile devices or compact networks with high precision via hardware optimization
CN112489164A (en) Image coloring method based on improved depth separable convolutional neural network
CN112001294A (en) YOLACT + + based vehicle body surface damage detection and mask generation method and storage device
CN113674172A (en) Image processing method, system, device and storage medium
Huai et al. Zerobn: Learning compact neural networks for latency-critical edge systems
CN113689517A (en) Image texture synthesis method and system of multi-scale channel attention network
Porziani et al. Automatic shape optimisation of structural parts driven by BGM and RBF mesh morphing
CN112164077A (en) Cell example segmentation method based on bottom-up path enhancement
Cusulin et al. A numerical method for spatial diffusion in age‐structured populations
CN108960326B (en) Point cloud fast segmentation method and system based on deep learning framework
CN111461988A (en) Seismic velocity model super-resolution technology based on multi-task learning
He et al. A lightweight multi-scale feature integration network for real-time single image super-resolution
Mokhtar et al. Pedestrian wind factor estimation in complex urban environments
CN110097116A (en) A kind of virtual sample generation method based on independent component analysis and Density Estimator
CN115482434A (en) Small sample high-quality generation method based on multi-scale generation countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination