CN113392899B - Image classification method based on binary image classification network - Google Patents


Info

Publication number
CN113392899B
Authority
CN
China
Prior art keywords
output image
size
convolution kernel
image
binary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110650074.XA
Other languages
Chinese (zh)
Other versions
CN113392899A (en)
Inventor
刘启和
王钰涵
周世杰
张准
董婉祾
但毅
严张豹
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202110650074.XA
Publication of CN113392899A
Application granted
Publication of CN113392899B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Facsimile Image Signal Circuits (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image classification method based on a binarized image classification network, which comprises the following steps: S1: collecting an original image and initializing it; S2: building an image classification network from the initialized original image; S3: carrying out image classification using the softmax classifier of the image classification network. The method binarizes the convolution kernels of the convolution modules, which account for the largest share of computation in conventional image classification, and uses four binarized convolution kernels of the same specification for a linear approximation, reducing the storage cost of the algorithm.

Description

Image classification method based on binary image classification network
Technical Field
The invention belongs to the technical field of image classification, and particularly relates to an image classification method based on a binarization image classification network.
Background
In recent years, deep neural networks (DNNs) have revolutionized machine learning and pattern recognition. However, most existing DNN models are computationally expensive and memory intensive, which hinders their deployment on devices with limited memory or in applications with strict latency requirements.
Taking image classification as an example, classical network structures such as LeNet, AlexNet, ResNet and VggNet have been proposed in succession. These structures mainly target the server side: both training and inference assume ample computing power and place high demands on GPU capacity and storage space. This makes deployment on mobile terminals difficult, for example on hardware-constrained devices such as unmanned aerial vehicles and smart cars.
Disclosure of Invention
The invention aims to solve the problem of image classification and provides an image classification method based on a binarization image classification network.
The technical scheme of the invention is as follows: an image classification method based on a binarization image classification network comprises the following steps:
s1: collecting an original image, and initializing the original image;
s2: building an image classification network according to the initialized original image;
S3: carrying out image classification by using the softmax classifier of the image classification network.
Further, step S1 includes the following sub-steps:
S11: collecting an original image with the size of 224 × 224 × 3, and padding each side of the original image with 0 elements of width 3 to obtain a first output image with the size of 230 × 230 × 3;
S12: performing a convolution operation on the first output image with a convolution kernel of size 7 × 7 and stride 1 to obtain a second output image of size 224 × 224 × 64, and performing batch normalization on the second output image to obtain a third output image of size 224 × 224 × 64;
S13: activating the third output image with the nonlinear activation function H(x), and max-pooling the activated third output image to obtain a fourth output image of size 112 × 112 × 64;
S14: binarizing the fourth output image with the sign function S(x′) to obtain a fifth output image of size 112 × 112 × 64, completing the initialization of the original image.
Further, in step S14, the expression of the sign function S(x′) is:
[formula given as an image in the original: a sign function shifted by the learnable parameter α; not reproduced here]
where x denotes the input image of the sign function and α denotes the first parameter to be learned.
Further, step S2 includes the following sub-steps:
S21: padding each side of the fifth output image with 0 elements of width 1 to obtain a sixth output image of size 114 × 114 × 64;
S22: performing a convolution operation on the sixth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a seventh output image of size 112 × 112 × 128;
S23: activating the seventh output image with the nonlinear activation function H(x), and max-pooling the activated seventh output image to obtain an eighth output image of size 56 × 56 × 128;
S23: padding each side of the eighth output image with 0 elements of width 1 to obtain a ninth output image of size 58 × 58 × 128;
S24: performing a convolution operation on the ninth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a tenth output image of size 56 × 56 × 256;
S25: activating the tenth output image with the nonlinear activation function H(x), and max-pooling the activated tenth output image to obtain an eleventh output image of size 28 × 28 × 256;
S26: padding each side of the eleventh output image with 0 elements of width 1 to obtain a twelfth output image of size 30 × 30 × 256;
S27: performing a convolution operation on the twelfth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a thirteenth output image of size 28 × 28 × 512;
S28: activating the thirteenth output image with the nonlinear activation function H(x), and max-pooling the activated thirteenth output image to obtain a fourteenth output image of size 14 × 14 × 512;
S29: padding each side of the fourteenth output image with 0 elements of width 1 to obtain a fifteenth output image of size 16 × 16 × 512;
S210: performing a convolution operation on the fifteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a sixteenth output image of size 14 × 14 × 512;
S211: activating the sixteenth output image with the nonlinear activation function H(x), and max-pooling the activated sixteenth output image to obtain a seventeenth output image of size 7 × 7 × 512;
S212: padding each side of the seventeenth output image with 0 elements of width 1 to obtain an eighteenth output image of size 9 × 9 × 512;
S213: performing a convolution operation on the eighteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a nineteenth output image of size 7 × 7 × 512;
S214: activating the nineteenth output image with the nonlinear activation function H(x), and stretching the activated nineteenth output image into a twentieth output image of size 1 × 25088;
S215: inputting the twentieth output image into two fully connected layers of 4096 neurons each, completing the construction of the image classification network.
Further, in step S13, step S23, step S25, step S28, step S211, and step S214, the pooling window used for max pooling has size 2 × 2 and stride 2 × 2.
Further, in step S13, step S23, step S25, step S28, and step S211, the expression of the nonlinear activation function H(x) is:
[formula given as an image in the original: a shifted nonlinear activation with learnable parameters β, γ and τ; not reproduced here]
wherein x denotes the input image of the nonlinear activation function, β denotes the second parameter to be learned, γ the third parameter to be learned, and τ the fourth parameter to be learned.
Further, the binarization of the convolution kernel of size 3 × 3 in step S2 includes the following sub-steps:
A21: using a first binarized convolution kernel B_i1, a second binarized convolution kernel B_i2, a third binarized convolution kernel B_i3 and a fourth binarized convolution kernel B_i4, each of size 3 × 3 × C_in × C_out, to linearly approximate the convolution kernel of size 3 × 3, wherein C_in denotes the number of input channels and C_out denotes the number of output channels;
A22: normalizing each element of the linearly approximated convolution kernel matrix to obtain the normalized convolution kernel;
A23: setting the activation thresholds b_i1, b_i2, b_i3 and b_i4 corresponding to the first binarized convolution kernel B_i1, the second binarized convolution kernel B_i2, the third binarized convolution kernel B_i3 and the fourth binarized convolution kernel B_i4:
b_i1 = 0.2493, b_i2 = 0.4987, b_i3 = 0.7480, b_i4 = 0.9973.
A24: for each of B_i1, B_i2, B_i3 and B_i4, setting every element of the normalized convolution kernel matrix smaller than the corresponding activation threshold b_i1, b_i2, b_i3 or b_i4 to 0 and every element greater than it to 1, the four binarized convolution kernels being randomly initialized; this completes the binarization of the convolution kernel of size 3 × 3.
Further, in step A21, the calculation formula for the linear approximation is:
W_i ≈ α_i1·B_i1 + α_i2·B_i2 + α_i3·B_i3 + α_i4·B_i4
wherein W_i denotes the linearly approximated convolution kernel, and α_i1, α_i2, α_i3 and α_i4 denote the weights of the first, second, third and fourth binarized convolution kernels B_i1, B_i2, B_i3 and B_i4;
in step A22, each element a_ij of the convolution kernel matrix after linear approximation is normalized as:
a′_ij = (a_ij − min) / (max − min)
wherein a′_ij denotes the normalized element, min denotes the minimum of the elements a_ij, and max denotes their maximum.
The invention has the following beneficial effects:
(1) The image classification method binarizes the convolution kernels of the convolution modules that account for the largest share of computation in conventional image classification, using four binarized convolution kernels of the same specification for a linear approximation, which reduces the storage cost of the algorithm.
(2) The image classification method processes the relevant outputs with a shifted nonlinear activation function, which strengthens the representation capability of the extracted features under the binarization constraint, and with a shifted sign function, so that both sides of the convolution operation, the convolution kernel elements and the input elements, are binarized. Addition and subtraction, or even binary logic operations, then replace conventional floating-point multiplication, which greatly increases operation speed and reduces the algorithm's dependence on hardware.
Drawings
FIG. 1 is a flow chart of an image classification method;
FIG. 2 is a diagram of a network structure for initializing an image after binarization;
FIG. 3 is a diagram of a binarized feature extraction network architecture;
FIG. 4 is a diagram of the nonlinear activation function H(x).
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides an image classification method based on a binarization image classification network, which comprises the following steps:
s1: collecting an original image, and initializing the original image;
s2: building an image classification network according to the initialized original image;
S3: carrying out image classification by using the softmax classifier of the image classification network.
In the embodiment of the present invention, as shown in fig. 2, step S1 includes the following sub-steps:
S11: collecting an original image with the size of 224 × 224 × 3, and padding each side of the original image with 0 elements of width 3 to obtain a first output image with the size of 230 × 230 × 3;
S12: performing a convolution operation on the first output image with a convolution kernel of size 7 × 7 and stride 1 to obtain a second output image of size 224 × 224 × 64, and performing batch normalization on the second output image to obtain a third output image of size 224 × 224 × 64;
S13: activating the third output image with the nonlinear activation function H(x), and max-pooling the activated third output image to obtain a fourth output image of size 112 × 112 × 64;
S14: binarizing the fourth output image with the sign function S(x′) to obtain a fifth output image of size 112 × 112 × 64, completing the initialization of the original image.
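The sizes in steps S11 to S14 follow from the standard padding/convolution/pooling arithmetic. A minimal check in plain Python (illustrative, not part of the patent):

```python
def pad(size, width):
    # Zero-padding of the given width on each side grows each spatial dimension by 2 * width.
    return size + 2 * width

def conv(size, kernel, stride=1):
    # Output size of a valid convolution: floor((size - kernel) / stride) + 1.
    return (size - kernel) // stride + 1

def max_pool(size, window=2, stride=2):
    # 2 x 2 max pooling with stride 2 halves each spatial dimension.
    return (size - window) // stride + 1

s = pad(224, 3)    # S11: 224 -> 230
s = conv(s, 7)     # S12: 230 -> 224 (7 x 7 kernel, stride 1)
s = max_pool(s)    # S13: 224 -> 112
print(s)           # S14 binarization keeps the size: the fifth output image is 112 x 112 x 64
```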
In the embodiment of the present invention, in step S14, the expression of the sign function S(x′) is:
[formula given as an image in the original: a sign function shifted by the learnable parameter α; not reproduced here]
where x denotes the input image of the sign function, and α denotes the first parameter to be learned.
The gradient with respect to this parameter is computed as:
[formula given as an image in the original; not reproduced here]
in the embodiment of the present invention, as shown in fig. 3, step S2 includes the following sub-steps:
S21: padding each side of the fifth output image with 0 elements of width 1 to obtain a sixth output image of size 114 × 114 × 64;
S22: performing a convolution operation on the sixth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a seventh output image of size 112 × 112 × 128;
S23: activating the seventh output image with the nonlinear activation function H(x), and max-pooling the activated seventh output image to obtain an eighth output image of size 56 × 56 × 128;
S23: padding each side of the eighth output image with 0 elements of width 1 to obtain a ninth output image of size 58 × 58 × 128;
S24: performing a convolution operation on the ninth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a tenth output image of size 56 × 56 × 256;
S25: activating the tenth output image with the nonlinear activation function H(x), and max-pooling the activated tenth output image to obtain an eleventh output image of size 28 × 28 × 256;
S26: padding each side of the eleventh output image with 0 elements of width 1 to obtain a twelfth output image of size 30 × 30 × 256;
S27: performing a convolution operation on the twelfth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a thirteenth output image of size 28 × 28 × 512;
S28: activating the thirteenth output image with the nonlinear activation function H(x), and max-pooling the activated thirteenth output image to obtain a fourteenth output image of size 14 × 14 × 512;
S29: padding each side of the fourteenth output image with 0 elements of width 1 to obtain a fifteenth output image of size 16 × 16 × 512;
S210: performing a convolution operation on the fifteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a sixteenth output image of size 14 × 14 × 512;
S211: activating the sixteenth output image with the nonlinear activation function H(x), and max-pooling the activated sixteenth output image to obtain a seventeenth output image of size 7 × 7 × 512;
S212: padding each side of the seventeenth output image with 0 elements of width 1 to obtain an eighteenth output image of size 9 × 9 × 512;
S213: performing a convolution operation on the eighteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a nineteenth output image of size 7 × 7 × 512;
S214: activating the nineteenth output image with the nonlinear activation function H(x), and stretching the activated nineteenth output image into a twentieth output image of size 1 × 25088;
S215: inputting the twentieth output image into two fully connected layers of 4096 neurons each, completing the construction of the image classification network.
In the embodiment of the present invention, in step S13, step S23, step S25, step S28, step S211, and step S214, the pooling window used for max pooling has size 2 × 2 and stride 2 × 2.
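Chaining steps S21 to S215 reproduces the sizes listed above, ending at the 1 × 25088 vector of step S214. A quick walkthrough in plain Python (illustrative; the stage list and names are not from the patent text):

```python
def pad1(size):
    return size + 2          # zero-padding of width 1 on each side

def conv3(size):
    return size - 3 + 1      # 3 x 3 binarized convolution, stride 1

def max_pool(size):
    return size // 2         # 2 x 2 max pooling, stride 2

# (output channels, whether the stage ends with max pooling),
# matching steps S22/S24/S27/S210/S213 of the description.
stages = [(128, True), (256, True), (512, True), (512, True), (512, False)]

size, channels = 112, 64     # fifth output image: 112 x 112 x 64
for out_channels, pooled in stages:
    size = conv3(pad1(size)) # pad width 1, then 3 x 3 convolution keeps the spatial size
    channels = out_channels
    if pooled:
        size = max_pool(size)

flattened = size * size * channels   # stretch of step S214
print(size, channels, flattened)     # 7 512 25088
```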
In the embodiment of the present invention, as shown in fig. 4, in step S13, step S23, step S25, step S28, and step S211, the expression of the nonlinear activation function H(x) is:
[formula given as an image in the original: a shifted nonlinear activation with learnable parameters β, γ and τ; not reproduced here]
wherein x represents an input image of the nonlinear activation function, β represents a second parameter to be learned, γ represents a third parameter to be learned, and τ represents a fourth parameter to be learned.
The corresponding gradients are expressed with the indicator function I{·}, defined as I{condition} = 1 if the condition holds and 0 otherwise; the gradient formulas themselves are given as images in the original and are not reproduced here.
In the embodiment of the present invention, in step S2, the binarization of the convolution kernel of size 3 × 3 includes the following sub-steps:
A21: using a first binarized convolution kernel B_i1, a second binarized convolution kernel B_i2, a third binarized convolution kernel B_i3 and a fourth binarized convolution kernel B_i4, each of size 3 × 3 × C_in × C_out, to linearly approximate the convolution kernel of size 3 × 3, wherein C_in denotes the number of input channels and C_out denotes the number of output channels;
A22: normalizing each element of the linearly approximated convolution kernel matrix to obtain the normalized convolution kernel;
A23: setting the activation thresholds b_i1, b_i2, b_i3 and b_i4 corresponding to the first binarized convolution kernel B_i1, the second binarized convolution kernel B_i2, the third binarized convolution kernel B_i3 and the fourth binarized convolution kernel B_i4:
b_i1 = 0.2493, b_i2 = 0.4987, b_i3 = 0.7480, b_i4 = 0.9973.
A24: for each of B_i1, B_i2, B_i3 and B_i4, setting every element of the normalized convolution kernel matrix smaller than the corresponding activation threshold b_i1, b_i2, b_i3 or b_i4 to 0 and every element greater than it to 1, the four binarized convolution kernels being randomly initialized; this completes the binarization of the convolution kernel of size 3 × 3.
The forward-propagation output O of the approximated convolution kernel is:
O ≈ α_i1·(A ⊛ B_i1) + α_i2·(A ⊛ B_i2) + α_i3·(A ⊛ B_i3) + α_i4·(A ⊛ B_i4)
where A is the input to the convolution kernel and ⊛ denotes the convolution operation.
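The forward output relies on the linearity of convolution: convolving once with W_i ≈ Σ_j α_ij·B_ij gives the same result as the α-weighted sum of the four binary convolutions. A small 1-D check (illustrative Python; the α values, kernels and input are invented for the example):

```python
def conv1d(signal, kernel):
    # Valid 1-D convolution (correlation form) with stride 1.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

alphas = [0.6, 0.3, 0.2, 0.1]                       # weights alpha_i1 .. alpha_i4 (made up)
binary_kernels = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
# Mix the binary kernels into one real-valued kernel W_i.
w = [sum(a * bk[j] for a, bk in zip(alphas, binary_kernels)) for j in range(3)]

a = [0.5, -1.0, 2.0, 1.5, -0.5]                     # input A to the convolution
direct = conv1d(a, w)                               # convolve once with W_i
# Convolve with each binary kernel, then take the alpha-weighted sum.
summed = [sum(al * o for al, o in zip(alphas, outs))
          for outs in zip(*(conv1d(a, bk) for bk in binary_kernels))]

same = [round(x, 6) for x in direct] == [round(x, 6) for x in summed]
print(same)  # True: both routes give the same output up to float rounding
```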
The backward propagation through the approximated convolution kernel is:
[formula given as an image in the original; not reproduced here]
By the straight-through estimator (STE), the gradient of the binarization step is treated as the identity, so that:
[formula given as an image in the original; not reproduced here]
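The straight-through estimator referred to here passes the gradient through the non-differentiable binarization as if it were the identity, often clipped to a bounded input range. A minimal sketch of the idea (illustrative Python, not the patent's training code; the threshold and clip range are assumptions):

```python
def binarize_forward(x, threshold=0.5):
    # Forward pass: hard 0/1 quantization, whose true gradient is zero almost everywhere.
    return 1.0 if x > threshold else 0.0

def binarize_backward(x, grad_out):
    # STE backward pass: pretend the quantizer was the identity, so the incoming
    # gradient flows straight through to the latent real-valued weight,
    # clipped to suppress updates far outside the quantization range.
    return grad_out if abs(x) <= 1.0 else 0.0

w = 0.7                      # latent real-valued weight
grad_from_loss = 0.25        # gradient arriving at the binarized weight
print(binarize_forward(w))                   # -> 1.0
print(binarize_backward(w, grad_from_loss))  # -> 0.25 (passed through)
print(binarize_backward(1.5, grad_from_loss))# -> 0.0 (clipped)
```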
In the embodiment of the present invention, in step A21, the calculation formula for the linear approximation is:
W_i ≈ α_i1·B_i1 + α_i2·B_i2 + α_i3·B_i3 + α_i4·B_i4
wherein W_i denotes the linearly approximated convolution kernel, and α_i1, α_i2, α_i3 and α_i4 denote the weights of the first, second, third and fourth binarized convolution kernels B_i1, B_i2, B_i3 and B_i4;
in step A22, each element a_ij of the convolution kernel matrix after linear approximation is normalized as:
a′_ij = (a_ij − min) / (max − min)
wherein a′_ij denotes the normalized element, min denotes the minimum of the elements a_ij, and max denotes their maximum.
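Steps A22 to A24, min-max normalization followed by per-threshold binarization, can be sketched as follows. The 3 × 3 element values are invented for illustration; only the thresholds come from step A23:

```python
THRESHOLDS = [0.2493, 0.4987, 0.7480, 0.9973]   # b_i1 .. b_i4 from step A23

def normalize(kernel):
    # Step A22: min-max normalization a'_ij = (a_ij - min) / (max - min).
    flat = [v for row in kernel for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo) for v in row] for row in kernel]

def binarize(kernel):
    # Step A24: one 0/1 kernel per activation threshold.
    norm = normalize(kernel)
    return [[[1 if v > b else 0 for v in row] for row in norm] for b in THRESHOLDS]

# Illustrative 3 x 3 kernel (values invented for the example).
w = [[-0.6, 0.1, 0.4],
     [0.9, -0.2, 0.7],
     [0.3, 0.0, -0.8]]
b1, b2, b3, b4 = binarize(w)
print(b1)  # -> [[0, 1, 1], [1, 1, 1], [1, 1, 0]]
```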
The working principle and process of the invention are as follows:
The invention studies a binarization method for image-classification neural networks, adopting binarized convolution kernels in place of conventional ones. This not only reduces storage cost but also turns conventional floating-point multiplication into binary addition and subtraction in the convolution computation; with matching hardware design it can even be replaced by binary logic operations, improving computational efficiency and reducing the algorithm's dependence on hardware computing power.
(1) The image classification method binarizes the convolution kernels of the convolution modules that account for the largest share of computation in conventional image classification, using four binarized convolution kernels of the same specification for a linear approximation, which reduces the storage cost of the algorithm.
(2) The image classification method processes the relevant outputs with a shifted nonlinear activation function, which strengthens the representation capability of the extracted features under the binarization constraint, and with a shifted sign function, so that both sides of the convolution operation, the convolution kernel elements and the input elements, are binarized. Addition and subtraction, or even binary logic operations, then replace conventional floating-point multiplication, which greatly increases operation speed and reduces the algorithm's dependence on hardware.
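The claim in (2) that binary logic can stand in for floating-point multiplication is the standard XNOR/popcount trick for operands in {−1, +1}; the patent's 0/1 kernels admit a similar reduction. A sketch in plain Python, illustrative only:

```python
def dot_float(xs, ws):
    # Conventional multiply-accumulate over {-1, +1} values.
    return sum(x * w for x, w in zip(xs, ws))

def dot_xnor(x_bits, w_bits, n):
    # Encode +1 as bit 1 and -1 as bit 0. XNOR counts matching positions,
    # and the dot product is matches - mismatches = 2 * popcount(XNOR) - n.
    mask = (1 << n) - 1
    matches = bin((~(x_bits ^ w_bits)) & mask).count("1")
    return 2 * matches - n

xs = [1, -1, -1, 1, 1, -1]
ws = [1, 1, -1, -1, 1, 1]
x_bits = sum(1 << i for i, v in enumerate(xs) if v == 1)
w_bits = sum(1 << i for i, v in enumerate(ws) if v == 1)
print(dot_float(xs, ws), dot_xnor(x_bits, w_bits, len(xs)))  # both are 0
```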
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention, which is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (5)

1. An image classification method based on a binarization image classification network is characterized by comprising the following steps:
s1: collecting an original image, and initializing the original image;
s2: building an image classification network according to the initialized original image;
s3: utilizing a softmax classifier of an image classification network to classify images;
the step S1 includes the following sub-steps:
S11: collecting an original image with the size of 224 × 224 × 3, and padding each side of the original image with 0 elements of width 3 to obtain a first output image with the size of 230 × 230 × 3;
S12: performing a convolution operation on the first output image with a convolution kernel of size 7 × 7 and stride 1 to obtain a second output image of size 224 × 224 × 64, and performing batch normalization on the second output image to obtain a third output image of size 224 × 224 × 64;
S13: activating the third output image with the nonlinear activation function H(x), and max-pooling the activated third output image to obtain a fourth output image of size 112 × 112 × 64;
S14: binarizing the fourth output image with the sign function S(x′) to obtain a fifth output image of size 112 × 112 × 64, completing the initialization of the original image;
the step S2 includes the following sub-steps:
S21: padding each side of the fifth output image with 0 elements of width 1 to obtain a sixth output image of size 114 × 114 × 64;
S22: performing a convolution operation on the sixth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a seventh output image of size 112 × 112 × 128;
S23: activating the seventh output image with the nonlinear activation function H(x), and max-pooling the activated seventh output image to obtain an eighth output image of size 56 × 56 × 128;
S23: padding each side of the eighth output image with 0 elements of width 1 to obtain a ninth output image of size 58 × 58 × 128;
S24: performing a convolution operation on the ninth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a tenth output image of size 56 × 56 × 256;
S25: activating the tenth output image with the nonlinear activation function H(x), and max-pooling the activated tenth output image to obtain an eleventh output image of size 28 × 28 × 256;
S26: padding each side of the eleventh output image with 0 elements of width 1 to obtain a twelfth output image of size 30 × 30 × 256;
S27: performing a convolution operation on the twelfth output image with the binarized convolution kernel of size 3 × 3 and stride 1 to obtain a thirteenth output image of size 28 × 28 × 512;
S28: activating the thirteenth output image with the nonlinear activation function H(x), and max-pooling the activated thirteenth output image to obtain a fourteenth output image of size 14 × 14 × 512;
S29: padding each side of the fourteenth output image with 0 elements of width 1 to obtain a fifteenth output image of size 16 × 16 × 512;
S210: performing a convolution operation on the fifteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a sixteenth output image of size 14 × 14 × 512;
S211: activating the sixteenth output image with the nonlinear activation function H(x), and max-pooling the activated sixteenth output image to obtain a seventeenth output image of size 7 × 7 × 512;
S212: padding each side of the seventeenth output image with 0 elements of width 1 to obtain an eighteenth output image of size 9 × 9 × 512;
S213: performing a convolution operation on the eighteenth output image with a binarized convolution kernel of size 3 × 3 and stride 1 to obtain a nineteenth output image of size 7 × 7 × 512;
S214: activating the nineteenth output image with the nonlinear activation function H(x), and stretching the activated nineteenth output image into a twentieth output image of size 1 × 25088;
S215: inputting the twentieth output image into two fully connected layers of 4096 neurons each, completing the construction of the image classification network;
in step S2, the binarization of the convolution kernel with a size of 3 × 3 includes the following sub-steps:
a21: using a size of 3 x Cin*CoutFirst binarized convolution kernel B ofi1And a second binary convolution kernel Bi2And the third binary convolution kernel Bi3And a fourth binary convolution kernel Bi4Performing linear approximation on convolution kernels with the size of 3 x 3 respectively, wherein CinIndicates the number of input channels, CoutRepresenting the number of output channels;
a22: carrying out normalization processing on each element in the convolution kernel matrix after linear approximation to obtain a convolution kernel after normalization processing;
a23: setting a first binary convolution kernel Bi1A second binary convolution kernel Bi2And the third binary convolution kernel Bi3And a fourth binary convolution kernel Bi4Corresponding activation thresholds, respectively bi1、bi2、bi3And bi4
A24: in each of the first binarized convolution kernel Bi1, the second binarized convolution kernel Bi2, the third binarized convolution kernel Bi3 and the fourth binarized convolution kernel Bi4, setting every element of the normalized convolution kernel matrix that is less than the corresponding activation threshold bi1, bi2, bi3 or bi4 to 0 and every element greater than the threshold to 1, and randomly initializing the first binarized convolution kernel Bi1, the second binarized convolution kernel Bi2, the third binarized convolution kernel Bi3 and the fourth binarized convolution kernel Bi4, thereby completing the binarization of the convolution kernel with a size of 3 × 3.
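As an illustrative sketch (not part of the claims), sub-steps A22 and A24 amount to min–max normalizing a real-valued kernel and thresholding it into {0, 1}; the toy channel counts Cin = 2, Cout = 4 and the threshold value below are assumptions:

```python
import numpy as np

def binarize_kernel(w, threshold):
    # A22: min-max normalize every element into [0, 1]
    w_norm = (w - w.min()) / (w.max() - w.min())
    # A24: elements below the activation threshold become 0, above become 1
    return (w_norm > threshold).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3, 2, 4))      # 3 x 3 x Cin x Cout, toy Cin=2, Cout=4
b = binarize_kernel(w, threshold=0.5)  # hypothetical activation threshold
```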
2. The image classification method based on the binary image classification network according to claim 1, wherein in said step S14, the expression of the sign function S(x′) is:
S(x′) = +α, when x′ ≥ 0; S(x′) = −α, when x′ < 0
wherein x′ represents the input image of the sign function and α represents a first parameter to be learned.
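Assuming S(x′) takes the usual scaled-sign form used in binary networks, +α for non-negative inputs and −α otherwise (an assumption for illustration), it can be sketched as:

```python
import numpy as np

def scaled_sign(x, alpha):
    # assumed form: +alpha for non-negative inputs, -alpha otherwise
    return np.where(x >= 0, alpha, -alpha)

out = scaled_sign(np.array([-1.5, 0.0, 2.3]), alpha=0.8)
```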
3. The image classification method based on the binary image classification network according to claim 1, wherein in said step S13, step S23, step S25, step S28, step S211 and step S214, the pooling window size used for the maximum pooling is 2 × 2 and the step size is 2 × 2.
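The 2 × 2 maximum pooling with step size 2 of claim 3 can be expressed in a few lines of NumPy (a sketch assuming an H × W × C layout with even H and W):

```python
import numpy as np

def max_pool_2x2(img):
    # img: (H, W, C) with even H and W; 2x2 window, step size 2
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4, 1)
y = max_pool_2x2(x)   # 2x2 window maxima: [[5, 7], [13, 15]]
```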
4. The image classification method based on the binarization image classification network as claimed in claim 1, wherein in the step S13, the step S23, the step S25, the step S28 and the step S211, the expression of the nonlinear activation function H (x) is as follows:
H(x) = x − γ + τ, when x > γ; H(x) = β(x − γ) + τ, when x ≤ γ
wherein x represents an input image of the nonlinear activation function, β represents a second parameter to be learned, γ represents a third parameter to be learned, and τ represents a fourth parameter to be learned.
5. The image classification method based on the binary image classification network according to claim 1, characterized in that in said step a21, the calculation formula for linear approximation is:
Wi ≈ αi1·Bi1 + αi2·Bi2 + αi3·Bi3 + αi4·Bi4
wherein Wi represents the convolution kernel after the linear approximation, αi1 represents the weight of the first binarized convolution kernel Bi1, αi2 represents the weight of the second binarized convolution kernel Bi2, αi3 represents the weight of the third binarized convolution kernel Bi3, and αi4 represents the weight of the fourth binarized convolution kernel Bi4;
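The weights αi1…αi4 of the four binarized bases can be illustrated with a least-squares fit (the patent does not specify how the weights are obtained, so least squares and the random 0/1 bases below are assumptions; a single 3 × 3 slice is used for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 3))                                           # real-valued kernel slice
Bs = [(rng.normal(size=(3, 3)) > 0).astype(float) for _ in range(4)]  # toy 0/1 bases

# solve min ||A @ alpha - vec(W)|| for the four weights alpha_i1..alpha_i4
A = np.stack([b.ravel() for b in Bs], axis=1)                         # shape (9, 4)
alphas, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
W_approx = sum(a * b for a, b in zip(alphas, Bs))
```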
in the step a22, the formula for normalizing each element aij of the convolution kernel matrix after the linear approximation is:
a′ij = (aij − min) / (max − min)
wherein a′ij represents each element after normalization, min represents the minimum value of the elements aij, and max represents the maximum value of the elements aij.
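The min–max normalization of claim 5 maps every element of the kernel matrix into [0, 1]; a minimal sketch:

```python
import numpy as np

def min_max_normalize(a):
    # a'_ij = (a_ij - min) / (max - min)
    return (a - a.min()) / (a.max() - a.min())

a = np.array([[2.0, 4.0], [6.0, 10.0]])
a_norm = min_max_normalize(a)   # element 4.0 maps to (4 - 2) / (10 - 2) = 0.25
```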
CN202110650074.XA 2021-06-10 2021-06-10 Image classification method based on binary image classification network Active CN113392899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110650074.XA CN113392899B (en) 2021-06-10 2021-06-10 Image classification method based on binary image classification network

Publications (2)

Publication Number Publication Date
CN113392899A CN113392899A (en) 2021-09-14
CN113392899B true CN113392899B (en) 2022-05-10

Family

ID=77620361

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108723A (en) * 2018-01-19 2018-06-01 深圳市恩钛控股有限公司 A kind of face feature extraction method based on deep learning
CN110188795A (en) * 2019-04-24 2019-08-30 华为技术有限公司 Image classification method, data processing method and device
CN112784909A (en) * 2021-01-28 2021-05-11 哈尔滨工业大学 Image classification and identification method based on self-attention mechanism and self-adaptive sub-network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10360494B2 (en) * 2016-11-30 2019-07-23 Altumview Systems Inc. Convolutional neural network (CNN) system based on resolution-limited small-scale CNN modules

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Unknown Attack Detection Based on Zero-Shot Learning; Zhun Zhang et al.; IEEE Access; 20201026; 193981-193991 *
Research on Image Classification Algorithms Based on Convolutional Neural Networks; Guo Tianmei; China Master's Theses Full-text Database (Information Science and Technology); 20180315; I138-1268 *
Research on Key Technologies of Deep Neural Networks for Image Object Recognition and Detection; Li Yang; China Doctoral Dissertations Full-text Database (Information Science and Technology); 20190115; I138-97 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant