CN110837887A - Compression and acceleration method of deep convolutional neural network, neural network model and application thereof - Google Patents

Compression and acceleration method of deep convolutional neural network, neural network model and application thereof

Info

Publication number
CN110837887A
Authority
CN
China
Prior art keywords
neural network
training
binary
weight
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911103074.7A
Other languages
Chinese (zh)
Inventor
张菊莉
贺占庄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Microelectronics Technology Institute
Original Assignee
Xian Microelectronics Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Microelectronics Technology Institute
Priority to CN201911103074.7A
Publication of CN110837887A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a compression and acceleration method of a deep convolutional neural network, a neural network model and an application thereof, and belongs to the field of deep convolutional neural networks. The method comprises the following steps: 1) converting the deep convolutional neural network into a wide and shallow neural network structure; 2) inputting normalized sample data into the wide and shallow neural network for training to obtain floating-point weights; binarizing the floating-point weights and the activation function of the neural network structure to obtain a binarized neural network; taking the binarized sample data as training data, inputting it into the binarized neural network, and updating the parameters until the error between the predicted value and the ground truth reaches a preset error, thereby finishing the training; wherein, during the training of the binarized neural network, the convolution layers carry out only addition and subtraction operations. The invention solves the problem that the existing deep convolutional neural network cannot be applied to an embedded computing platform.

Description

Compression and acceleration method of deep convolutional neural network, neural network model and application thereof
Technical Field
The invention belongs to the field of deep convolutional neural networks, and particularly relates to a compression and acceleration method of a deep convolutional neural network, a neural network model and application of the neural network model.
Background
In-orbit target recognition requires a satellite to complete, in orbit and in real time, a series of actions such as feature extraction, classification and recognition of a target, while maintaining high accuracy and speed. Traditional target recognition methods generally extract global and local features manually, segment the extracted features and model the global information of the target, and then output the recognition information of the target. This approach has the following disadvantages: manual feature extraction requires professional image-processing knowledge, a method with good performance and robustness must be selected according to the characteristics of the image, and the process is complex and somewhat subjective; manual feature extraction is often a fusion of one or more methods, and this process consumes considerable time for feature extraction and fusion; a manual extraction method usually focuses on only one aspect of an image and cannot extract the characteristics of the image comprehensively, so the final target recognition has certain limitations; and target recognition depends strongly on the extracted image features.
In view of the trade-off between efficiency and performance and the urgent needs arising from the development of various intelligent information-processing systems, deep learning has rapidly become a research hotspot in the field of computer vision by virtue of its strong modeling and data-characterization capabilities, and has made breakthrough progress in image recognition and speech recognition. At present, high-performance earth-observation satellites are developing towards intellectualization, and intelligent in-orbit satellite information processing is a key technology that urgently needs to be broken through. A satellite in-orbit information system is a typical embedded system with very strict limitations on storage, memory, computing capacity, power consumption and the like, so running a deep neural network directly on a satellite information-processing platform is hardly feasible. Because a large amount of computation and memory is consumed, deep convolutional neural networks can currently only run on platforms equipped with a graphics processing unit (GPU) and cannot be directly applied to embedded computing platforms with limited memory, computation and power consumption. As a result, satellite in-orbit processing systems, which must rely on embedded computing platforms, can only adopt traditional algorithms rather than higher-performance deep-learning algorithms to improve their in-orbit processing capability. This computational bottleneck greatly limits the speed of satellite in-orbit information-processing systems.
Disclosure of Invention
The invention aims to solve the problem that the existing deep convolutional neural network cannot be directly applied to a general computing platform without a graphics processing unit (GPU), and provides a compression and acceleration method of a deep convolutional neural network, a neural network model and an application thereof.
In order to achieve the above purpose, the invention adopts the following technical scheme:
a compression and acceleration method of a deep convolutional neural network comprises the following steps:
1) converting the deep convolutional neural network into a wide and shallow neural network structure;
2) inputting normalized sample data into the wide and shallow neural network for training to obtain floating-point weights;
binarizing the floating-point weights and the activation function of the neural network structure to obtain a binarized neural network;
taking the binarized sample data as training data, inputting it into the binarized neural network, and updating the parameters until the error between the predicted value and the ground truth reaches a preset error, thereby finishing the training;
wherein, during the training of the binarized neural network, the convolution operations are converted into addition and subtraction operations.
Further, the transformation process in step 1) is specifically as follows:
cutting and cascading the basic convolution units in the deep convolutional neural network structure to change it into the wide and shallow neural network.
Further, the step 2) specifically comprises:
201) standardizing the training samples to obtain normalized training samples;
202) inputting the normalized training samples into the widened and shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained in the training process to +1 and -1 and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current round of training;
wherein the convolution operations of the binarized neural network are converted into addition and subtraction operations during training;
205) calculating the back-propagation gradients with the binary weights and updating the parameters;
206) calculating the error between the predicted value output by the current round of training and the ground truth; if the error reaches the preset error, proceeding to step 207); otherwise, repeating steps 204)-206);
207) finishing the training.
Further, the specific process of step 203) includes:
a stage of defining the binary neural network;
and a stage of binarizing the weights and the activation function of the binary neural network.
Further, defining the binary neural network specifically includes:
representing each convolution structure as ⟨I, W, *⟩;
wherein I is a set of tensors, and each element I_l is the input tensor of the l-th layer of the convolutional neural network, l = 1, ..., L, where L is the number of layers of the convolutional neural network;
W is the corresponding set of weight tensors, and each element W_lk represents the k-th weight filter of the l-th layer of the convolutional neural network, k = 1, ..., K_l, where K_l is the number of weight filters of the l-th layer of the CNN;
* represents the convolution operation of I and W, with I ∈ R^(c×w_in×h_in), where c represents the number of channels, w_in represents the width and h_in represents the height;
W ∈ R^(c×w×h), where w ≤ w_in and h ≤ h_in.
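For concreteness, the tensor shapes just defined can be written out as below; this is a minimal NumPy sketch, and the dimension values are illustrative only, not taken from the patent:

```python
import numpy as np

c, w_in, h_in = 3, 32, 32   # number of channels, input width, input height
w, h = 3, 3                 # filter width and height, satisfying w <= w_in and h <= h_in

I_l = np.zeros((c, w_in, h_in), dtype=np.float32)   # input tensor of the l-th layer
W_lk = np.zeros((c, w, h), dtype=np.float32)        # k-th weight filter of the l-th layer
```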
Further, the weights and the activation function of the binary neural network are specifically obtained as follows:
the operation with binary weights is expressed as I * W ≈ (I ⊕ B)α, wherein ⊕ represents a convolution operation carried out only by additions and subtractions; B represents a binary filter, B ∈ {+1, -1}^(c×w×h); α ∈ R+ represents a scale factor, and W ≈ αB;
the binary weights are obtained by the following optimization function:
J(B, α) = ‖W - αB‖²,  α*, B* = argmin(α,B) J(B, α)
as can be seen by expanding and analyzing the above formula, the binarization filter B can be obtained through the following maximization constraint optimization term:
B* = argmax(B) {W^T B},  B ∈ {+1, -1}^n
if W_i is greater than or equal to 0, then B_i is +1, otherwise B_i is -1, therefore
B* = sign(W) (7)
by taking the partial derivative of J(B, α) with respect to α, α* = (W^T B*)/n is obtained, where n, the number of elements of W, is a constant; substituting sign(W) for B* yields
α* = (W^T sign(W))/n = (Σ|W_i|)/n = (1/n)‖W‖ℓ1
wherein W^T represents the transpose of the weight W, sign is the sign activation function, and ‖W‖ℓ1 represents the L1 norm of W.
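As an illustration of the binarization rule derived above (B* = sign(W), α* = (1/n)‖W‖ℓ1), a minimal NumPy sketch follows; the function name binarize_filter and the example shapes are illustrative only and are not part of the patent:

```python
import numpy as np

def binarize_filter(W):
    """Binarize one real-valued weight filter W of shape (c, w, h):
    B* = sign(W) (with B_i = +1 when W_i >= 0) and alpha* = ||W||_L1 / n, n = c*w*h."""
    B = np.where(W >= 0, 1.0, -1.0)
    alpha = np.abs(W).sum() / W.size
    return B, alpha

# the floating-point filter W is then approximated by alpha * B
W = np.random.randn(3, 3, 3).astype(np.float32)
B, alpha = binarize_filter(W)
W_approx = alpha * B
```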
The neural network model is obtained by the compression and acceleration method of the deep convolutional neural network.
The neural network model is applied to a satellite embedded computing platform for target recognition.
Compared with the prior art, the invention has the following beneficial effects:
the compression and acceleration method of the deep convolutional neural network simplifies and accelerates the convolutional neural network model, and removes the dependence of the algorithm based on the deep convolutional neural network on the hardware structure and the corresponding algorithm (the deep algorithm is operated on a GPU and the acceleration algorithm aiming at the GPU is required) through simplified compression; by accelerating, the occupied memory of the neural network compression model obtained by training is reduced by about one 32 times compared with the original floating point weight theoretically, when the binarization weight is trained by adopting binarization input, the relative lifting speed is obviously improved in the GPU environment under the same condition and the CPU under the same condition, and the binarization weight is obviously superior to the calculation speed of the floating point weight on the CPU under the same condition. The target identification accuracy is reduced by about 10% -15% relative to a standard convolutional neural network, when input data is not binarized and only a binarization weight value is adopted for prediction inference, the identification accuracy is reduced by about 8% -10%, the accuracy is reduced to a certain extent, but the required storage space is obviously reduced, the calculation efficiency is obviously improved, and the method can be applied to mobile equipment with limited storage and limited calculation resources in an embedded mode.
The neural network model obtained by the compression and acceleration method of the deep convolutional neural network reduces the requirements of the neural network on computing resources and storage resources, and can be transplanted to a satellite embedded computing platform with limited computing resources, storage resources and energy consumption resources for target identification.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a block diagram of the present invention for transforming a conventional convolutional neural network structure into a wide and shallow neural network structure;
FIG. 3 is a block diagram of transforming the widened and shallowed convolutional neural network structure into a binarized neural network structure according to the present invention;
FIG. 4 illustrates the weight binarization method of the binarized neural network according to the present invention;
FIG. 5 is a flow chart of neural network training according to the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
the invention provides a compression and acceleration method of a deep convolutional neural network, which can directly apply the improved neural network to a satellite embedded computing platform. Referring to fig. 1, fig. 1 is a flowchart of an embodiment of the present invention, which includes the following steps:
s1, converting a deep-layer deep convolutional neural network into a wide-shallow neural network structure, specifically:
the deep neural network is deformed structurally: cutting and cascading basic convolution units in the deep convolution neural network structure to change the basic convolution units into a wide and shallow neural network; namely, the original series basic convolution units are selectively changed into a multi-stage cascade form, and the structural characteristics of the deep wide and shallow neural network can be continuously expanded to the wide and shallow neural network.
S2, binary weight training is carried out until a preset condition is reached to obtain the trained neural network model: after the input data are binarized, the weights of the wide and shallow network are trained by the binary training method, and the final weights obtained by this training are used for testing. Specifically:
201) standardizing the training samples to obtain normalized training samples;
202) inputting the normalized training samples into the widened and shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained in the training process to +1 and -1 and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current round of training;
wherein the convolution layers of the binarized neural network carry out only addition and subtraction operations during training;
205) calculating the back-propagation gradients with the binary weights and updating the parameters;
206) calculating the error between the predicted value output by the current round of training and the ground truth; if the error reaches the preset error, proceeding to step 207); otherwise, repeating steps 204)-206);
207) finishing the training.
S3, applying the trained neural network model to an embedded computing platform for target recognition. The invention first changes the deep convolutional neural network structure into a wide and shallow one, then binarizes the input images and the network weights, trains with the specific training method, and applies the obtained trained model to the inference process of the network model.
The convolutional neural network structure YOLO, which currently offers good accuracy and real-time performance in image classification and recognition, is taken as the acceleration example. The data set is a self-built ship satellite remote-sensing image data set. The specific implementation steps are as follows:
1) The total number of layers of the new network structure is 10 and the number of parallel units per layer is 2. The odd layers of the original 31-layer network structure are retained, the even layers are cut out so that the odd layers become directly connected, and each cut-out even layer is cascaded with its corresponding basic layer; the network is thus changed from 31 layers to 15 layers. Cutting and cascading again in the same way, the network can be changed into a 10-layer or even more simplified structure, and the structure of each layer changes accordingly. This process is called widening and shallowing. The basic network structure unit adopted is shown in FIG. 2, which illustrates the process of converting a common convolutional neural network structure into a wide and shallow one: the deep convolutional neural network is divided into modules, all the small modules in one block are taken as the basic unit of a layer, the basic units are then cut and cascaded, redundant neurons are removed with a discarding (dropout) strategy to form the basic unit of the new network structure, and these basic units are stacked in series to form the new network structure. After this transformation the number of network layers is reduced and the network width is increased, hence the name widening and shallowing. A structural sketch of such a widened basic unit is given below.
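To make the cutting-and-cascading idea concrete, the following PyTorch-style sketch shows one possible basic unit of the widened-and-shallowed network: two formerly serial convolution units receive the same input and their outputs are concatenated along the channel axis. The module name, layer sizes, activation and channel counts are assumptions for illustration and are not the actual YOLO configuration of the embodiment.

```python
import torch
import torch.nn as nn

class WideShallowUnit(nn.Module):
    """Illustrative basic unit of the widened-and-shallowed network: two former
    serial convolution units run in parallel on the same input and are cascaded
    (concatenated) along the channel axis, so depth is traded for width."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        def conv_block():
            return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                                 nn.BatchNorm2d(out_ch),
                                 nn.LeakyReLU(0.1))
        self.unit_a = conv_block()
        self.unit_b = conv_block()

    def forward(self, x):
        # widening: both former serial units see the same input; outputs are concatenated
        return torch.cat([self.unit_a(x), self.unit_b(x)], dim=1)

# a shallow stack of 10 such units (channel counts are placeholders)
net = nn.Sequential(WideShallowUnit(3, 16),
                    *[WideShallowUnit(32, 16) for _ in range(9)])
```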
2) Defining the binary neural network
Each convolution structure is represented as ⟨I, W, *⟩, wherein I is a set of tensors, and each element I_l is the input tensor of the l-th layer of the convolutional neural network, l = 1, ..., L, with L the number of layers of the convolutional neural network; W is the corresponding set of weight tensors, and each element W_lk represents the k-th weight filter of the l-th layer, k = 1, ..., K_l, with K_l the number of weight filters of the l-th layer of the CNN; * represents the convolution operation of I and W, with I ∈ R^(c×w_in×h_in), where c represents the number of channels and w_in and h_in respectively represent the width and the height, and W ∈ R^(c×w×h), where w ≤ w_in and h ≤ h_in. Referring to FIG. 3, FIG. 3 shows the transformation of the widened and shallowed convolutional neural network structure into a binarized neural network structure: on the left are the basic convolution units after widening and shallowing, and on the right are the binarized basic convolution units. The convolution operation of every layer needs to be binarized.
3) Weights and activation function of the binary neural network
Referring to FIG. 4, FIG. 4 illustrates the weight binarization method of the binarized neural network: the cuboid on the left of FIG. 4 represents a standard floating-point weight W, and the binarized weight on the right is obtained after computing the scale factor α and the binarization filter solved from W according to formula (1).
The operation with binary weights is expressed as I * W ≈ (I ⊕ B)α, wherein ⊕ represents a convolution without multiplications, carried out only by additions and subtractions; B represents a binary filter, B ∈ {+1, -1}^(c×w×h); α ∈ R+ represents the scale factor, and W ≈ αB. The binary weights are obtained by the following optimization function:
J(B, α) = ‖W - αB‖²,  α*, B* = argmin(α,B) J(B, α)
As can be seen by expanding and analyzing the above formula, the binarization filter B can be obtained through the following maximization constraint optimization term:
B* = argmax(B) {W^T B},  B ∈ {+1, -1}^n
If W_i is greater than or equal to 0, then B_i is +1, otherwise B_i is -1, therefore
B* = sign(W) (10)
Taking the partial derivative of J(B, α) with respect to α gives α* = (W^T B*)/n, where n, the number of elements of W, is a constant; substituting sign(W) for B* yields
α* = (W^T sign(W))/n = (Σ|W_i|)/n = (1/n)‖W‖ℓ1
Thus the weight binarization is obtained by the above optimization: B* is realized by the sign function, and the scale factor is the mean of the absolute values of the weights;
a binarized neural network model is thus obtained. A minimal numerical illustration of the resulting addition/subtraction-only convolution is given below.
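Since every entry of B is +1 or -1, the multiply-accumulate of an ordinary convolution collapses to additions and subtractions followed by a single multiplication by α. The following minimal NumPy check of this equivalence for a single input patch is illustrative only; the function and variable names are not from the patent:

```python
import numpy as np

def binary_conv_patch(patch, B, alpha):
    """One output value of the binarized convolution: add the inputs where B = +1,
    subtract them where B = -1, then scale once by alpha -- no per-weight multiplies."""
    return alpha * (patch[B > 0].sum() - patch[B < 0].sum())

patch = np.random.randn(3, 3, 3)      # input patch, shape (c, w, h)
W = np.random.randn(3, 3, 3)          # floating-point filter
B = np.where(W >= 0, 1.0, -1.0)       # B* = sign(W)
alpha = np.abs(W).mean()              # alpha* = ||W||_L1 / n

# the result matches convolving the patch with the approximated filter alpha * B
assert np.isclose(binary_conv_patch(patch, B, alpha), (patch * (alpha * B)).sum())
```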
4) carrying out binarization on each image sample in the data set;
5) Inputting the binarized sample data into the binarized neural network model for binary weight training. Referring to FIG. 5, FIG. 5 is the neural network training flow chart of the present invention; the training process is divided into two phases: the forward propagation phase, and the back propagation and parameter update phase.
A minibatch of inputs and targets (I, Y) is given, the loss function is denoted C(Y, Ŷ), the current network weight is W^t, the current learning rate is η^t, and the total number of layers of the neural network is L. Each iteration proceeds as follows:
(1) From layer l = 1 to layer L, each weight of layer l is binarized; for example, the k-th filter of layer l is binarized as B_lk = sign(W_lk) with scale factor α_lk = (1/n)‖W_lk‖ℓ1, according to the formulas above;
wherein the weights of each layer are the floating-point weights obtained by inputting the normalized training samples into the simplified neural network;
(2) The final prediction Ŷ is computed by forward propagation; except that the convolution operations are calculated with the formula I * W ≈ (I ⊕ B)α, the rest is standard forward propagation;
(3) In the back-propagation process, the partial derivatives ∂C/∂W̃ are calculated with respect to the binarized weights W̃, i.e. the binarized weights are used for computing the partial derivatives instead of the floating-point weights W^t;
(4) The parameters are updated with a gradient descent algorithm using the gradients from step (3), W^(t+1) = UpdateParameters(W^t, ∂C/∂W̃, η^t);
(5) The learning rate parameter is updated, η^(t+1) = UpdateLearningRate(η^t, t);
(6) The error between the predicted value and the ground truth is computed; training is stopped when the preset error is reached; if the preset error has not been reached, return to step (1) and repeat the training.
6) The trained model is used for testing, i.e. for prediction with the neural network. A code sketch of one training iteration of step 5) is given below.
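For illustration, one iteration of the training procedure of steps (1)-(6) can be sketched as follows in PyTorch-style code. This is a minimal sketch under the assumption of a standard convolutional model and optimizer; helper and argument names are illustrative, the learning-rate update of step (5) is assumed to be handled by an external scheduler, and the error check of step (6) is left to the caller.

```python
import torch

def train_iteration(net, conv_layers, x, y, loss_fn, optimizer):
    """One iteration of steps (1)-(6): binarize the convolution weights, run the
    forward and backward pass with the binarized weights, then update the kept
    floating-point weights by gradient descent."""
    saved = []
    for conv in conv_layers:                                   # (1) binarize each filter
        w = conv.weight.data
        saved.append(w.clone())
        alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)      # alpha* = ||W||_L1 / n, per filter
        b = torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))  # B* = sign(W)
        conv.weight.data = alpha * b
    y_hat = net(x)                                             # (2) forward with binary weights
    loss = loss_fn(y_hat, y)
    optimizer.zero_grad()
    loss.backward()                                            # (3) gradients w.r.t. binarized weights
    for conv, w in zip(conv_layers, saved):                    # restore floating-point weights
        conv.weight.data = w
    optimizer.step()                                           # (4) gradient-descent parameter update
    return loss.item()                                         # (6) caller compares this against the preset error
```

Here conv_layers would be the convolution modules of net whose weights are binarized, and optimizer would be the gradient-descent optimizer over the kept floating-point weights.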
The acceleration mainly comprises two steps: the first is simplifying the network structure, i.e. the widening-and-shallowing process, and the second is training the simplified network structure into a binary network. Simplifying the network structure mainly reduces the number of network layers, facilitates network training, prevents the gradient-explosion problem, reduces the neural network parameters and reduces the occupied memory; the binarization process mainly compresses the weights and accelerates the testing process. Combining the two parts greatly reduces the parameters and accelerates the testing and inference process. The network-weight compression process is a parameter-reduction process; its benefit is that the trained weights become smaller and testing and inference are accelerated, and its cost is that the accuracy decreases. The more the number of layers is reduced and the smaller the weights become, the more the accuracy drops. Therefore, with an acceptable decrease in accuracy, appropriately simplifying and binarizing the network is favorable for transplanting the deep neural network to an embedded platform with only a CPU.
See Table 1, which shows the conditions of the embodiment and the experimental results. Taking image detection as an example, the data set adopts a satellite remote-sensing image data set; each image is 11.9 MB in size and there are 242 images in total. The experimental process includes the training and testing processes. Finally, the change of the network weights, the change of the classification accuracy and the acceleration after applying the method are observed from the experimental results.
The 31-layer YOLO-Darknet neural network, whose last layer is a detection layer, is used as the original network structure; it is simplified into a 10-layer architecture through the process described above.
The minibatch size during training is 128, the initial learning rate is 0.01, the momentum is 0.8, and the learning rate is subsequently reduced to 0.001. The programming language is Lua.
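For illustration, the stated hyperparameters can be expressed in PyTorch-style code (the original embodiment was implemented in Lua; the stand-in module and the unspecified schedule point are assumptions, only the numerical values come from the embodiment):

```python
import torch
import torch.nn as nn

net = nn.Conv2d(3, 16, 3)   # stand-in for the 10-layer wide-and-shallow network above
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.8)  # initial learning rate and momentum
batch_size = 128                                                      # minibatch size during training
# the learning rate is subsequently reduced to 0.001 (the exact schedule is not specified in the text)
```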
Table 1 conditions of examples and experimental results
[Table 1 is provided as an image in the original publication; its contents are not recoverable from the text.]
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (8)

1. A compression and acceleration method of a deep convolutional neural network is characterized by comprising the following steps:
1) converting the deep convolutional neural network into a wide and shallow neural network structure;
2) inputting normalized sample data into the wide and shallow neural network for training to obtain floating-point weights;
binarizing the floating-point weights and the activation function of the neural network structure to obtain a binarized neural network;
taking the binarized sample data as training data, inputting it into the binarized neural network, and updating the parameters until the error between the predicted value and the ground truth reaches a preset error, thereby finishing the training;
wherein, during the training of the binarized neural network, the convolution operations are converted into addition and subtraction operations.
2. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 1, wherein the transformation process in step 1) is specifically:
cutting and cascading the basic convolution units in the deep convolutional neural network structure to change it into the wide and shallow neural network.
3. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 1, wherein the step 2) specifically comprises:
201) standardizing the training samples to obtain normalized training samples;
202) inputting the normalized training samples into the widened and shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained in the training process to +1 and -1 and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current round of training;
wherein the convolution operations of the binarized neural network are converted into addition and subtraction operations during training;
205) calculating the back-propagation gradients with the binary weights and updating the parameters;
206) calculating the error between the predicted value output by the current round of training and the ground truth; if the error reaches the preset error, proceeding to step 207); otherwise, repeating steps 204)-206);
207) finishing the training.
4. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 3, wherein the specific process of step 203) comprises:
a stage of defining the binary neural network;
and a stage of binarizing the weights and the activation function of the binary neural network.
5. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 4, wherein defining the binary neural network specifically comprises:
representing each convolution structure as ⟨I, W, *⟩;
wherein I is a set of tensors, and each element I_l is the input tensor of the l-th layer of the convolutional neural network, l = 1, ..., L, where L is the number of layers of the convolutional neural network;
W is the corresponding set of weight tensors, and each element W_lk represents the k-th weight filter of the l-th layer of the convolutional neural network, k = 1, ..., K_l, where K_l is the number of weight filters of the l-th layer of the CNN;
* represents the convolution operation of I and W, with I ∈ R^(c×w_in×h_in), wherein c represents the number of channels, w_in represents the width and h_in represents the height;
and W ∈ R^(c×w×h), wherein w ≤ w_in and h ≤ h_in.
6. The method as claimed in claim 4, wherein the weights and the activation function of the binary neural network are specifically obtained as follows:
the operation with binary weights is expressed as I * W ≈ (I ⊕ B)α, wherein ⊕ represents a convolution operation carried out only by additions and subtractions; B represents a binary filter, B ∈ {+1, -1}^(c×w×h); α ∈ R+ represents a scale factor, and W ≈ αB;
the binary weights are obtained by the following optimization function:
J(B, α) = ‖W - αB‖²,  α*, B* = argmin(α,B) J(B, α)
as can be seen by expanding and analyzing the above formula, the binarization filter B can be obtained through the following maximization constraint optimization term:
B* = argmax(B) {W^T B},  B ∈ {+1, -1}^n
if W_i is greater than or equal to 0, then B_i is +1, otherwise B_i is -1, therefore
B* = sign(W) (3)
by taking the partial derivative of J(B, α) with respect to α, α* = (W^T B*)/n is obtained, where n is a constant; substituting sign(W) for B* yields
α* = (W^T sign(W))/n = (Σ|W_i|)/n = (1/n)‖W‖ℓ1
wherein W^T represents the transpose of the weight W, sign is the sign activation function, and ‖W‖ℓ1 represents the L1 norm of W.
7. A neural network model obtained by the compression and acceleration method of the deep convolutional neural network as set forth in any one of claims 1 to 6.
8. An application of the neural network model according to claim 7, wherein the neural network model is applied to a satellite embedded computing platform for target recognition.
CN201911103074.7A 2019-11-12 2019-11-12 Compression and acceleration method of deep convolutional neural network, neural network model and application thereof Pending CN110837887A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911103074.7A CN110837887A (en) 2019-11-12 2019-11-12 Compression and acceleration method of deep convolutional neural network, neural network model and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911103074.7A CN110837887A (en) 2019-11-12 2019-11-12 Compression and acceleration method of deep convolutional neural network, neural network model and application thereof

Publications (1)

Publication Number Publication Date
CN110837887A true CN110837887A (en) 2020-02-25

Family

ID=69576270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911103074.7A Pending CN110837887A (en) 2019-11-12 2019-11-12 Compression and acceleration method of deep convolutional neural network, neural network model and application thereof

Country Status (1)

Country Link
CN (1) CN110837887A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738403A (en) * 2020-04-26 2020-10-02 华为技术有限公司 Neural network optimization method and related equipment
CN112150497A (en) * 2020-10-14 2020-12-29 浙江大学 Local activation method and system based on binary neural network
CN112244853A (en) * 2020-10-26 2021-01-22 生物岛实验室 Edge computing node manufacturing method and edge computing node
US20210150313A1 (en) * 2019-11-15 2021-05-20 Samsung Electronics Co., Ltd. Electronic device and method for inference binary and ternary neural networks
CN112950464A (en) * 2021-01-25 2021-06-11 西安电子科技大学 Binary super-resolution reconstruction method without regularization layer
CN113128614A (en) * 2021-04-29 2021-07-16 西安微电子技术研究所 Convolution method based on image gradient, neural network based on directional convolution and classification method
CN113159296A (en) * 2021-04-27 2021-07-23 广东工业大学 Construction method of binary neural network
CN113221908A (en) * 2021-06-04 2021-08-06 深圳龙岗智能视听研究院 Digital identification method and equipment based on deep convolutional neural network
WO2022001364A1 (en) * 2020-06-30 2022-01-06 华为技术有限公司 Method for extracting data features, and related apparatus
WO2024092896A1 (en) * 2022-11-01 2024-05-10 鹏城实验室 Neural network training and reasoning method and device, terminal and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148078A1 (en) * 2014-11-20 2016-05-26 Adobe Systems Incorporated Convolutional Neural Network Using a Binarized Convolution Layer
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108765506A (en) * 2018-05-21 2018-11-06 上海交通大学 Compression method based on successively network binaryzation

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148078A1 (en) * 2014-11-20 2016-05-26 Adobe Systems Incorporated Convolutional Neural Network Using a Binarized Convolution Layer
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108765506A (en) * 2018-05-21 2018-11-06 上海交通大学 Compression method based on successively network binaryzation

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MOHAMMAD RASTEGARI et al.: "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks", 《EUROPEAN CONFERENCE ON COMPUTER VISION》 *
张涛 et al.: "Improved Design Method for Convolutional Neural Network Models", 《计算机工程与设计》 *
胡骏飞 et al.: "Research on a Gesture Classification Method Based on Binarized Convolutional Neural Networks", 《湖南工业大学学报》 *
谢佳砼: "Binary-Based Network Acceleration", 《电子制作》 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210150313A1 (en) * 2019-11-15 2021-05-20 Samsung Electronics Co., Ltd. Electronic device and method for inference binary and ternary neural networks
US12039430B2 (en) * 2019-11-15 2024-07-16 Samsung Electronics Co., Ltd. Electronic device and method for inference binary and ternary neural networks
CN111738403A (en) * 2020-04-26 2020-10-02 华为技术有限公司 Neural network optimization method and related equipment
CN111738403B (en) * 2020-04-26 2024-06-07 华为技术有限公司 Neural network optimization method and related equipment
WO2022001364A1 (en) * 2020-06-30 2022-01-06 华为技术有限公司 Method for extracting data features, and related apparatus
CN112150497A (en) * 2020-10-14 2020-12-29 浙江大学 Local activation method and system based on binary neural network
CN112244853A (en) * 2020-10-26 2021-01-22 生物岛实验室 Edge computing node manufacturing method and edge computing node
CN112244853B (en) * 2020-10-26 2022-05-13 生物岛实验室 Edge computing node manufacturing method and edge computing node
CN112950464A (en) * 2021-01-25 2021-06-11 西安电子科技大学 Binary super-resolution reconstruction method without regularization layer
CN112950464B (en) * 2021-01-25 2023-09-01 西安电子科技大学 Binary super-resolution reconstruction method without regularization layer
CN113159296B (en) * 2021-04-27 2024-01-16 广东工业大学 Construction method of binary neural network
CN113159296A (en) * 2021-04-27 2021-07-23 广东工业大学 Construction method of binary neural network
CN113128614B (en) * 2021-04-29 2023-06-16 西安微电子技术研究所 Convolution method based on image gradient, neural network based on direction convolution and classification method
CN113128614A (en) * 2021-04-29 2021-07-16 西安微电子技术研究所 Convolution method based on image gradient, neural network based on directional convolution and classification method
CN113221908A (en) * 2021-06-04 2021-08-06 深圳龙岗智能视听研究院 Digital identification method and equipment based on deep convolutional neural network
CN113221908B (en) * 2021-06-04 2024-04-16 深圳龙岗智能视听研究院 Digital identification method and device based on deep convolutional neural network
WO2024092896A1 (en) * 2022-11-01 2024-05-10 鹏城实验室 Neural network training and reasoning method and device, terminal and storage medium

Similar Documents

Publication Publication Date Title
CN110837887A (en) Compression and acceleration method of deep convolutional neural network, neural network model and application thereof
CN112101190B (en) Remote sensing image classification method, storage medium and computing device
CN109598269A (en) A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution
CN107665364B (en) Neural network method and apparatus
EP3340129B1 (en) Artificial neural network class-based pruning
CN110287969A (en) Mole text image binaryzation system based on figure residual error attention network
Ablavatski et al. Enriched deep recurrent visual attention model for multiple object recognition
CN109753664A (en) A kind of concept extraction method, terminal device and the storage medium of domain-oriented
CN112446888B (en) Image segmentation model processing method and processing device
CN109284761B (en) Image feature extraction method, device and equipment and readable storage medium
CN110119449A (en) A kind of criminal case charge prediction technique based on sequence enhancing capsule net network
CN114283495A (en) Human body posture estimation method based on binarization neural network
CN104036482B (en) Facial image super-resolution method based on dictionary asymptotic updating
Zhao et al. Exploring structural sparsity in CNN via selective penalty
Liu et al. Image retrieval using CNN and low-level feature fusion for crime scene investigation image database
CN114781499A (en) Method for constructing ViT model-based intensive prediction task adapter
CN110610140A (en) Training method, device and equipment of face recognition model and readable storage medium
CN113807366A (en) Point cloud key point extraction method based on deep learning
Liu et al. SuperPruner: automatic neural network pruning via super network
CN109558819B (en) Depth network lightweight method for remote sensing image target detection
Wang et al. Identification of weather phenomena based on lightweight convolutional neural networks
CN115795334A (en) Sequence recommendation data enhancement method based on graph contrast learning
Nie et al. A novel framework using gated recurrent unit for fault diagnosis of rotary machinery with noisy labels
CN114881162A (en) Method, apparatus, device and medium for predicting failure of metering automation master station
CN115546474A (en) Few-sample semantic segmentation method based on learner integration strategy

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200225

RJ01 Rejection of invention patent application after publication