CN110837887A - Compression and acceleration method of deep convolutional neural network, neural network model and application thereof - Google Patents
- Publication number
- CN110837887A CN110837887A CN201911103074.7A CN201911103074A CN110837887A CN 110837887 A CN110837887 A CN 110837887A CN 201911103074 A CN201911103074 A CN 201911103074A CN 110837887 A CN110837887 A CN 110837887A
- Authority
- CN
- China
- Prior art keywords
- neural network
- training
- binary
- weight
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention discloses a compression and acceleration method for a deep convolutional neural network, a neural network model, and an application thereof, belonging to the field of deep convolutional neural networks. The method comprises the following steps: 1) converting the deep convolutional neural network into a wide, shallow neural network structure; 2) inputting normalized sample data into the wide, shallow neural network for training to obtain floating-point weights; binarizing the floating-point weights and the activation functions of the network structure to obtain a binarized neural network; inputting the binarized sample data into the binarized neural network as training data and updating parameters until the error between the predicted value and the ground truth reaches a preset error, at which point training is finished. During training of the binarized network, the convolutional layers perform only addition and subtraction operations. The method solves the problem that existing deep convolutional neural networks cannot be applied to embedded computing platforms.
Description
Technical Field
The invention belongs to the field of deep convolutional neural networks, and particularly relates to a compression and acceleration method of a deep convolutional neural network, a neural network model and application of the neural network model.
Background
In-orbit target identification requires a satellite to perform feature extraction, classification, and identification of a target in real time while in orbit, maintaining both high accuracy and speed. Traditional target-identification methods generally extract global and local features by hand, segment the extracted features, model the target's global information, and then output the identification result. This approach has the following disadvantages: manual feature extraction requires professional image-processing knowledge, and a method with good performance and robustness must be selected according to the characteristics of the image, a process that is complex and somewhat subjective; manual feature extraction often fuses one or more methods, which consumes considerable time in extraction and fusion; manual extraction usually focuses on one aspect of an image and cannot capture its characteristics comprehensively, so the final identification has certain limitations; and target identification depends strongly on the image features.
Given the trade-off between efficiency and performance and the pressing demands of intelligent information-processing systems, deep learning has rapidly become a research hotspot in computer vision by virtue of its strong modeling and data-characterization capabilities, achieving breakthroughs in image recognition and speech recognition. High-performance Earth-observation satellites are developing towards intelligence, and intelligent in-orbit information processing is a key technology that urgently needs a breakthrough. A satellite in-orbit information system is a typical embedded system with very strict limits on storage, memory, computing capacity, and power consumption, so running a deep neural network directly on a satellite information-processing platform is hardly feasible. Because of their large computational and memory demands, deep convolutional neural networks currently run only on platforms with general-purpose graphics processing units (GPUs) and cannot be applied directly to embedded computing platforms with limited memory, computation, and power. Satellite in-orbit processing systems, which must rely on embedded computing platforms, are therefore restricted to traditional algorithms and cannot use higher-performing deep-learning algorithms to improve their in-orbit processing capability. This computational bottleneck greatly limits the speed of satellite in-orbit information-processing systems.
Disclosure of Invention
The invention aims to solve the problem that existing deep convolutional neural networks cannot be applied directly to general computing platforms without a graphics processing unit (GPU), and provides a compression and acceleration method for a deep convolutional neural network, a neural network model, and an application thereof.
To achieve this purpose, the invention adopts the following technical scheme:
a compression and acceleration method of a deep convolutional neural network comprises the following steps:
1) converting the deep convolutional neural network into a wide, shallow neural network structure;
2) inputting normalized sample data into the wide, shallow neural network for training to obtain floating-point weights;
binarizing the floating-point weights and the activation functions of the network structure to obtain a binarized neural network;
inputting the binarized sample data into the binarized neural network as training data, and updating parameters until the error between the predicted value and the ground truth reaches a preset error, thereby finishing training;
wherein, during training of the binarized neural network, the convolution operations are converted into addition and subtraction operations.
Further, the transformation process in the step 1) is specifically as follows:
cutting and cascading the basic convolution units in the deep convolutional neural network structure to transform it into the wide, shallow neural network.
Further, the step 2) specifically comprises:
201) standardizing the training samples to obtain second training samples;
202) inputting the normalized training samples into the widened-and-shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained during training to +1 and -1, and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current training round;
wherein the convolution operations of the binarized network are converted into addition and subtraction operations during training;
205) calculating the back-propagated gradients with the binary weights, and updating the parameters;
206) calculating the error between the predicted value output by the current training round and the ground truth, and proceeding to step 207) if the error reaches the preset error; otherwise, repeating steps 204)-206);
207) finishing the training.
Further, the specific process of step 203) includes:
a stage of defining the binary neural network;
and a stage of binarizing the weights and activation function of the binary neural network.
Further, defining the binary neural network is specifically:
representing each convolution structure as $\langle I, W, * \rangle$;
wherein I is a set of tensors, each element $I_l$ being the input tensor of the $l$-th layer of the convolutional neural network, $l = 1, \dots, L$, where L is the number of layers of the convolutional neural network;
W is the corresponding set of tensors, each element $W_{lk}$ being the $k$-th weight filter of the $l$-th layer, $k = 1, \dots, K_l$, where $K_l$ is the number of weight filters of layer $l$ of the CNN;
* represents the convolution operation of I and W, with $I_l \in \mathbb{R}^{c \times w_{in} \times h_{in}}$ and $W_{lk} \in \mathbb{R}^{c \times w \times h}$, $w \le w_{in}$, $h \le h_{in}$, where c denotes the number of channels, $w_{in}$ the width, and $h_{in}$ the height.
Further, the weights and activation function of the binary neural network are specifically:
using $I * W \approx (I \oplus B)\alpha$ to represent the operation with binary weights, wherein $\oplus$ represents a convolution performed only by additions and subtractions; B represents a binary filter, $B \in \{+1, -1\}^{c \times w \times h}$; $\alpha \in \mathbb{R}^{+}$ represents a scale factor, and $W \approx \alpha B$;
the binary weights are obtained by the following optimization function:
$$J(B, \alpha) = \lVert W - \alpha B \rVert^{2}, \qquad \alpha^{*}, B^{*} = \arg\min_{\alpha, B} J(B, \alpha) \tag{5}$$
expanding and analyzing the above formula, the binarization filter B can be obtained from the following constrained maximization:
$$B^{*} = \arg\max_{B}\{W^{T}B\}, \quad \text{s.t. } B \in \{+1, -1\}^{n} \tag{6}$$
if $W_i \ge 0$ then $B_i = +1$, otherwise $B_i = -1$; therefore
$$B^{*} = \operatorname{sign}(W) \tag{7}$$
taking the partial derivative of $J(B, \alpha)$ with respect to $\alpha$ and setting it to zero gives $\alpha^{*} = \frac{W^{T}B^{*}}{n}$, where n is a constant (the number of weights); substituting $\operatorname{sign}(W)$ for $B^{*}$ yields
$$\alpha^{*} = \frac{W^{T}\operatorname{sign}(W)}{n} = \frac{\sum_{i}\lvert W_i \rvert}{n} = \frac{1}{n}\lVert W \rVert_{\ell 1} \tag{8}$$
wherein $W^{T}$ denotes the transpose of the weight W, sign is the sign function, and $\lVert W \rVert_{\ell 1}$ denotes the $\ell 1$ norm of W.
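The closed-form solution above ($B^{*} = \operatorname{sign}(W)$, $\alpha^{*} = \lVert W \rVert_{\ell 1}/n$) can be checked numerically. A minimal sketch assuming NumPy, with an illustrative 2x2 filter (not a size from the patent):

```python
import numpy as np

def binarize_filter(W):
    """Binarize a floating-point filter W into (alpha, B) with W ~= alpha * B:
    B* = sign(W) and alpha* = ||W||_l1 / n, per the closed-form solution above."""
    n = W.size
    B = np.where(W >= 0.0, 1.0, -1.0)   # B* = sign(W), mapping 0 to +1
    alpha = np.abs(W).sum() / n         # alpha* = (1/n) * ||W||_l1
    return alpha, B

W = np.array([[0.5, -0.25], [0.75, -1.0]])  # illustrative 2x2 filter
alpha, B = binarize_filter(W)
# alpha = (0.5 + 0.25 + 0.75 + 1.0) / 4 = 0.625; B = [[+1, -1], [+1, -1]]
```

Only the sign pattern B and the single scalar alpha per filter need to be stored, which is what enables the compression discussed below.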
The neural network model of the invention is obtained by the above compression and acceleration method for a deep convolutional neural network.
As an application, the neural network model is deployed on a satellite embedded computing platform for target identification.
Compared with the prior art, the invention has the following beneficial effects:
the compression and acceleration method of the deep convolutional neural network simplifies and accelerates the convolutional neural network model, and removes the dependence of the algorithm based on the deep convolutional neural network on the hardware structure and the corresponding algorithm (the deep algorithm is operated on a GPU and the acceleration algorithm aiming at the GPU is required) through simplified compression; by accelerating, the occupied memory of the neural network compression model obtained by training is reduced by about one 32 times compared with the original floating point weight theoretically, when the binarization weight is trained by adopting binarization input, the relative lifting speed is obviously improved in the GPU environment under the same condition and the CPU under the same condition, and the binarization weight is obviously superior to the calculation speed of the floating point weight on the CPU under the same condition. The target identification accuracy is reduced by about 10% -15% relative to a standard convolutional neural network, when input data is not binarized and only a binarization weight value is adopted for prediction inference, the identification accuracy is reduced by about 8% -10%, the accuracy is reduced to a certain extent, but the required storage space is obviously reduced, the calculation efficiency is obviously improved, and the method can be applied to mobile equipment with limited storage and limited calculation resources in an embedded mode.
The neural network model obtained by this compression and acceleration method reduces the network's demands on computing and storage resources, and can be ported to a satellite embedded computing platform with limited computing, storage, and energy resources for target identification.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a block diagram of the transformation of a conventional convolutional neural network structure into a wide, shallow neural network structure according to the present invention;
FIG. 3 is a block diagram of the transformation of the widened-and-shallowed convolutional neural network structure into a binary neural network structure according to the present invention;
FIG. 4 illustrates the weight-binarization method of the binarized neural network according to the present invention;
FIG. 5 is a flow chart of neural network training according to the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
The invention provides a compression and acceleration method for a deep convolutional neural network that enables the improved network to be applied directly to a satellite embedded computing platform. Referring to fig. 1, a flowchart of an embodiment of the invention, the method comprises the following steps:
s1, converting a deep-layer deep convolutional neural network into a wide-shallow neural network structure, specifically:
the deep neural network is deformed structurally: cutting and cascading basic convolution units in the deep convolution neural network structure to change the basic convolution units into a wide and shallow neural network; namely, the original series basic convolution units are selectively changed into a multi-stage cascade form, and the structural characteristics of the deep wide and shallow neural network can be continuously expanded to the wide and shallow neural network.
S2, training the binary weights until a preset condition is reached to obtain the trained neural network model: after the input data are binarized, the binary training method is used to train the weights in the wide, shallow network, and the final weights obtained by this training are used for testing. Specifically:
201) standardizing the training samples to obtain second training samples;
202) inputting the normalized training samples into the widened-and-shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained during training to +1 and -1, and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current training round;
wherein the convolutional layers of the binarized network perform addition and subtraction operations during training;
205) calculating the back-propagated gradients with the binary weights, and updating the parameters;
206) calculating the error between the predicted value output by the current training round and the ground truth, and proceeding to step 207) if the error reaches the preset error; otherwise, repeating steps 204)-206);
207) finishing the training.
S3, applying the trained neural network model to an embedded computing platform for target recognition. The invention first reshapes the deep convolutional network structure into a wide, shallow one, then binarizes the input images and the network weights, and applies the model obtained by the specific training method to the inference process of the network.
The convolutional neural network YOLO, which currently offers good accuracy and real-time performance in image classification and recognition, is taken as the acceleration example. The data set is a self-built ship satellite remote-sensing image data set. The specific implementation steps are as follows:
1) The new network structure has 10 layers in total, with 2 parallel layers. The odd layers of the original 31-layer structure are retained; the even layers are cut out so that the odd layers connect directly, and each even layer is cascaded with its corresponding base layer, changing the network from 31 layers to 15. Cutting and cascading are then repeated in the same way, so the structure can be reduced to 10 layers or an even simpler structure, with each layer's structure changing correspondingly. This process is called widening and shallowing. The basic network unit adopted is shown in fig. 2, which depicts the conversion of an ordinary convolutional network structure into a wide, shallow one: the deep convolutional network is divided into modules, all small modules within a block are taken as the basic unit of each layer, these basic units are cut and cascaded, redundant neurons are removed with a dropout strategy to form the basic unit of the new structure, and the new units are then stacked in series to form the new network. After this transformation, the number of network layers decreases and the network width increases; hence the name widening and shallowing.
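The cut-and-cascade restructuring can be caricatured as pairwise merging of serial units into wider parallel units. The sketch below is only an illustration of how one pass roughly halves the serial depth (layer names are invented, and the patent's neuron pruning and per-layer structural changes are omitted):

```python
def cut_and_cascade(layers):
    """One cut-and-cascade pass: each pair of serial units is merged into a
    single wider unit, roughly halving the depth (a sketch of the patent's
    widening-and-shallowing step; dropout-based pruning is omitted)."""
    merged = []
    for i in range(0, len(layers), 2):
        merged.append(tuple(layers[i:i + 2]))   # two serial units -> one wide unit
    return merged

net = [f"conv{i}" for i in range(1, 32)]        # a 31-layer serial network (illustrative)
shallow = cut_and_cascade(net)                  # 31 serial units -> 16 wider units
```

Repeating such passes, combined with pruning, is how the description arrives at a 10-layer structure from the original 31 layers.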
2) Defining a binary-valued neural network
Each convolution structure is represented as $\langle I, W, * \rangle$: I is a set of tensors, each element $I_l$ being the input tensor of the $l$-th layer of the convolutional neural network, $l = 1, \dots, L$, where L is the number of layers; W is the corresponding set of tensors, each element $W_{lk}$ being the $k$-th weight filter of the $l$-th layer, $k = 1, \dots, K_l$, where $K_l$ is the number of weight filters of layer $l$ of the CNN; * represents the convolution operation of I and W, with $I_l \in \mathbb{R}^{c \times w_{in} \times h_{in}}$ and $W_{lk} \in \mathbb{R}^{c \times w \times h}$, where $w \le w_{in}$ and $h \le h_{in}$, c is the number of channels, and $w_{in}$ and $h_{in}$ are the width and height respectively. Referring to fig. 3, which shows the conversion of the widened-and-shallowed convolutional structure into the binary neural network structure: the left side shows each basic convolution unit after widening and shallowing, and the right side shows the binarized basic convolution unit. The convolution operation of each layer requires binarization.
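Since every entry of the binary filter B is +1 or -1, the multiply-accumulate at the heart of the convolution collapses into signed additions. A minimal sketch with a 1-D dot product (NumPy assumed; values illustrative):

```python
import numpy as np

def binary_dot(x, B, alpha):
    """Dot product against a {+1, -1} filter B: add the inputs where B = +1,
    subtract them where B = -1, then apply a single scalar multiply by alpha."""
    acc = x[B > 0].sum() - x[B < 0].sum()
    return alpha * acc

x = np.array([1.0, 2.0, 3.0, 4.0])
B = np.array([1.0, -1.0, 1.0, -1.0])
out = binary_dot(x, B, alpha=0.5)
# equals 0.5 * (1 - 2 + 3 - 4) = -1.0, matching 0.5 * np.dot(x, B)
```

A real convolution slides this operation over the input, but the principle is the same: the per-element multiplications disappear, which is the source of the claimed CPU speed-up.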
3) Binary neural network weight and activation function
Referring to fig. 4, which illustrates the weight-binarization method of the binarized neural network: the cuboid on the left of fig. 4 represents a standard floating-point weight W, and the binarized weight on the right is obtained once the scale factor α and the binarization filter are computed from W.
Using $I * W \approx (I \oplus B)\alpha$ to represent the operation with binary weights, wherein $\oplus$ represents a convolution without multiplications, performed by additions and subtractions; B represents a binary filter, $B \in \{+1, -1\}^{c \times w \times h}$; $\alpha \in \mathbb{R}^{+}$ represents the scale factor, and $W \approx \alpha B$. The binary weights are obtained by the following optimization function:
$$J(B, \alpha) = \lVert W - \alpha B \rVert^{2}, \qquad \alpha^{*}, B^{*} = \arg\min_{\alpha, B} J(B, \alpha) \tag{8}$$
Expanding and analyzing the above formula, the binarization filter B can be obtained from the following constrained maximization:
$$B^{*} = \arg\max_{B}\{W^{T}B\}, \quad \text{s.t. } B \in \{+1, -1\}^{n} \tag{9}$$
If $W_i \ge 0$ then $B_i = +1$, otherwise $B_i = -1$; therefore
$$B^{*} = \operatorname{sign}(W) \tag{10}$$
Taking the partial derivative of $J(B, \alpha)$ with respect to $\alpha$ and setting it to zero gives $\alpha^{*} = \frac{W^{T}B^{*}}{n}$, where n is a constant; substituting $\operatorname{sign}(W)$ for $B^{*}$ yields
$$\alpha^{*} = \frac{W^{T}\operatorname{sign}(W)}{n} = \frac{\sum_{i}\lvert W_i \rvert}{n} = \frac{1}{n}\lVert W \rVert_{\ell 1} \tag{11}$$
Thus weight binarization is obtained from the above optimization: $B^{*}$ is realized by the sign function, and the scale factor is the mean of the absolute values of the weights;
obtaining a binary neural network model;
4) carrying out binarization on each image sample in the data set;
5) Inputting the binarized sample data into the binarized neural network structure for binary-weight training. Referring to fig. 5, the neural network training flow chart of the invention, the training process is divided into two phases: the forward-propagation process, and the back-propagation and parameter-update phase.
Given a minibatch of inputs and targets (I, Y), let the loss function be $C(Y, \hat{Y})$, the current network weights $W^{t}$, the current learning rate $\eta_t$, and the total number of layers of the neural network L. Each iteration proceeds as follows:
(1) from layer $l = 1$ to layer L, binarize each weight in layer $l$; for example, the $k$-th filter of layer $l$ is computed as:
$$A_{lk} = \frac{1}{n}\lVert W_{lk}^{t} \rVert_{\ell 1}, \qquad B_{lk} = \operatorname{sign}(W_{lk}^{t}), \qquad \widetilde{W}_{lk} = A_{lk}B_{lk}$$
wherein the weights of each layer are the floating-point weights obtained by inputting the normalized training samples into the simplified neural network;
(2) compute the final prediction $\hat{Y}$ by forward propagation, except that each convolution is computed with the binary formula $I * W \approx (I \oplus B)\alpha$; the rest of the forward propagation is standard;
(3) in back propagation, compute the partial derivatives $\frac{\partial C}{\partial \widetilde{W}}$, i.e. derive the gradients with respect to the binarized weights $\widetilde{W}$ instead of the floating-point weights $W^{t}$;
(4) update the parameters with the gradients, $W^{t+1} = \text{UpdateParameters}(W^{t}, \frac{\partial C}{\partial \widetilde{W}}, \eta_t)$;
(5) update the learning-rate parameter, $\eta_{t+1} = \text{UpdateLearningrate}(\eta_t, t)$;
(6) stop training when the preset error is reached; otherwise return to step (1) and repeat;
6) using the trained model for testing, to perform prediction with the neural network.
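Stripped to a toy single-layer linear model, the iteration above (binarize, forward with the binarized weights, backward on the binarized weights, update the retained floating-point weights) can be sketched as follows. The model, shapes, squared loss, and plain gradient step are illustrative assumptions, not the patent's network:

```python
import numpy as np

def train_step(W, x, y, lr):
    """One binary-weight training iteration on a toy 1-layer linear model:
    forward and backward both use the binarized weights, but the update is
    applied to the retained floating-point weights W."""
    alpha = np.abs(W).mean()                  # scale factor (1/n) * ||W||_l1
    Wb = alpha * np.where(W >= 0, 1.0, -1.0)  # binarized weights used in both passes
    y_hat = Wb @ x                            # forward pass (a plain dot product here)
    err = y_hat - y
    grad = err * x                            # dC/dWb for the squared loss 0.5 * err**2
    return W - lr * grad                      # update the float weights, not Wb

W = np.array([0.4, -0.6])  # illustrative floating-point weights
x = np.array([1.0, 2.0])
W = train_step(W, x, y=1.0, lr=0.1)
```

Keeping the floating-point weights as the optimization state while binarizing them on every forward pass is what lets the tiny per-step gradients accumulate instead of being destroyed by the sign function.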
The acceleration scheme comprises two steps: first, simplifying the network structure, i.e. the widening-and-shallowing process; second, training the simplified structure into a binary network. Simplifying the structure mainly reduces the number of layers, which eases network training, prevents gradient explosion, reduces the network parameters, and lowers memory use; the binarization mainly compresses the weights and accelerates the test process. Combined, the two parts greatly reduce the parameters and speed up test inference. The weight-compression process is a parameter-reduction process: the benefit is smaller trained weights and faster test inference; the cost is reduced precision. The more layers are removed and the smaller the weights become, the more precision drops. Therefore, with an acceptable precision loss, appropriately simplifying and binarizing the network helps port the deep neural network to an embedded platform with only a CPU.
See table 1, which lists the conditions of the embodiment and the experimental results. Taking image detection as an example, the data set is a satellite remote-sensing image data set with 242 images, each 11.9 MB in size, and the experiment covers both training and testing. The experimental results show the change in network weights, the change in classification accuracy, and the acceleration achieved by the method.
The 31-layer YOLO-Darknet neural network, whose last layer is a detection layer, is used as the original network structure; it is simplified into a 10-layer architecture through the process described above.
The minibatch size during training is 128, the initial learning rate is 0.01, the momentum is 0.8, and the subsequent learning rate is 0.001. The programming language is Lua.
Table 1 conditions of examples and experimental results
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.
Claims (8)
1. A compression and acceleration method of a deep convolutional neural network is characterized by comprising the following steps:
1) converting the deep convolutional neural network into a wide, shallow neural network structure;
2) inputting normalized sample data into the wide, shallow neural network for training to obtain floating-point weights;
binarizing the floating-point weights and the activation functions of the network structure to obtain a binarized neural network;
inputting the binarized sample data into the binarized neural network as training data, and updating parameters until the error between the predicted value and the ground truth reaches a preset error, thereby finishing training;
wherein, during training of the binarized neural network, the convolution operations are converted into addition and subtraction operations.
2. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 1, wherein the transformation process in step 1) is specifically:
cutting and cascading the basic convolution units in the deep convolutional neural network structure to transform it into the wide, shallow neural network.
3. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 1, wherein the step 2) specifically comprises:
201) standardizing the training samples to obtain second training samples;
202) inputting the normalized training samples into the widened-and-shallowed neural network for training to obtain floating-point weights;
203) binarizing the floating-point weights obtained during training to +1 and -1, and storing them as binary weights;
binarizing the activation function of the neural network into a binary activation function to obtain a binarized neural network;
204) inputting the binarized training samples into the binarized neural network for training, and outputting the predicted value of the current training round;
wherein the convolution operations of the binarized network are converted into addition and subtraction operations during training;
205) calculating the back-propagated gradients with the binary weights, and updating the parameters;
206) calculating the error between the predicted value output by the current training round and the ground truth, and proceeding to step 207) if the error reaches the preset error; otherwise, repeating steps 204)-206);
207) finishing the training.
4. The method for compressing and accelerating a deep convolutional neural network as claimed in claim 3, wherein the specific process of step 203) comprises:
a stage of defining the binary neural network;
and a stage of binarizing the weights and activation function of the binary neural network.
5. The method of compressing and accelerating a deep convolutional neural network as claimed in claim 4, wherein defining the binary neural network is specifically:
representing each convolution structure as $\langle I, W, * \rangle$;
wherein I is a set of tensors, each element $I_l$ being the input tensor of the $l$-th layer of the convolutional neural network, $l = 1, \dots, L$, where L is the number of layers of the convolutional neural network;
W is the corresponding set of tensors, each element $W_{lk}$ being the $k$-th weight filter of the $l$-th layer, $k = 1, \dots, K_l$, where $K_l$ is the number of weight filters of layer $l$ of the CNN;
* represents the convolution operation of I and W, with $I_l \in \mathbb{R}^{c \times w_{in} \times h_{in}}$, where c denotes the number of channels, $w_{in}$ the width, and $h_{in}$ the height.
6. The method of claim 4, wherein the weight and activation function of the binary neural network are specifically:
using $I * W \approx (I \oplus B)\alpha$ to represent the operation with binary weights, wherein $\oplus$ represents a convolution performed only by additions and subtractions; B represents a binary filter, $B \in \{+1, -1\}^{c \times w \times h}$; $\alpha \in \mathbb{R}^{+}$ represents a scale factor, and $W \approx \alpha B$;
the binary weights are obtained by the following optimization function:
$$J(B, \alpha) = \lVert W - \alpha B \rVert^{2}, \qquad \alpha^{*}, B^{*} = \arg\min_{\alpha, B} J(B, \alpha) \tag{1}$$
as can be seen from expanding and analyzing the above formula, the binarization filter B can be obtained from the following constrained maximization:
$$B^{*} = \arg\max_{B}\{W^{T}B\}, \quad \text{s.t. } B \in \{+1, -1\}^{n} \tag{2}$$
if $W_i \ge 0$ then $B_i = +1$, otherwise $B_i = -1$; therefore
$$B^{*} = \operatorname{sign}(W) \tag{3}$$
taking the partial derivative of $J(B, \alpha)$ with respect to $\alpha$ and setting it to zero gives $\alpha^{*} = \frac{W^{T}B^{*}}{n}$, where n is a constant; substituting $\operatorname{sign}(W)$ for $B^{*}$ yields
$$\alpha^{*} = \frac{W^{T}\operatorname{sign}(W)}{n} = \frac{\sum_{i}\lvert W_i \rvert}{n} = \frac{1}{n}\lVert W \rVert_{\ell 1} \tag{4}$$
wherein $W^{T}$ denotes the transpose of the weight W, sign is the sign function, and $\lVert W \rVert_{\ell 1}$ denotes the $\ell 1$ norm of W.
7. A neural network model obtained by the compression and acceleration method of the deep convolutional neural network as set forth in any one of claims 1 to 6.
8. An application of the neural network model according to claim 7, wherein the neural network model is applied to a satellite embedded computing platform for target recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911103074.7A CN110837887A (en) | 2019-11-12 | 2019-11-12 | Compression and acceleration method of deep convolutional neural network, neural network model and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110837887A true CN110837887A (en) | 2020-02-25 |
Family
ID=69576270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911103074.7A Pending CN110837887A (en) | 2019-11-12 | 2019-11-12 | Compression and acceleration method of deep convolutional neural network, neural network model and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110837887A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160148078A1 (en) * | 2014-11-20 | 2016-05-26 | Adobe Systems Incorporated | Convolutional Neural Network Using a Binarized Convolution Layer |
US20170286830A1 (en) * | 2016-04-04 | 2017-10-05 | Technion Research & Development Foundation Limited | Quantized neural network training and inference |
CN108765506A (en) * | 2018-05-21 | 2018-11-06 | 上海交通大学 | Compression method based on successively network binaryzation |
Non-Patent Citations (4)
Title |
---|
MOHAMMAD RASTEGARI et al.: "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks", European Conference on Computer Vision * |
ZHANG Tao et al.: "Improved design method for convolutional neural network models", Computer Engineering and Design * |
HU Junfei et al.: "Research on a gesture classification method based on binarized convolutional neural networks", Journal of Hunan University of Technology * |
XIE Jiatong: "Binary-based network acceleration", Practical Electronics * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210150313A1 (en) * | 2019-11-15 | 2021-05-20 | Samsung Electronics Co., Ltd. | Electronic device and method for inference binary and ternary neural networks |
US12039430B2 (en) * | 2019-11-15 | 2024-07-16 | Samsung Electronics Co., Ltd. | Electronic device and method for inference binary and ternary neural networks |
CN111738403A (en) * | 2020-04-26 | 2020-10-02 | 华为技术有限公司 | Neural network optimization method and related equipment |
CN111738403B (en) * | 2020-04-26 | 2024-06-07 | 华为技术有限公司 | Neural network optimization method and related equipment |
WO2022001364A1 (en) * | 2020-06-30 | 2022-01-06 | 华为技术有限公司 | Method for extracting data features, and related apparatus |
CN112150497A (en) * | 2020-10-14 | 2020-12-29 | 浙江大学 | Local activation method and system based on binary neural network |
CN112244853A (en) * | 2020-10-26 | 2021-01-22 | 生物岛实验室 | Edge computing node manufacturing method and edge computing node |
CN112244853B (en) * | 2020-10-26 | 2022-05-13 | 生物岛实验室 | Edge computing node manufacturing method and edge computing node |
CN112950464A (en) * | 2021-01-25 | 2021-06-11 | 西安电子科技大学 | Binary super-resolution reconstruction method without regularization layer |
CN112950464B (en) * | 2021-01-25 | 2023-09-01 | 西安电子科技大学 | Binary super-resolution reconstruction method without regularization layer |
CN113159296B (en) * | 2021-04-27 | 2024-01-16 | 广东工业大学 | Construction method of binary neural network |
CN113159296A (en) * | 2021-04-27 | 2021-07-23 | 广东工业大学 | Construction method of binary neural network |
CN113128614B (en) * | 2021-04-29 | 2023-06-16 | 西安微电子技术研究所 | Convolution method based on image gradient, neural network based on direction convolution and classification method |
CN113128614A (en) * | 2021-04-29 | 2021-07-16 | 西安微电子技术研究所 | Convolution method based on image gradient, neural network based on directional convolution and classification method |
CN113221908A (en) * | 2021-06-04 | 2021-08-06 | 深圳龙岗智能视听研究院 | Digital identification method and equipment based on deep convolutional neural network |
CN113221908B (en) * | 2021-06-04 | 2024-04-16 | 深圳龙岗智能视听研究院 | Digital identification method and device based on deep convolutional neural network |
WO2024092896A1 (en) * | 2022-11-01 | 2024-05-10 | 鹏城实验室 | Neural network training and reasoning method and device, terminal and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110837887A (en) | Compression and acceleration method of deep convolutional neural network, neural network model and application thereof | |
CN112101190B (en) | Remote sensing image classification method, storage medium and computing device | |
CN109598269A (en) | A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution | |
CN107665364B (en) | Neural network method and apparatus | |
EP3340129B1 (en) | Artificial neural network class-based pruning | |
CN110287969A (en) | Mole text image binaryzation system based on figure residual error attention network | |
Ablavatski et al. | Enriched deep recurrent visual attention model for multiple object recognition | |
CN109753664A (en) | A kind of concept extraction method, terminal device and the storage medium of domain-oriented | |
CN112446888B (en) | Image segmentation model processing method and processing device | |
CN109284761B (en) | Image feature extraction method, device and equipment and readable storage medium | |
CN110119449A (en) | A kind of criminal case charge prediction technique based on sequence enhancing capsule net network | |
CN114283495A (en) | Human body posture estimation method based on binarization neural network | |
CN104036482B (en) | Facial image super-resolution method based on dictionary asymptotic updating | |
Zhao et al. | Exploring structural sparsity in CNN via selective penalty | |
Liu et al. | Image retrieval using CNN and low-level feature fusion for crime scene investigation image database | |
CN114781499A (en) | Method for constructing ViT model-based intensive prediction task adapter | |
CN110610140A (en) | Training method, device and equipment of face recognition model and readable storage medium | |
CN113807366A (en) | Point cloud key point extraction method based on deep learning | |
Liu et al. | SuperPruner: automatic neural network pruning via super network | |
CN109558819B (en) | Depth network lightweight method for remote sensing image target detection | |
Wang et al. | Identification of weather phenomena based on lightweight convolutional neural networks | |
CN115795334A (en) | Sequence recommendation data enhancement method based on graph contrast learning | |
Nie et al. | A novel framework using gated recurrent unit for fault diagnosis of rotary machinery with noisy labels | |
CN114881162A (en) | Method, apparatus, device and medium for predicting failure of metering automation master station | |
CN115546474A (en) | Few-sample semantic segmentation method based on learner integration strategy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20200225 |