CN112288085A - Convolutional neural network acceleration method and system - Google Patents
Convolutional neural network acceleration method and system
- Publication number
- CN112288085A (application number CN202011147836.6A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- result
- convolution
- convolutional neural
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention provides a convolutional neural network acceleration method and system, comprising: feeding an image to be analyzed for features into a convolutional neural network as the input activation; decomposing the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter; performing a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performing a convolution operation on a compensation factor and the input activation vector to obtain a second convolution result, and adding the first and second convolution results to obtain a prediction result; and, when the convolutional neural network executes the convolution computation, skipping zero-value-related operations according to the prediction result to obtain the convolution result. The invention predicts the sparsity of the output activations in order to guide the original neural network to skip zero-value-related operations, thereby reducing the computation of the original network, saving computing resources, lowering power consumption and improving performance.
Description
Technical Field
The invention relates to computer architecture, and in particular to a convolutional neural network acceleration method and system that predict sparse activation data based on weight signs.
Background
Neural networks deliver state-of-the-art performance in image detection, speech recognition and natural language processing. As applications grow more complex, neural network models grow with them, posing many challenges for traditional hardware; to relieve the pressure on hardware resources, sparse networks offer clear advantages in computation, storage and power consumption. Many algorithms and accelerators for sparse networks have appeared, such as CPU-oriented sparse BLAS libraries and the GPU-oriented cuSPARSE library, which accelerate sparse-network execution to some extent; dedicated accelerators show advanced capability in performance, power consumption and other respects.
In most deep neural networks (DNNs), the rectified linear unit (ReLU) is widely used at the output of network layers and forces negative activation data to 0. Meanwhile, exploiting the redundancy of weight data, methods such as pruning set some of the weights to 0. These methods produce a large number of zero-valued output activations and weights, so sparse networks exhibit both weight sparsity and activation sparsity; modern DNN models are roughly 50% sparse. Neural network computation consists mainly of multiply-add operations, and the product of zero-valued data with any value is 0, so these operations can be regarded as invalid. Executing them occupies computing resources, wastes computation and power, lengthens network execution time, and degrades network performance.
Disclosure of Invention
Addressing the large amount of sparse data present in neural networks, the invention discloses a prediction device for sparse activation data, which predicts the sparsity of the activation data in advance at small prediction overhead in order to guide the operation of the original neural network. Execution of the neural network is thus divided into two phases: a prediction phase and an execution phase. In the prediction phase, the weight signs and the input activation data are used to execute the network operation, while a compensation factor is added to reduce the loss of inference accuracy, producing a prediction of the output activation data. In the execution phase, using the predicted output activations, only the neural network operations whose predicted output activation is positive are executed, and the operations whose predicted activation is negative are eliminated. As a result, the computation of sparse network operation is reduced, power consumption falls, and execution performance improves.
Specifically, in order to overcome the defects of the prior art, the present invention provides a convolutional neural network acceleration method and system, wherein the convolutional neural network acceleration method comprises:
step 1, feeding an image to be analyzed for features into a convolutional neural network as the input activation, and decomposing the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter;
step 2, performing a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performing a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adding the first convolution result and the second convolution result to obtain a prediction result;
step 3, when the convolutional neural network executes the convolution computation, skipping zero-value-related operations according to the prediction result to obtain the convolution result.
In the convolutional neural network acceleration method and system, the step of skipping zero-value-related operations comprises: determining whether the prediction result contains values less than or equal to 0; if so, obtaining the vector positions of those values in the prediction result, skipping the computation related to those vector positions when performing the convolution computation to obtain an activation output result, and setting the values at those vector positions in the activation output result to zero to obtain the convolution result.
In the convolutional neural network acceleration method and system, step 1 comprises: taking the high-order bits (sign bits) of the weights in the filter of the convolutional neural network as the sign vector.
In the convolutional neural network acceleration method and system, the compensation factor takes a value greater than 0 and less than 1.
In the convolutional neural network acceleration method and system, the convolution computation is given by the following formula:
O=∑I*W
where W is the filter weight, I is the input activation, and O is the convolution result.
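As an informal aside (not part of the patent text), the formula above — each output O is the sum of elementwise products of an input window I with the filter W — can be sketched in plain Python; the function name and layout below are our own:

```python
def conv2d_valid(ifmap, weight):
    """'Valid' 2-D convolution: O = sum over the window of I * W."""
    kh, kw = len(weight), len(weight[0])
    oh = len(ifmap) - kh + 1          # output height
    ow = len(ifmap[0]) - kw + 1       # output width
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            # slide the filter over the input feature map
            for i in range(kh):
                for j in range(kw):
                    out[r][c] += ifmap[r + i][c + j] * weight[i][j]
    return out

# Example: a 2x2 input with a 2x2 identity-like filter gives one output, 1 + 4 = 5.
print(conv2d_valid([[1, 2], [3, 4]], [[1, 0], [0, 1]]))  # [[5.0]]
```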
The invention also provides a convolutional neural network acceleration system, comprising:
a module 1, which feeds an image to be analyzed for features into a convolutional neural network as the input activation, and decomposes the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter;
a module 2, which performs a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performs a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adds the first convolution result and the second convolution result to obtain a prediction result;
a module 3, which, when the convolutional neural network executes the convolution computation, skips zero-value-related operations according to the prediction result to obtain the convolution result.
In the convolutional neural network acceleration system, the skipping of zero-value-related operations comprises: determining whether the prediction result contains values less than or equal to 0; if so, obtaining the vector positions of those values in the prediction result, skipping the computation related to those vector positions when performing the convolution computation to obtain an activation output result, and setting the values at those vector positions in the activation output result to zero to obtain the convolution result.
In the convolutional neural network acceleration system, module 1 takes the high-order bits (sign bits) of the weights in the filter of the convolutional neural network as the sign vector.
In the convolutional neural network acceleration system, the compensation factor takes a value greater than 0 and less than 1.
In the convolutional neural network acceleration system, the convolution computation is given by the following formula:
O=∑I*W
where W is the filter weight, I is the input activation, and O is the convolution result.
The technical contribution of the invention is a prediction method and system that predict the sparsity of the output activations so as to guide the original neural network to skip zero-value-related operations, thereby reducing the computation of the original network, saving computing resources, lowering power consumption and improving performance.
Drawings
FIG. 1 is a framework diagram of the weight-sign-based predictor and executor;
FIG. 2 is a detailed structural diagram of the weight-sign-based predictor and executor;
FIG. 3 is a flow chart of the prediction phase;
FIG. 4 is a flow chart of the execution phase.
Detailed Description
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
The prediction method comprises the following steps:
the convolutional layer calculation formula of the neural network is shown as the following formula:
O = ∑ I*W
= ∑ I*(Wmsb<<m + Wlsb)
= ∑ I*(Wmsb*2^m + Wlsb)
= ∑ I*2^m*(Wmsb + Wlsb*2^-m)
= ∑ 2^m*(I*Wmsb + I*Wlsb*2^-m)
≈ ∑ 2^m*(I*Wmsb + I*W1*a)
The convolutional-layer filter weights (W) are mapped onto the input feature map (I) to extract input feature information. Since a filter weight can be decomposed into high bits (Wmsb) and low bits (Wlsb), the convolution operation is divided into two parts: input with high-bit filter, and input with low-bit filter. W1 is an all-ones matrix of the same size as the input activation, and a is the compensation factor. O is the convolution result. In the neural network, the filter weights are known parameters; the input image is also called the input activation, and the output feature map obtained after the first convolutional layer is also called the output activation. The output feature map of one layer is the input of the next convolutional layer: the network executes one layer at a time, and the computation result of the previous layer is the input of the next layer.
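The bit-split identity used above, W = Wmsb*2^m + Wlsb, can be checked numerically. A minimal sketch, assuming non-negative integer weights and a split point of m = 4 (both are our illustrative choices; the patent does not fix these values):

```python
def split_weight(w, m=4):
    """Split a non-negative integer weight into high and low bits so that
    w == w_msb * 2**m + w_lsb (the decomposition used in the formula above)."""
    w_msb = w >> m               # high-order bits (Wmsb)
    w_lsb = w & ((1 << m) - 1)   # low-order bits (Wlsb)
    return w_msb, w_lsb

w = 0b10110101                   # 181
w_msb, w_lsb = split_weight(w)
print(w_msb, w_lsb)              # 11 5
assert w == w_msb * 2**4 + w_lsb # 11 * 16 + 5 == 181
```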
Weight-sign-based prediction uses only the most significant bit (the sign bit) of each weight to perform the convolution and determine whether the final output is zero or non-zero, while a compensation factor a compensates for the precision loss of the final result. The compensation factor is determined by running the neural network with different values in the range (0, 1); different values affect the result precision differently, and the best compensation factor is the one with the least impact on precision. Assuming a is 0.5, the prediction computes I*Wmsb and I*W1*0.5; if their sum is negative, the activation value becomes 0 after the output passes through the activation function (ReLU), otherwise it is positive. Based on the prediction result, only the convolution operations whose predicted result is positive are selected in the original convolution computation, and the convolution operations whose predicted result is negative are skipped.
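The prediction rule just described can be sketched for a single output position as follows (an informal sketch, not the patent's implementation; the names are ours, a = 0.5 follows the example in the text, and the 0/-1 sign convention is the one used below for FIG. 2):

```python
def predict_sign(window, weight, a=0.5):
    """Predict whether one ReLU output will be non-zero.

    window and weight are flat lists of the same length.
    Sign convention from the text: 0 for w >= 0, -1 for w < 0.
    Returns the predicted value; > 0 means predicted non-zero output.
    """
    signs = [0 if w >= 0 else -1 for w in weight]
    sign_part = sum(x * s for x, s in zip(window, signs))  # I * Wmsb (sign convolution)
    comp_part = a * sum(window)                            # I * W1 * a (compensation)
    return sign_part + comp_part

# All-positive weights give a sign part of 0, so a positive input window
# is always predicted non-zero: 0 + 0.5 * (1+2+3+4) = 5.0.
assert predict_sign([1.0, 2.0, 3.0, 4.0], [0.3, 0.1, 0.5, 0.2]) == 5.0
```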
The convolution is thus completed in two phases: a prediction phase and an execution phase. In the prediction phase, the prediction device uses the sign bits of the weights together with the compensation factor to predict the output activations; in the execution phase, only the operations of the non-zero output activations are executed, according to the prediction result. The prediction device is shown in FIG. 1.
the detailed prediction and execution device based on weight symbolic prediction is shown in fig. 2. Convolution operation is performed with the input activations using the sign of the weight in the filter, wherein if the weight is positive, the sign is 0, the weight is negative, the sign is-1, and since 0 is multiplied by any number to be 0, only the operation with the sign of the weight of-1 is performed through the weight index, and the associated input activation is indexed by the weight sign index. At the same time, the input activates the operation of the execution and compensation factor a. And adding the results of the two to obtain a predicted output symbol, wherein if the predicted output symbol is a negative value, the output activation value becomes 0 after the value passes through the activation function, and if the predicted output symbol is a positive value, the output activation value remains unchanged after the value passes through the activation function. Based on this, the Index constraint unit calculates the correlation Index of the non-0 output activation according to the sign of the predicted value, weight Index, input act Index. According to the index information, the execution stage executes the operation of non-0 output activation, and directly outputs the 0 value for 0 output activation.
The process of predicting the output activations from the weight signs is explained in detail below in connection with an example convolution: the filter size is 2×2, the input activation (Ifmap) size is 4×4, and the compensation factor is assumed to be 0.5.
Step one: obtain the sign of each weight from the filter weights, where the sign is 0 when the weight is greater than 0 and -1 when the weight is less than 0;
step two: the sign of the weight and the input activation (Ifmap) perform convolution operations, wherein only the weight of-1 performs a multiply-add operation with the corresponding Ifmap, as shown in fig. 3, and the result of the convolution is-0.7, -1, -0.8, -0.3. Meanwhile, the convolution operation is also executed by the compensation factor and the input activation (Ifmap), and the convolution result is 0.8, 0.8, 0.85 and 0.95;
step three: the results obtained from the respective steps are added to obtain the predicted output data, which are shown in FIG. 3 as 0.1, -0.2, 0.05, 0.65.
Step four: in the execution phase, skip the operations whose predicted output activation is negative, according to the prediction result. In FIG. 3 the result -0.2 is negative, so the convolution operations associated with this value can be skipped in the execution phase; as shown in FIG. 4, the executor only needs to perform the convolutions of the 3 remaining output activations, which are 0.045, 0.145 and 0.475 respectively. The output activation whose predicted value is -0.2 directly outputs 0.
As shown in FIG. 4, it is determined whether the prediction result contains values less than or equal to 0. If so, the vector positions of those values in the prediction result are obtained and the computation related to those positions is skipped during the convolution computation; that is, the computations drawn with white fill in the figure are skipped and only the gray-filled convolutions are computed to obtain the activation output result. The values at those vector positions in the activation output result are then set to zero to obtain the convolution result.
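The predict-then-skip flow of FIGS. 3 and 4 can be sketched end to end as follows (illustrative code with made-up data, not the patent's actual example values; all names are ours, and a = 0.5 follows the example above):

```python
def relu_conv_with_skip(ifmap, weight, a=0.5):
    """Two-phase convolution: predict each output with the weight signs plus
    a compensation term, then run the exact multiply-adds only where the
    prediction is positive; predicted-zero positions directly output 0."""
    kh, kw = len(weight), len(weight[0])
    oh, ow = len(ifmap) - kh + 1, len(ifmap[0]) - kw + 1
    signs = [[0 if w >= 0 else -1 for w in row] for row in weight]
    out = [[0.0] * ow for _ in range(oh)]
    skipped = 0
    for r in range(oh):
        for c in range(ow):
            # prediction phase: sign convolution + a * (sum of the input window)
            pred = sum(ifmap[r + i][c + j] * signs[i][j]
                       for i in range(kh) for j in range(kw))
            pred += a * sum(ifmap[r + i][c + j]
                            for i in range(kh) for j in range(kw))
            if pred <= 0:           # predicted zero after ReLU: skip the real work
                skipped += 1
                continue
            # execution phase: exact convolution followed by ReLU
            acc = sum(ifmap[r + i][c + j] * weight[i][j]
                      for i in range(kh) for j in range(kw))
            out[r][c] = max(0.0, acc)
    return out, skipped

# A filter whose positive and negative halves cancel is predicted zero everywhere
# on an all-ones input, so all four exact convolutions are skipped.
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(relu_conv_with_skip(ones, [[1, -1], [1, -1]]))  # ([[0.0, 0.0], [0.0, 0.0]], 4)
```

Note that skipping is only worthwhile when the prediction is much cheaper than the exact convolution; in hardware the sign convolution needs no multipliers, which is the point of the scheme.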
The following is a system embodiment corresponding to the above method embodiment, and the two can be implemented in cooperation. The technical details mentioned in the above embodiments remain valid for this system and are not repeated here to reduce duplication; likewise, the technical details mentioned for this system also apply to the above embodiments.
The invention also provides a convolutional neural network acceleration system, comprising:
a module 1, which feeds an image to be analyzed for features into a convolutional neural network as the input activation, and decomposes the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter;
a module 2, which performs a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performs a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adds the first convolution result and the second convolution result to obtain a prediction result;
a module 3, which, when the convolutional neural network executes the convolution computation, skips zero-value-related operations according to the prediction result to obtain the convolution result.
In the convolutional neural network acceleration system, the skipping of zero-value-related operations comprises: determining whether the prediction result contains values less than or equal to 0; if so, obtaining the vector positions of those values in the prediction result, skipping the computation related to those vector positions when performing the convolution computation to obtain an activation output result, and setting the values at those vector positions in the activation output result to zero to obtain the convolution result.
In the convolutional neural network acceleration system, module 1 takes the high-order bits (sign bits) of the weights in the filter of the convolutional neural network as the sign vector.
In the convolutional neural network acceleration system, the compensation factor takes a value greater than 0 and less than 1.
In the convolutional neural network acceleration system, the convolution computation is given by the following formula:
O=∑I*W
where W is the filter weight, I is the input activation, and O is the convolution result.
Claims (10)
1. A convolutional neural network acceleration method, characterized by comprising the following steps:
step 1, feeding an image to be analyzed for features into a convolutional neural network as the input activation, and decomposing the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter;
step 2, performing a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performing a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adding the first convolution result and the second convolution result to obtain a prediction result;
step 3, when the convolutional neural network executes the convolution computation, skipping zero-value-related operations according to the prediction result to obtain the convolution result.
2. The convolutional neural network acceleration method as claimed in claim 1, wherein step 3 comprises: determining whether the prediction result contains values less than or equal to 0; if so, obtaining the vector positions of those values in the prediction result, skipping the computation related to those vector positions when performing the convolution computation to obtain an activation output result, and setting the values at those vector positions in the activation output result to zero to obtain the convolution result.
3. The convolutional neural network acceleration method as claimed in claim 1, wherein step 1 comprises: taking the high-order bits (sign bits) of the weights in the filter of the convolutional neural network as the sign vector.
4. The convolutional neural network acceleration method as claimed in claim 1, wherein the compensation factor takes a value greater than 0 and less than 1.
5. The convolutional neural network acceleration method as claimed in claim 1, wherein the convolution computation is given by the following formula:
O=∑I*W
where W is the filter weight, I is the input activation, and O is the convolution result.
6. A convolutional neural network acceleration system, characterized by comprising:
a module 1, which feeds an image to be analyzed for features into a convolutional neural network as the input activation, and decomposes the weight vector of a filter in the convolutional neural network to obtain a sign vector corresponding to the weights in the filter;
a module 2, which performs a convolution operation on the sign vector and the input activation vector to obtain a first convolution result, performs a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adds the first convolution result and the second convolution result to obtain a prediction result;
a module 3, which, when the convolutional neural network executes the convolution computation, skips zero-value-related operations according to the prediction result to obtain the convolution result.
7. The convolutional neural network acceleration system as claimed in claim 6, wherein module 3 determines whether the prediction result contains values less than or equal to 0; if so, it obtains the vector positions of those values in the prediction result, skips the computation related to those vector positions when performing the convolution computation to obtain an activation output result, and sets the values at those vector positions in the activation output result to zero to obtain the convolution result.
8. The convolutional neural network acceleration system as claimed in claim 6, wherein module 1 takes the high-order bits (sign bits) of the weights in the filter of the convolutional neural network as the sign vector.
9. The convolutional neural network acceleration system as claimed in claim 6, wherein the compensation factor takes a value greater than 0 and less than 1.
10. The convolutional neural network acceleration system as claimed in claim 6, wherein the convolution computation is given by the following formula:
O=∑I*W
where W is the filter weight, I is the input activation, and O is the convolution result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011147836.6A CN112288085B (en) | 2020-10-23 | 2020-10-23 | Image detection method and system based on convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011147836.6A CN112288085B (en) | 2020-10-23 | 2020-10-23 | Image detection method and system based on convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112288085A true CN112288085A (en) | 2021-01-29 |
CN112288085B CN112288085B (en) | 2024-04-09 |
Family
ID=74423769
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011147836.6A Active CN112288085B (en) | 2020-10-23 | 2020-10-23 | Image detection method and system based on convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112288085B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203617A (en) * | 2016-06-27 | 2016-12-07 | Harbin Institute of Technology Shenzhen Graduate School | A kind of acceleration processing unit based on convolutional neural networks and array structure
CN107506828A (en) * | 2016-01-20 | 2017-12-22 | Nanjing Aixi Information Technology Co., Ltd. | Computing device and method
US20180157969A1 (en) * | 2016-12-05 | 2018-06-07 | Beijing Deephi Technology Co., Ltd. | Apparatus and Method for Achieving Accelerator of Sparse Convolutional Neural Network
CN110991631A (en) * | 2019-11-28 | 2020-04-10 | Fuzhou University | Neural network acceleration system based on FPGA
CN111368699A (en) * | 2020-02-28 | 2020-07-03 | Institute of Interdisciplinary Information Core Technology (Xi'an) Co., Ltd. | Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111738435A (en) * | 2020-06-22 | 2020-10-02 | Shanghai Jiao Tong University | Online sparse training method and system based on mobile equipment
-
2020
- 2020-10-23 CN CN202011147836.6A patent/CN112288085B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107506828A (en) * | 2016-01-20 | 2017-12-22 | Nanjing Aixi Information Technology Co., Ltd. | Computing device and method
CN106203617A (en) * | 2016-06-27 | 2016-12-07 | Harbin Institute of Technology Shenzhen Graduate School | A kind of acceleration processing unit based on convolutional neural networks and array structure
US20180157969A1 (en) * | 2016-12-05 | 2018-06-07 | Beijing Deephi Technology Co., Ltd. | Apparatus and Method for Achieving Accelerator of Sparse Convolutional Neural Network
CN110991631A (en) * | 2019-11-28 | 2020-04-10 | Fuzhou University | Neural network acceleration system based on FPGA
CN111368699A (en) * | 2020-02-28 | 2020-07-03 | Institute of Interdisciplinary Information Core Technology (Xi'an) Co., Ltd. | Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111738435A (en) * | 2020-06-22 | 2020-10-02 | Shanghai Jiao Tong University | Online sparse training method and system based on mobile equipment
Non-Patent Citations (1)
Title |
---|
WEIZHI XU et al.: "Blocking and sparsity for optimization of convolution calculation algorithm on GPUs", arXiv *
Also Published As
Publication number | Publication date |
---|---|
CN112288085B (en) | 2024-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109740731B (en) | Design method of self-adaptive convolution layer hardware accelerator | |
JP2018109947A (en) | Device and method for increasing processing speed of neural network, and application of the same | |
US8019594B2 (en) | Method and apparatus for progressively selecting features from a large feature space in statistical modeling | |
CN111695671A (en) | Method and device for training neural network and electronic equipment | |
US8019593B2 (en) | Method and apparatus for generating features through logical and functional operations | |
US8533653B2 (en) | Support apparatus and method for simplifying design parameters during a simulation process | |
CN102724506B (en) | JPEG (joint photographic experts group)_LS (laser system) general coding hardware implementation method | |
Chen et al. | Approximate softmax functions for energy-efficient deep neural networks | |
CN115936248A (en) | Attention network-based power load prediction method, device and system | |
CN112288085B (en) | Image detection method and system based on convolutional neural network | |
CN114830137A (en) | Method and system for generating a predictive model | |
Nehmeh et al. | Integer word-length optimization for fixed-point systems | |
CN112215349B (en) | Sparse convolutional neural network acceleration method and device based on data flow architecture | |
CN113031952B (en) | Method, device and storage medium for determining execution code of deep learning model | |
CN115496181A (en) | Chip adaptation method, device, chip and medium of deep learning model | |
Lin et al. | A design framework for hardware approximation of deep neural networks | |
CN112395832B (en) | Text quantitative analysis and generation method and system based on sequence-to-sequence | |
JP7235171B2 (en) | Verification system and determination system for fixed-point arithmetic bit width | |
CN108345938A (en) | A kind of neural network processor and its method including bits switch device | |
US20230205957A1 (en) | Information processing circuit and method for designing information processing circuit | |
Ling et al. | A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs | |
US20220413806A1 (en) | Information processing circuit and method of designing information processing circuit | |
CN112015472B (en) | Sparse convolutional neural network acceleration method and system based on data flow architecture | |
CN117454948B (en) | FP32 model conversion method suitable for domestic hardware | |
US20230385600A1 (en) | Optimizing method and computing apparatus for deep learning network and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |