CN109389212A - Reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks - Google Patents
Reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks
Download PDF
Info
- Publication number
- CN109389212A
- Authority
- CN
- China
- Prior art keywords
- activation
- quantization
- reconfigurable
- pooling
- unit
- Prior art date
- 2018-12-30
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Abstract
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of the invention comprises: several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode; a memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations; and a memory, for buffering the convolutional-layer result data required by the pooling operation. The software optimization reduces the activation, quantization, and related steps of a low-bit-width convolutional neural network to a single step, removing redundant computation without changing the original function. Advantages: mapping the three steps of activation, quantization, and pooling onto the same hardware unit in a reconfigurable manner reduces hardware area; through hardware-software co-optimization, the system achieves small area, low power consumption, and high flexibility.
Description
Technical field
The invention belongs to the field of hardware acceleration of artificial-intelligence algorithms, and in particular relates to a reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks.
Background art
Low-bit-width convolutional neural networks generally denote convolutional neural networks quantized to 4 bits or fewer. Unlike conventional convolutional neural networks, their weights and image input data can be represented with only a few bits; examples include binarized networks, ternary networks, and other low-bit-width quantized neural networks. The weights and image input data of a binarized network are represented with only 0 or 1; the weights of a ternary network are represented with only 0 or 1, while its image input data is represented as -1, 0, or 1. Many other low-bit-width quantized neural networks use a fixed bit combination to express a particular number, e.g., the 2-bit pattern "01" expressing the value 0.5.
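As a minimal illustration of such a bit-combination encoding, the Python sketch below maps 2-bit codes to values. The patent gives only the single example of "01" encoding 0.5, so the uniform four-level table here is an assumption, chosen to match the k = 2 quantizer used later in the description.

```python
# Hypothetical 2-bit code-to-value table for a low-bit-width network.
# The uniform levels {0, 1/3, 2/3, 1} are an illustrative assumption;
# the patent's own example ("01" encoding 0.5) belongs to a different scheme.
CODE_TO_VALUE = {
    0b00: 0.0,
    0b01: 1.0 / 3.0,
    0b10: 2.0 / 3.0,
    0b11: 1.0,
}

def decode_2bit(code: int) -> float:
    """Return the real value represented by a 2-bit code."""
    return CODE_TO_VALUE[code & 0b11]

assert decode_2bit(0b10) == 2.0 / 3.0
```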
In a low-bit-width convolutional neural network, besides the convolutional layers, activation layers, and pooling layers found in a conventional network, a dedicated quantization operation is usually designed to re-quantize the generated image output data back to the originally chosen bit width.
In recent years, hardware designs for such low-bit-width convolutional neural networks have become increasingly common. A convolutional layer is usually processed by executing the following operations in sequence: convolution, batch normalization, activation, quantization, and pooling (some convolutional layers have no pooling operation); a fully-connected layer is usually processed by executing the fully-connected computation, batch normalization, activation, and quantization in sequence. Such serial operation, however, reduces processing efficiency and incurs additional hardware overhead, and thus cannot well meet the needs of practical applications.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art by providing a reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks that effectively improves the flexibility of the activation, quantization, and pooling operations while reducing power consumption and hardware overhead. It is realized by the following technical scheme:
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks receives convolutional-layer result data and comprises:
Several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode;
A memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations;
A memory, for buffering the convolutional-layer result data required by the pooling operation.
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that, in the activation-quantization mode, data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, which output the result data directly after processing; in the activation-quantization-pooling mode, the convolutional-layer result data is received and first stored in the memory, then, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units for processing, and the processing result is stored back to the memory.
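A minimal Python sketch of this routing, under illustrative assumptions: the `Mode` enum, the `route` function, and the read/write/process interfaces of the memory and the fused unit are not from the patent; they only mirror the two data flows described above.

```python
from enum import Enum

class Mode(Enum):
    ACT_QUANT = 0        # activation-quantization mode
    ACT_QUANT_POOL = 1   # activation-quantization-pooling mode

class _Mem:
    # Stand-in for the memory and its controller (interface assumed).
    def __init__(self):
        self.buf = None
    def write(self, d):
        self.buf = d
    def read(self):
        return self.buf

class _Unit:
    # Stand-in for a fused activation-quantization(-pooling) unit.
    def process(self, d):
        return d

def route(mode: Mode, conv_result, memory: _Mem, unit: _Unit):
    if mode is Mode.ACT_QUANT:
        # Results stream straight through the unit and out.
        return unit.process(conv_result)
    # Pooling mode: stage results in memory first, process them under
    # memory-controller control, and store the result back.
    memory.write(conv_result)
    result = unit.process(memory.read())
    memory.write(result)
    return result

print(route(Mode.ACT_QUANT_POOL, [0.9, 0.7, 0.3, 0.1], _Mem(), _Unit()))
```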
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the activation function in the reconfigurable activation-quantization-pooling processing unit is given by formula (1):
x_o = min(|x_i|, 1)    (1)
where x_i denotes the data after convolution and x_o denotes the activation value.
The quantization function in the reconfigurable activation-quantization-pooling processing unit is given by formula (2):
x_o = round((2^k - 1) · x_i) / (2^k - 1)    (2)
where k denotes the bit width after quantization, x_i here denotes the activation value, and x_o denotes the quantized value.
The corresponding pooling kernel size is 2×2, as in formula (3):
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1))    (3)
where i and j denote the coordinate position in the single-channel input image, x here denotes the quantized values, and x_o denotes the pooled value.
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the workflow of the system includes the following steps:
First, the operating mode is determined. If the operating mode is activation-quantization, the activation functions and quantization methods of different low-bit convolutional neural networks are analyzed to determine their characteristics and parameters; the overlapping, redundant part of the output ranges of the activation function and the quantization method is then identified and simplified away.
If the operating mode is activation-quantization-pooling, the pooling kernel size is analyzed on the basis of the activation-quantization algorithm optimization; after optimization, the pooling operation is merged into the activation-quantization operation to form a new fused activation-quantization-pooling operation.
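One observation behind this merging step (consistent with the description, though not stated there in these terms): because the quantization function is monotone non-decreasing, max pooling commutes with it, so the fused unit only has to quantize the largest activation of each region. A quick Python check:

```python
import numpy as np

def activate(x):
    return np.minimum(np.abs(x), 1.0)

def quantize(x, k=2):
    return np.round((2**k - 1) * x) / (2**k - 1)

# max(quantize(a)) == quantize(max(a)) for activated values a, since
# quantization is monotone; this is what lets the pooling step fold into
# the activation-quantization step.
rng = np.random.default_rng(1)
for _ in range(1000):
    region = activate(rng.uniform(-2.0, 2.0, 4))   # one activated 2x2 region
    assert quantize(region.max()) == quantize(region).max()
```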
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the memory supports ping-pong operation: one part of the memory stores the incoming data of the convolutional layer, while the other part stores the data needed by the reconfigurable activation-quantization-pooling processing units.
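A minimal Python sketch of such a ping-pong buffer; the bank granularity and the write/read/swap interface are illustrative assumptions.

```python
class PingPongBuffer:
    """While one bank receives incoming convolutional-layer data, the other
    feeds the fused processing units; the roles swap each round, so execution
    is never interrupted."""

    def __init__(self):
        self.banks = [[], []]
        self.write_bank = 0              # bank currently receiving conv results

    def write(self, data):
        self.banks[self.write_bank].append(data)

    def read_other(self):
        # The processing units drain the opposite bank.
        return self.banks[1 - self.write_bank]

    def swap(self):
        # At the end of a round, clear the drained bank and exchange roles.
        self.banks[1 - self.write_bank].clear()
        self.write_bank = 1 - self.write_bank

buf = PingPongBuffer()
buf.write("conv tile 0")                 # bank 0 fills
buf.swap()
buf.write("conv tile 1")                 # bank 1 fills while bank 0 drains
assert buf.read_other() == ["conv tile 0"]
```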
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the reconfigurable activation-quantization-pooling processing unit comprises three stage units, namely a first stage unit, a second stage unit, and a third stage unit. Each stage unit includes a comparator, gates, and a register. The two comparator inputs of the first stage unit are the external image input data and threshold 3; the two comparator inputs of the second stage unit are the data output by the first stage unit and threshold 2; the two comparator inputs of the third stage unit are the data output by the second stage unit and threshold 1.
The advantages of the invention are as follows:
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of the invention targets the characteristics of low-bit-width convolutional neural networks and realizes hardware-software co-optimization of activation, quantization, and pooling for multiple network types. The design method offers high flexibility, low computational complexity, small area, and low power consumption.
Brief description of the drawings
Fig. 1 is a block diagram of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks.
Fig. 2 is a schematic diagram of the reconfigurable activation-quantization-pooling processing unit.
Fig. 3 is a schematic diagram of the configuration of the reconfigurable activation-quantization-pooling processing unit.
Fig. 4 is a schematic diagram of unit operation in the activation-quantization-pooling mode.
Fig. 5 is a schematic diagram of unit operation in the activation-quantization mode.
Specific embodiments
The scheme of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of this embodiment comprises multiple reconfigurable activation-quantization-pooling processing units, a memory controller, and a memory. The reconfigurable activation-quantization-pooling units execute the activation, quantization, and pooling operations; the memory controller controls the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations; the memory buffers the convolutional-layer result data required by the pooling operation.
In Fig. 1, the dashed line indicates the data flow in the activation-quantization mode: data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, and the result data is output directly after processing. The solid line indicates the data flow in the activation-quantization-pooling mode: the convolutional-layer result data transferred to the module is first stored in the memory and, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units; the processing result is stored back to the memory. The memory supports ping-pong operation, to guarantee that execution is never interrupted.
The specific use of the design method is illustrated below with a concrete low-bit-width convolutional neural network. Its image input data bit width is 2 bits and its weight bit width is 1 bit; its activation function and quantization function are, respectively:
x_o = min(|x_i|, 1)    (4)
x_o = round((2^k - 1) · x_i) / (2^k - 1)    (5)
and its pooling kernel size is 2×2:
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1))    (6)
The system first optimizes the software algorithm according to the parameter k = 2 of this specific low-bit-width convolutional neural network. Analysis shows that the output range of the activation function is [0, 1], and that the input range of the quantization function is the output range of the activation function. The thresholds of the quantization function are 1/6, 1/2, and 5/6; after quantization, the output falls on one of the four values 0, 1/3, 2/3, and 1. The activation and quantization functions can therefore be replaced by the following series of comparisons: if the input is greater than 5/6, the quantized value is 1; if the input is greater than 1/2 and less than or equal to 5/6, the quantized value is 2/3; if the input is greater than 1/6 and less than or equal to 1/2, the quantized value is 1/3; and if the input is less than or equal to 1/6, the quantized value is 0.
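These comparisons can be checked against the rounding-based quantizer in a few lines of Python. This is a sketch; behavior exactly at the threshold values depends on the tie-breaking convention of the rounding, so the check skips inputs landing on a threshold.

```python
import numpy as np

def act_quant_reference(x, k=2):
    # Reference: activation min(|x|, 1) followed by rounding to 2^k levels.
    a = np.minimum(np.abs(x), 1.0)
    return np.round((2**k - 1) * a) / (2**k - 1)

def act_quant_fused(x):
    # Fused activation-quantization for k = 2, rewritten as the series of
    # comparisons given above (thresholds 1/6, 1/2, 5/6).
    a = min(abs(x), 1.0)
    if a > 5/6:
        return 1.0
    if a > 1/2:
        return 2/3
    if a > 1/6:
        return 1/3
    return 0.0

rng = np.random.default_rng(0)
for v in rng.uniform(-2.0, 2.0, 10000):
    a = min(abs(v), 1.0)
    if min(abs(a - t) for t in (1/6, 1/2, 5/6)) < 1e-9:
        continue   # ties at the thresholds depend on the rounding convention
    assert np.isclose(act_quant_fused(v), act_quant_reference(v))
```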
The system then optimizes the hardware. Because the bit width of the image input data of this network is 2 bits, the number of stages of the multi-stage pipelined architecture is 3, as shown in Fig. 2. Each stage unit contains one comparator, several gates, and a register. The two comparator inputs of stage unit 1 are the external image input data and threshold 3; the two comparator inputs of stage unit 2 are the data output by stage unit 1 and threshold 2; the two comparator inputs of stage unit 3 are the data output by stage unit 2 and threshold 1. The configuration word in the figure sets the operating mode of the unit: when the configuration word is 1, the unit operates in the activation-quantization-pooling mode; when the configuration word is 0, the unit operates in the activation-quantization mode. The specific configuration values are shown in Fig. 3.
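A behavioral (not cycle-accurate) Python model of this unit may make the two modes concrete. The gate and enable timing of Fig. 4 is abstracted away; the thresholds are assumed to be the k = 2 values 5/6, 1/2, 1/6 standing in for thresholds 3, 2, 1, and the thermometer decoding into quantization levels is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import List

THRESHOLDS = (5/6, 1/2, 1/6)    # stage 1, stage 2, stage 3 ("thresholds 3, 2, 1")
LEVELS = (1.0, 2/3, 1/3, 0.0)   # quantized value selected by the first hit

def stage_quantize(x: float) -> float:
    """Pass x through the three comparator stages; the first threshold it
    exceeds determines the quantization level (thermometer decoding)."""
    a = min(abs(x), 1.0)        # activation folded into the comparison chain
    for level, th in zip(LEVELS, THRESHOLDS):
        if a > th:
            return level
    return LEVELS[-1]

@dataclass
class FusedUnit:
    config_word: int = 0        # 0: activation-quantization; 1: with pooling

    def process(self, pixels: List[float]) -> List[float]:
        if self.config_word == 0:
            # Activation-quantization mode: every input yields an output.
            return [stage_quantize(p) for p in pixels]
        # Pooling mode: only the largest quantized value of the 2x2 region
        # survives; in hardware, stages holding a higher comparison result
        # gate off later (smaller) pixels, the shutdown behavior of Fig. 4.
        return [max(stage_quantize(p) for p in pixels)]

# Usage mirroring the Fig. 4 walkthrough: a > th3 > b > th2 > c > th1 > d.
a, b, c, d = 0.9, 0.7, 0.3, 0.1
unit = FusedUnit(config_word=1)
print(unit.process([a, b, c, d]))    # [1.0]: quantization result of pixel a
unit.config_word = 0
print(unit.process([a, b, c, d]))    # [1.0, 0.666..., 0.333..., 0.0]
```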
The specific execution process of the unit in the activation-quantization-pooling mode is shown in Fig. 4. The comparator outputs 1 when its upper operand is greater than its lower operand, and 0 otherwise. Suppose (a, b, c, d) are the four pixel values of a 2×2 region of the image, and the relation between the three thresholds and these four values is: a > threshold 3 > b > threshold 2 > c > threshold 1 > d. At time step 1, pixel value a is compared with threshold 3; comparison 1 is set to 1 and latched by enable 1, and quantized value 4, the quantization result of a, is stored in register 1.
At time step 2, pixel b enters stage unit 1. Because enable 1 was set to 1 at the previous time step, comparison 1 remains 1 regardless of the relation between b and threshold 3, so the value in register 1 does not change. At the same time, quantized value 4 in register 1 is passed into stage unit 2 and compared with threshold 2; although it is greater than threshold 2, gate 2-2 is controlled by enable 1 from stage unit 1 and remains 0, so the number stored in register 2 is still the result of the previous stage unit's comparison with threshold 3. And so on: when the output enable signal goes high, the threshold-3 comparison result held in register 3 is output as the pooling result of this region. In Fig. 4, a gray background marks stage units that have been shut down.
The specific execution process of the unit in the activation-quantization mode is shown in Fig. 5. After the configuration word is set to 0, the comparison signal is no longer affected by the enable signal of the same stage unit at the previous time step, but is still controlled by the enable signal of the previous stage unit. Because in this mode the quantized value of every input must be output, there is no need, unlike in the activation-quantization-pooling mode, to shut down the processing of the smaller values within a region.
Whenever the input of some stage unit is greater than its threshold, the enable signal is set to 1 so as to shut down subsequent operations in both directions: vertically, the processing of the remaining image input data; horizontally, the operations of the remaining stage units on the current image input data.
The system is thus optimized from both the software and the hardware angle: based on a multi-stage pipelined architecture, supported by reconfigurable technology, and guided by stage-shutdown low-power techniques, it reduces the power consumption and area of the activation-quantization-pooling module while improving its flexibility.
The above are only preferred embodiments of the present invention and do not limit the invention in any other form. Any person skilled in the art may use the technical content disclosed above to make changes or modifications into equivalent embodiments. However, any simple modification, equivalent variation, or adaptation of the above embodiments made according to the technical essence of the present invention, without departing from the technical scheme of the present invention, still falls within the protection scope of the technical scheme of the present invention.
Claims (6)
1. A reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks, receiving convolutional-layer result data, characterized by comprising:
several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode;
a memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations;
a memory, for buffering the convolutional-layer result data required by the pooling operation.
2. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: in the activation-quantization mode, data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, which output the result data directly after processing; in the activation-quantization-pooling mode, the convolutional-layer result data is received and first stored in the memory, then, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units for processing, and the processing result is stored back to the memory.
3. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the activation function in the reconfigurable activation-quantization-pooling processing unit is given by formula (1):
x_o = min(|x_i|, 1),    (1)
where x_i denotes the data after convolution and x_o denotes the activation value;
the quantization function in the reconfigurable activation-quantization-pooling processing unit is given by formula (2):
x_o = round((2^k - 1) · x_i) / (2^k - 1),    (2)
where k denotes the bit width after quantization, x_i here denotes the activation value, and x_o denotes the quantized value; and the corresponding pooling kernel size is 2×2, as in formula (3):
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1)),    (3)
where i and j denote the coordinate position in the single-channel input image, x here denotes the quantized values, and x_o denotes the pooled value.
4. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the workflow of the system includes the following steps: first, the operating mode is determined; if the operating mode is activation-quantization, the activation functions and quantization methods of different low-bit convolutional neural networks are analyzed to determine their characteristics and parameters, and the overlapping, redundant part of the output ranges of the activation function and the quantization method is then identified and simplified away; if the operating mode is activation-quantization-pooling, the pooling kernel size is analyzed on the basis of the activation-quantization algorithm optimization, and after optimization the pooling operation is merged into the activation-quantization operation to form a new activation-quantization-pooling operation.
5. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 4, characterized in that: the memory supports ping-pong operation, one part of the memory storing the incoming data of the convolutional layer while the other part stores the data needed by the reconfigurable activation-quantization-pooling processing units.
6. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the reconfigurable activation-quantization-pooling processing unit comprises three stage units, namely a first stage unit, a second stage unit, and a third stage unit, each stage unit including a comparator, gates, and a register; the two comparator inputs of the first stage unit are the external image input data and threshold 3; the two comparator inputs of the second stage unit are the data output by the first stage unit and threshold 2; the two comparator inputs of the third stage unit are the data output by the second stage unit and threshold 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811646433.9A CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811646433.9A CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109389212A true CN109389212A (en) | 2019-02-26 |
CN109389212B CN109389212B (en) | 2022-03-25 |
Family
ID=65430886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811646433.9A Active CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109389212B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017124645A1 (en) * | 2016-01-20 | 2017-07-27 | 北京中科寒武纪科技有限公司 | Apparatus for processing floating point number |
CN108364061A (en) * | 2018-02-13 | 2018-08-03 | 北京旷视科技有限公司 | Arithmetic device, arithmetic execution apparatus, and arithmetic execution method |
CN108510067A (en) * | 2018-04-11 | 2018-09-07 | 西安电子科技大学 | Convolutional neural network quantization method based on engineering implementation |
CN108647779A (en) * | 2018-04-11 | 2018-10-12 | 复旦大学 | Reconfigurable computing unit for low-bit-width convolutional neural networks |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112189205A (en) * | 2019-02-27 | 2021-01-05 | 华为技术有限公司 | Neural network model processing method and device |
WO2020172829A1 (en) * | 2019-02-27 | 2020-09-03 | 华为技术有限公司 | Method and apparatus for processing neural network model |
CN111767204A (en) * | 2019-04-02 | 2020-10-13 | 杭州海康威视数字技术股份有限公司 | Overflow risk detection method, device and equipment |
CN111767204B (en) * | 2019-04-02 | 2024-05-28 | 杭州海康威视数字技术股份有限公司 | Overflow risk detection method, device and equipment |
CN110222815B (en) * | 2019-04-26 | 2021-09-07 | 上海酷芯微电子有限公司 | Configurable activation function device and method suitable for deep learning hardware accelerator |
CN110222815A (en) * | 2019-04-26 | 2019-09-10 | 上海酷芯微电子有限公司 | Configurable activation function device and method suitable for deep learning hardware accelerators |
CN110390385B (en) * | 2019-06-28 | 2021-09-28 | 东南大学 | BNRP-based configurable parallel general convolutional neural network accelerator |
CN110390385A (en) * | 2019-06-28 | 2019-10-29 | 东南大学 | Configurable parallel general-purpose convolutional neural network accelerator based on BNRP |
CN110718211A (en) * | 2019-09-26 | 2020-01-21 | 东南大学 | Keyword recognition system based on hybrid compressed convolutional neural network |
CN113762496A (en) * | 2020-06-04 | 2021-12-07 | 合肥君正科技有限公司 | Method for reducing inference operation complexity of low-bit convolutional neural network |
CN113762496B (en) * | 2020-06-04 | 2024-05-03 | 合肥君正科技有限公司 | Method for reducing inference operation complexity of low-bit convolutional neural networks |
WO2023004800A1 (en) * | 2021-07-30 | 2023-02-02 | 华为技术有限公司 | Neural network post-processing method and apparatus, chip, electronic device, and storage medium |
CN114169513A (en) * | 2022-02-11 | 2022-03-11 | 深圳比特微电子科技有限公司 | Neural network quantization method and device, storage medium and electronic equipment |
CN114169513B (en) * | 2022-02-11 | 2022-05-24 | 深圳比特微电子科技有限公司 | Neural network quantization method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109389212B (en) | 2022-03-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||