CN110232441A - Stacked autoencoder system and method based on a unidirectional systolic array - Google Patents
Stacked autoencoder system and method based on a unidirectional systolic array
- Publication number
- CN110232441A (application number CN201910528794.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- unidirectional
- control module
- module
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The hardware implementation of stacked autoencoder inference based on a unidirectional systolic array according to the invention comprises a signal control module, an input/output control module, a data address generation module and a computing array module. Signal control module: receives the start signal, controls communication between the modules, and generates the end signal. Input/output control module: on input, reads data from the off-chip DDR and stores it in the on-chip SRAM in a specific manner; on output, writes the on-chip SRAM data back to the DDR in a specific manner. Data address generation module: generates source data or result data addresses. Computing array module: performs the inference operations of the neural network algorithm in the manner of a unidirectional systolic array. The invention supports batch processing and pipelining, partially hides computation time and memory access time through ping-pong operation, achieves a high speedup, and scales well.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a stacked autoencoder system and method based on a unidirectional systolic array.
Background
A stacked denoising autoencoder is a typical standard neural network with two main components: a series of autoencoders, and a multilayer perceptron (MLP). The inference process of a stacked denoising autoencoder is in fact equivalent to the feedforward process of a multilayer perceptron. Let the output of the j-th neuron of a given layer be y_j, let there be n operands, let the i-th operand be x_i with corresponding weight w_ij, and let the bias of the neuron be b_j; then:
$$y_j = f\left(\sum_{i=1}^{n} w_{ij}\, x_i + b_j\right)$$
where f denotes the activation function.
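As an illustration only, the feedforward relation for one layer can be sketched in a few lines of Python. The function names (`neuron_output`, `layer_output`, `relu`) and the sample values are assumptions introduced for this sketch and do not come from the patent; the activation function is passed in as a parameter.

```python
def relu(v):
    """Rectified linear activation."""
    return max(0.0, v)


def neuron_output(x, w_j, b_j, f):
    """Output of neuron j: f(sum_i w_ij * x_i + b_j)."""
    acc = b_j
    for x_i, w_ij in zip(x, w_j):
        acc += w_ij * x_i
    return f(acc)


def layer_output(x, weights, biases, f):
    """Outputs of all neurons of one fully connected layer."""
    return [neuron_output(x, w_j, b_j, f) for w_j, b_j in zip(weights, biases)]


# Hypothetical sample data: 3 operands, 2 neurons.
x = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.25], [1.0, 1.0, 1.0]]
biases = [1.25, -4.0]
y = layer_output(x, weights, biases, relu)
print(y)
```

Stacking several such layers, each fed with the previous layer's output, gives the multilayer feedforward pass that the hardware accelerates.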
Such a computation-intensive algorithm needs powerful computing resources to support it. Before 2007, given the limited network sizes and data volumes of the time, general-purpose CPU chips could provide sufficient computing power. Later, with the rapid development of GPUs, their parallel computing characteristics happened to match the large-scale parallel computing requirements of artificial intelligence algorithms, so GPUs became mainstream. Structurally, about 70% of the transistors in a CPU are used to build the cache and the control unit, and there are few arithmetic logic units (ALUs), which makes it hard to meet the computing power demands of artificial intelligence algorithms. The computing capability of a GPU far exceeds that of a CPU, but the hardware structure of a GPU is not programmable: if the artificial intelligence algorithm changes substantially, the GPU cannot flexibly reconfigure its hardware structure. In addition, both GPUs and CPUs consume a relatively large amount of energy.
Nowadays, as technology advances and more and more application scenarios appear, the demand for artificial intelligence chips keeps growing, and the development of artificial intelligence faces new problems. For example, driverless cars require real-time responses with extremely low latency, which rules out high-power, high-cost GPUs.
How to handle the huge computational load of deep learning within acceptable power and cost limits, so that neural networks achieve better performance, lower power consumption and better scalability, is a major technical problem in artificial intelligence today.
Summary of the invention
The present invention aims to overcome the problems of the prior art. To improve the efficiency of neural network computation, make full use of storage and computing resources, and accelerate inference, it provides a stacked autoencoder system based on a unidirectional systolic array, realized by the following technical solution:
The stacked autoencoder system based on a unidirectional systolic array comprises:
Signal control module: receives the start signal, controls communication between the modules, and generates the end signal;
Storage module: comprises an off-chip DDR memory and an on-chip SRAM memory;
Input/output control module: on input, reads data from the off-chip DDR memory and stores it sequentially in the on-chip SRAM memory; on output, writes the data of the on-chip SRAM memory sequentially back to the off-chip DDR memory;
Data address generation module: generates the addresses of source data or result data;
Computing array module: performs the inference operations of the neural network algorithm in the manner of a unidirectional systolic array.
In a further design of the stacked autoencoder system based on a unidirectional systolic array, all results of the neural network algorithm share the same set of storage resources, and the storage locations occupied by intermediate results of the algorithm may be overwritten.
In a further design of the stacked autoencoder system based on a unidirectional systolic array, the computing array module comprises a unidirectional systolic array of size 32x32. Each independent computing unit in the array comprises a 16-bit fixed-point multiplier, an adder, a divider, a linear activation function computing unit that supports the ReLU function, and a nonlinear activation function computing unit that supports the tanh and sigmoid functions, so as to realize multiply-accumulate and neural network activation function computation.
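A minimal software sketch of one such computing unit follows. It assumes a Q8.8 layout (8 fractional bits) for the 16-bit fixed-point values, a wide accumulator, and floating-point math as a stand-in for the tanh and sigmoid hardware. The patent specifies none of these details, and all names are hypothetical.

```python
import math

FRAC_BITS = 8  # assumption: Q8.8 layout; the source only states "16-bit fixed point"
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(q):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, q))


def to_fixed(value):
    """Quantize a float to a signed 16-bit fixed-point integer."""
    return saturate(int(round(value * (1 << FRAC_BITS))))


def to_float(q):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


def mac(operands, weights, bias):
    """Multiply-accumulate plus bias, returned as 16-bit fixed point."""
    acc = bias << FRAC_BITS  # align the bias with the double-width products
    for x, w in zip(operands, weights):
        acc += x * w  # a product of two Q8.8 values has 16 fractional bits
    return saturate(acc >> FRAC_BITS)  # rescale; rounds toward negative infinity


def relu(q):
    """Linear activation unit: ReLU on a fixed-point value."""
    return q if q > 0 else 0


def sigmoid(q):
    """Nonlinear activation unit; float math stands in for the hardware."""
    return to_fixed(1.0 / (1.0 + math.exp(-to_float(q))))


def tanh(q):
    """Nonlinear activation unit; float math stands in for the hardware."""
    return to_fixed(math.tanh(to_float(q)))


x = [to_fixed(1.5), to_fixed(-2.0)]
w = [to_fixed(0.5), to_fixed(0.25)]
y = relu(mac(x, w, to_fixed(1.0)))
print(to_float(y))  # evaluates 1.5*0.5 + (-2.0)*0.25 + 1.0 in fixed point
```

A real design would more likely use lookup tables or piecewise approximations for tanh and sigmoid; the float calls here only keep the sketch short.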
In a further design of the stacked autoencoder system based on a unidirectional systolic array, the unidirectional systolic array uses a unidirectional systolic data transmission scheme with pulsing along columns and broadcasting along rows. Specifically: operands are transmitted row by row, each one simultaneously to all computing units of a row; weights enter column by column, passing sequentially through the computing units of a column; both weights and operands are reused many times.
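The benefit of this reuse can be estimated with a rough count. The sketch below assumes that each row of the array serves one batch and each column one neuron, as in the method steps of this document; the function name and the "no reuse" baseline (one operand read and one weight read per multiply-accumulate) are assumptions made for illustration.

```python
def array_traffic(batches, neurons, operand_len):
    """Count multiply-accumulates and memory reads for one layer."""
    macs = batches * neurons * operand_len
    operand_reads = batches * operand_len  # each operand is broadcast to a whole row
    weight_reads = neurons * operand_len   # each weight pulses down a whole column
    return {
        "macs": macs,
        "reads_with_reuse": operand_reads + weight_reads,
        "reads_without_reuse": 2 * macs,
    }


# Figures taken from the embodiment: 3 batches, 32 neurons, 128 operands.
stats = array_traffic(batches=3, neurons=32, operand_len=128)
print(stats)
```

Under these assumptions the number of reads grows with the sum, not the product, of the batch count and the neuron count.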
An autoencoding method using the stacked autoencoder system based on a unidirectional systolic array comprises the following steps:
Step 1) The signal control module receives the algorithm start signal and controls the input/output control module to transfer the input data from the DDR memory to the SRAM memory in a specific order;
Step 2) The signal control module controls the data address generation module to generate source data addresses; the operands stored in the SRAM memory are passed to the computing array module according to the source data addresses, and an input-data-valid signal is generated and passed in;
Step 3) After receiving the input-data-valid signal and reading the operands from the SRAM memory, the computing array module starts the neural network inference computation. During computation: for each column, the weights corresponding to different neurons flow from top to bottom into the computing units; for each row, the input data of one batch is broadcast from left to right to the computing units of the computing array; the computation is completed inside each computing unit;
Step 4) The computing array module generates an output-data-valid signal; after receiving it, the signal control module controls the data address generation module to generate result data addresses, and the result data is written to the SRAM memory according to the result data addresses;
Step 5) The signal control module controls the input/output control module to write the result data from the on-chip SRAM memory to the off-chip DDR memory and generates the end signal, completing one full neural network inference computation.
In a further design of the autoencoding method, the input data in step 1) comprises operands and weights; each address of a storage unit in the SRAM memory can hold four 16-bit fixed-point values, and operands and weights are stored sequentially in the storage units.
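The storage layout can be sketched as follows: four signed 16-bit values are packed into one 64-bit word, and a sequential element index maps to a word address and a lane. The lane order (element 0 in the lowest 16 bits) and the function names are assumptions; the patent only states that each address holds four 16-bit fixed-point values.

```python
def pack4(values):
    """Pack four signed 16-bit integers into one 64-bit word."""
    if len(values) != 4:
        raise ValueError("exactly four values per word")
    word = 0
    for lane, v in enumerate(values):
        if not -32768 <= v <= 32767:
            raise ValueError("value out of 16-bit range")
        word |= (v & 0xFFFF) << (16 * lane)  # assumption: lane 0 in the low bits
    return word


def unpack4(word):
    """Recover the four signed 16-bit integers from a 64-bit word."""
    out = []
    for lane in range(4):
        raw = (word >> (16 * lane)) & 0xFFFF
        out.append(raw - 0x10000 if raw & 0x8000 else raw)  # sign-extend
    return out


def locate(index):
    """Map a sequential element index to (word address, lane)."""
    return divmod(index, 4)


word = pack4([1, -1, 256, -32768])
print(hex(word), unpack4(word), locate(127))
```

With this layout a 128-element operand vector occupies 32 consecutive addresses.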
Beneficial effects of the present invention:
In the hardware implementation of stacked autoencoder inference based on a unidirectional systolic array according to the invention, the number of neural network layers and the number of neurons are configurable, and three different inter-layer activation functions can be selected. The implementation supports pipelining, ping-pong operation and batch processing, is flexible in application, and scales well.
Brief description of the drawings
Fig. 1 is a schematic diagram of a typical stacked autoencoder network model.
Fig. 2 is a schematic diagram of the stacked autoencoder system based on a unidirectional systolic array.
Fig. 3 is a schematic diagram of the data flow in the unidirectional systolic array.
Fig. 4 is a schematic diagram of the operand storage layout.
Fig. 5 is a schematic diagram of the weight storage layout.
Fig. 6 is a schematic diagram of the output storage layout.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, this embodiment takes a typical standard neural network as an example. The layers are fully connected: each neuron receives input from all neurons of the previous layer and is connected to all neurons of the next layer. Inputs are propagated through the weighted connections and the bias of each neuron, and the output of a neuron is determined by the weights of the current neuron, the bias of the current neuron, and the outputs of the neurons of the previous layer.
The stacked autoencoder system based on a unidirectional systolic array of this embodiment consists mainly of a signal control module, an input/output control module, a data address generation module and a computing array module.
The relationship between the modules is shown in Fig. 2. The signal control module is responsible for receiving the start signal, controlling communication between the modules, and generating the end signal. Specifically, it receives the start signal; controls the input/output control module to load the source data; controls the data address generation module to generate source data addresses, according to which the source data is read from the storage units and passed to the computing array module for computation; after the computation, receives the output-data-valid signal; controls the data address generation module to generate result data addresses; controls the input/output control module to write out the result data; and generates the end signal.
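The control sequence described for the signal control module can be pictured as a small finite-state machine. The state names and event names below are hypothetical labels chosen for this sketch; the patent describes the order of operations but does not name states or give an encoding.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LOAD_INPUT = auto()    # DDR -> SRAM transfer of source data
    FEED_ARRAY = auto()    # source addresses generated, operands sent to the array
    COMPUTE = auto()       # computing array is running
    STORE_RESULT = auto()  # result addresses generated, results written to SRAM
    WRITE_BACK = auto()    # SRAM -> DDR transfer of results
    DONE = auto()          # end signal generated


# (current state, event) -> next state; event names are assumptions.
TRANSITIONS = {
    (State.IDLE, "start"): State.LOAD_INPUT,
    (State.LOAD_INPUT, "load_done"): State.FEED_ARRAY,
    (State.FEED_ARRAY, "input_valid"): State.COMPUTE,
    (State.COMPUTE, "output_valid"): State.STORE_RESULT,
    (State.STORE_RESULT, "store_done"): State.WRITE_BACK,
    (State.WRITE_BACK, "write_done"): State.DONE,
}


def step(state, event):
    """Advance the controller; unexpected events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events):
    """Feed a sequence of events to the controller, starting from IDLE."""
    state = State.IDLE
    for event in events:
        state = step(state, event)
    return state


final = run(["start", "load_done", "input_valid",
             "output_valid", "store_done", "write_done"])
print(final)
```

Ignoring unexpected events is a design choice of this sketch; a hardware controller might instead raise an error flag.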
The input/output control module is responsible for communication between the on-chip SRAM and the off-chip DDR storage. Specifically, after receiving a request signal, it reads data from the off-chip DDR and passes it into the on-chip SRAM according to a specific rule and order, for use by the computing array. After all computation has finished and the end signal has been received, it reads data from the on-chip SRAM and passes it to the off-chip DDR according to a specific rule and order.
Data address generation module: before computation, it generates and outputs the addresses of the source data (operands and weights); after computation, it generates and outputs the addresses of the output data (that is, the result data).
The computing array module is designed as a unidirectional systolic array, which performs all multiply-accumulate operations of the neural network inference. Specifically, it receives the input-data-valid signal from the signal control module and starts the neural network inference operation; when the computation finishes, it generates an output-result-valid signal and sends it to the signal control module.
A concrete implementation example is given below with reference to Fig. 3. In this example, the neural network inference computing module consists of one 32x32 unidirectional systolic array, and the storage module consists of 128 data storage units and 32 constant storage units. Each data storage unit is an SRAM with a width of 64 bits and a depth of 8k; each constant storage unit is an SRAM with a width of 64 bits and a depth of 1k. The computation uses 16-bit fixed-point numbers, the number of operands is 128, the number of hidden-layer neurons is 32, and the number of batches (batch) is set to 3.
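For orientation, the on-chip storage implied by these figures can be worked out directly. The sketch assumes that "8k" and "1k" mean 8192 and 1024 words; the helper name is hypothetical.

```python
K = 1024  # assumption: "8k" and "1k" denote 8 * 1024 and 1 * 1024 words


def sram_bytes(width_bits, depth_words, count):
    """Total capacity in bytes of `count` identical SRAM units."""
    return (width_bits // 8) * depth_words * count


data_bytes = sram_bytes(64, 8 * K, 128)   # data storage units
const_bytes = sram_bytes(64, 1 * K, 32)   # constant storage units
values_per_data_unit = 8 * K * 4          # four 16-bit values per 64-bit word
words_per_operand_vector = 128 // 4       # one 128-element operand vector

print(data_bytes // (K * K), "MiB of data SRAM")
print(const_bytes // K, "KiB of constant SRAM")
print(values_per_data_unit, "values per data unit")
print(words_per_operand_vector, "words per operand vector")
```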
The specific steps of this embodiment are as follows:
Step 1) The signal control module receives the algorithm start signal and controls the input/output control module to transfer the input data from the DDR to the SRAM in a specific order. The storage layout of the input data (operands and weights) in the SRAM is shown in Fig. 5 and Fig. 6: each address of a storage unit (64 bits) can hold four 16-bit fixed-point values, and operands and weights are stored sequentially in the storage units.
Step 2) After the data transfer of step 1) has finished, the signal control module controls the data address generation module to generate source data addresses; the operands stored in the SRAM are passed to the computing array module according to the source data addresses, and an input-data-valid signal is generated and passed in.
Step 3) After receiving the input-data-valid signal and the operands read from the SRAM, the computing array module starts the neural network inference computation. For the data flow during computation, see Fig. 4:
Let X(1), X(2), X(3) be the operands of batch 1, batch 2 and batch 3 respectively; the operands of all three batches are vectors of length 128. Let W1, W2, W3, ..., W32 be the weights corresponding to hidden-layer neuron 1, neuron 2, neuron 3, ..., neuron 32 respectively; they are all vectors of length 128. For example, W1 = (W1_1, W1_2, W1_3, ..., W1_128). The computing unit in row i and column j is denoted PE(i, j).
During computation, for each column, the weights corresponding to different neurons flow from top to bottom into the computing units (MLUs); for each row, the input data of one batch is broadcast from left to right to the computing units. The computation is completed inside each computing unit, and its main step is multiply-accumulate.
Take MLU(1,1) in the first row as an example. The input data (operands) enter the systolic array in row order and are broadcast along the same row; the corresponding weights W1_1, W1_2, W1_3, ..., W1_128 enter the array in column order and flow down the same column. The multiply-accumulate of operands and weights is completed inside MLU(1,1). Once it has finished, the result is added to the bias b1 corresponding to the current MLU and passed through the activation function computing unit (AU), which completes the computation of the output of the 1st neuron. Similarly, MLU(1,2), MLU(1,3) through MLU(1,32) complete the computation of the outputs of the 2nd through 32nd neurons respectively. For the MLUs of the second and third rows, the computation is identical to that of the first row, but the result is obtained one clock cycle later.
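The data flow just described can be checked with a small cycle-level model. The sketch assumes that each row lags the row above it by exactly one clock cycle, that the bias and activation are applied after the last multiply-accumulate, and that plain Python integers stand in for the 16-bit fixed-point hardware; the function and variable names are hypothetical.

```python
def simulate(X, W, b, f):
    """Cycle-level model: rows hold batches, columns hold neurons."""
    rows, cols, n = len(X), len(W), len(X[0])
    acc = [[0] * cols for _ in range(rows)]       # accumulator of each unit
    held = [[None] * cols for _ in range(rows)]   # weight inside each unit
    finished = [None] * rows                      # cycle of the last MAC per row
    for t in range(n + rows - 1):
        # Weights pulse one row downward per cycle (top to bottom only).
        for r in range(rows - 1, 0, -1):
            held[r] = held[r - 1]
        held[0] = [W[j][t] for j in range(cols)] if t < n else [None] * cols
        for r in range(rows):
            k = t - r  # index of the operand broadcast on row r this cycle
            if 0 <= k < n:
                for j in range(cols):
                    acc[r][j] += X[r][k] * held[r][j]
                if k == n - 1:
                    finished[r] = t
    Y = [[f(acc[r][j] + b[j]) for j in range(cols)] for r in range(rows)]
    return Y, finished


def relu(v):
    return max(0, v)


# Hypothetical small case: 3 batches, 2 neurons, 4 operands each.
X = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2]]
W = [[1, 0, 1, 0], [1, 1, 1, 1]]
b = [0, -5]
Y, finished = simulate(X, W, b, relu)
print(Y, finished)
```

The outputs match a direct matrix product followed by bias and activation, and `finished` shows each row completing one cycle after the row above it.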
At this point, the computing array has completed all computations of the neural network inference.
Step 4) The computing array module generates an output-data-valid signal; after receiving it, the signal control module controls the data address generation module to generate result data addresses, and the result data is written to the SRAM according to these addresses.
Step 5) The signal control module controls the input/output control module to write the results from the on-chip SRAM memory to the off-chip DDR memory and generates the end signal, completing one full neural network inference computation.
The above are merely preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.
Claims (6)
1. A stacked autoencoder system based on a unidirectional systolic array, characterized by comprising:
Signal control module: receives the start signal, controls communication between the modules, and generates the end signal;
Storage module: comprises an off-chip DDR memory and an on-chip SRAM memory;
Input/output control module: on input, reads data from the off-chip DDR memory and stores it sequentially in the on-chip SRAM memory; on output, writes the result data of the on-chip SRAM memory sequentially back to the off-chip DDR memory;
Data address generation module: generates the addresses of source data or result data;
Computing array module: performs the inference operations of the neural network algorithm in the manner of a unidirectional systolic array.
2. The stacked autoencoder system based on a unidirectional systolic array according to claim 1, characterized in that: all results of the neural network algorithm share the same set of storage resources, and the storage locations occupied by intermediate results of the algorithm may be overwritten.
3. The stacked autoencoder system based on a unidirectional systolic array according to claim 1, characterized in that: the computing array module comprises a unidirectional systolic array of size 32x32, in which each independent computing unit comprises a 16-bit fixed-point multiplier, an adder, a divider, a linear activation function computing unit that supports the ReLU function, and a nonlinear activation function computing unit that supports the tanh and sigmoid functions, so as to realize multiply-accumulate and neural network activation function computation.
4. The stacked autoencoder system based on a unidirectional systolic array according to claim 1, characterized in that: the unidirectional systolic array uses a unidirectional systolic data transmission scheme with pulsing along columns and broadcasting along rows, specifically: operands are transmitted row by row, each one simultaneously to all computing units of a row; weights enter column by column, passing sequentially through the computing units of a column; both weights and operands are reused many times.
5. An autoencoding method using the stacked autoencoder system based on a unidirectional systolic array according to any one of claims 1 to 4, characterized by comprising the following steps:
Step 1) The signal control module receives the algorithm start signal and controls the input/output control module to transfer the input data from the DDR memory to the SRAM memory in a specific order;
Step 2) The signal control module controls the data address generation module to generate source data addresses; the operands stored in the SRAM memory are passed to the computing array module according to the source data addresses, and an input-data-valid signal is generated and passed in;
Step 3) After receiving the input-data-valid signal and reading the operands from the SRAM memory, the computing array module starts the neural network inference computation. During computation: for each column, the weights corresponding to different neurons flow from top to bottom into the computing units; for each row, the input data of one batch is broadcast from left to right to the computing units of the computing array; the computation is completed inside each computing unit;
Step 4) The computing array module generates an output-data-valid signal; after receiving it, the signal control module controls the data address generation module to generate result data addresses, and the result data is written to the SRAM memory according to the result data addresses;
Step 5) The signal control module controls the input/output control module to write the result data from the on-chip SRAM memory to the off-chip DDR memory and generates the end signal, completing one full neural network inference computation.
6. The autoencoding method according to claim 5, characterized in that: the input data in step 1) comprises operands and weights; each address of a storage unit in the SRAM memory can hold four 16-bit fixed-point values, and operands and weights are stored sequentially in the storage units.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910528794.1A CN110232441B (en) | 2019-06-18 | 2019-06-18 | Stack type self-coding system and method based on unidirectional pulsation array |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910528794.1A CN110232441B (en) | 2019-06-18 | 2019-06-18 | Stack type self-coding system and method based on unidirectional pulsation array |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110232441A (en) | 2019-09-13 |
CN110232441B (en) | 2023-05-09 |
Family
ID=67859718
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910528794.1A (Active, CN110232441B (en)) | Stack type self-coding system and method based on unidirectional pulsation array | 2019-06-18 | 2019-06-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110232441B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689123A (en) * | 2019-09-27 | 2020-01-14 | 南京大学 | Long-short term memory neural network forward acceleration system and method based on pulse array |
CN111401522A (en) * | 2020-03-12 | 2020-07-10 | 上海交通大学 | Variable speed pulsating array speed control method and variable speed pulsating array micro-frame |
CN111401532A (en) * | 2020-04-28 | 2020-07-10 | 南京宁麒智能计算芯片研究院有限公司 | Convolutional neural network reasoning accelerator and acceleration method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140163664A1 (en) * | 2006-11-21 | 2014-06-12 | David S. Goldsmith | Integrated system for the ballistic and nonballistic infixion and retrieval of implants with or without drug targeting |
CN104319773A (en) * | 2014-11-25 | 2015-01-28 | 常熟市五爱电器设备有限公司 | Solar energy and electric supply flexible complementation power supply system |
CN108710943A (en) * | 2018-05-21 | 2018-10-26 | 南京大学 | A kind of multilayer feedforward neural network Parallel Accelerator |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689123A (en) * | 2019-09-27 | 2020-01-14 | 南京大学 | Long-short term memory neural network forward acceleration system and method based on pulse array |
CN111401522A (en) * | 2020-03-12 | 2020-07-10 | 上海交通大学 | Variable speed pulsating array speed control method and variable speed pulsating array micro-frame |
CN111401522B (en) * | 2020-03-12 | 2023-08-15 | 上海交通大学 | Pulsation array variable speed control method and variable speed pulsation array micro-frame system |
CN111401532A (en) * | 2020-04-28 | 2020-07-10 | 南京宁麒智能计算芯片研究院有限公司 | Convolutional neural network reasoning accelerator and acceleration method |
Also Published As
Publication number | Publication date |
---|---|
CN110232441B (en) | 2023-05-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111897579B (en) | Image data processing method, device, computer equipment and storage medium | |
Yin et al. | A high energy efficient reconfigurable hybrid neural network processor for deep learning applications | |
CN109948774B (en) | Neural network accelerator based on network layer binding operation and implementation method thereof | |
CN107301456B (en) | Deep neural network multi-core acceleration implementation method based on vector processor | |
CN107578095B (en) | Neural computing device and processor comprising the computing device | |
CN109447241B (en) | Dynamic reconfigurable convolutional neural network accelerator architecture for field of Internet of things | |
CN108805266A (en) | A kind of restructural CNN high concurrents convolution accelerator | |
CN110232441A (en) | A kind of stacking-type based on unidirectional systolic arrays is from encoding system and method | |
CN107239823A (en) | A kind of apparatus and method for realizing sparse neural network | |
CN107423816B (en) | Multi-calculation-precision neural network processing method and system | |
CN107229967A (en) | A kind of hardware accelerator and method that rarefaction GRU neutral nets are realized based on FPGA | |
CN109409510B (en) | Neuron circuit, chip, system and method thereof, and storage medium | |
CN110390383A (en) | A kind of deep neural network hardware accelerator based on power exponent quantization | |
CN107689948A (en) | Efficient data memory access managing device applied to neural network hardware acceleration system | |
CN111325321A (en) | Brain-like computing system based on multi-neural network fusion and execution method of instruction set | |
CN110222818B (en) | Multi-bank row-column interleaving read-write method for convolutional neural network data storage | |
CN109472356A (en) | A kind of accelerator and method of restructural neural network algorithm | |
CN101717817B (en) | Method for accelerating RNA secondary structure prediction based on stochastic context-free grammar | |
CN111105023B (en) | Data stream reconstruction method and reconfigurable data stream processor | |
CN109934336A (en) | Neural network dynamic based on optimum structure search accelerates platform designing method and neural network dynamic to accelerate platform | |
CN113076521B (en) | Reconfigurable architecture method based on GPGPU and computing system | |
CN110991630A (en) | Convolutional neural network processor for edge calculation | |
CN108960414A (en) | Method for realizing single broadcast multiple operations based on deep learning accelerator | |
CN109993275A (en) | A kind of signal processing method and device | |
CN111860773B (en) | Processing apparatus and method for information processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||