CN106250103A - A kind of convolutional neural networks cyclic convolution calculates the system of data reusing - Google Patents
- Publication number: CN106250103A (application CN201610633040.9A)
- Authority: CN (China)
- Prior art keywords: data, array, convolution, module, reusing
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F9/3867 — Concurrent instruction execution using instruction pipelines
- G06F17/153 — Multidimensional correlation or convolution
- G06F9/30098 — Register arrangements
Abstract
The invention discloses a system for data reuse in the cyclic convolution calculation of convolutional neural networks, oriented to coarse-grained reconfigurable systems. The system comprises four parts: a master controller and link control module, an input data reuse module, a convolution loop calculation array, and a data transmission path. A convolution loop computation is in essence the multiplication of several large two-dimensional input data matrices with several two-dimensional template matrices, and these multiplications occupy most of the time of the whole convolution calculation. The present invention uses a coarse-grained reconfigurable array system to carry out the convolution calculation. After a convolution request instruction is received, a register round-robin scheme fully exploits the reusability of the input data in the convolution loop, raising the data utilization rate and reducing memory-access bandwidth pressure. The designed array elements are configurable, so convolutions of different cyclic scales and strides can be completed.
Description
Technical field
The present invention relates to the field of embedded reconfigurable design, and in particular to a system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems. It can be used in high-performance reconfigurable systems to carry out convolution operations with large loop counts for convolutional neural networks, reusing existing data as far as possible so as to raise operation speed and reduce data-read bandwidth pressure.
Background art
A reconfigurable processor architecture is a good application-acceleration platform: because its hardware structure can be reorganized according to the data-flow graph of a program, reconfigurable arrays have been shown to deliver good performance improvements for scientific computing and multimedia applications.
Convolution is widely used in image processing; it is needed in tasks such as image filtering, image enhancement and image analysis. Image convolution is essentially a matrix operation, characterized by a large operand count and a high data-reuse rate, and software implementations of image convolution can hardly meet real-time requirements.
As a kind of feed-forward neural network, a convolutional neural network can learn automatically from a large amount of labeled data and extract complex features from it. Its advantage is that only a little pre-processing of the input image is needed to recognize visual patterns directly from pixel images; it also recognizes widely varying objects well, and its recognition ability is not easily affected by image distortion or by simple geometric transformations. As an important direction of multi-layer neural network research, convolutional neural networks have been a research focus for many years.
The convolution template is placed at the upper-left corner of the image grid, so that it coincides with the sub-matrix in the upper-left corner of the image. The coinciding elements are multiplied pairwise and the products are summed, which yields the first result point. The template is then shifted right by one column to obtain the second result point, and so on; when the template has traversed the whole image grid, the convolution of one image frame is complete. The data reusability of this process is very high, but the traditional approach of caching or reading directly from external memory is limited by data-read bandwidth and, lacking a configurable array to complete multi-layer convolution loops, is inefficient.
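The template-sliding procedure described above can be sketched as a plain software reference model (illustrative only, not the hardware scheme of the invention; the function name is an assumption):

```python
def conv2d_reference(image, template, stride=1):
    """Slide the convolution template over the image grid; at each
    position multiply coinciding elements and sum the products to
    obtain one result point."""
    K = len(template)
    H, W = len(image), len(image[0])
    out = []
    for i in range(0, H - K + 1, stride):
        row = []
        for j in range(0, W - K + 1, stride):
            acc = 0
            for di in range(K):           # accumulate over the K*K window
                for dj in range(K):
                    acc += image[i + di][j + dj] * template[di][dj]
            row.append(acc)
        out.append(row)
    return out
```

On a 16x16 image with a 3x3 template at stride 1 (the configuration used in the embodiment), this yields a 14x14 result grid.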
Summary of the invention
Purpose of the invention: in view of the problems and shortcomings of the prior art, the present invention provides a system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems. It can meet the demand of accelerating large numbers of convolution calculations, reduce the pressure on bandwidth, and provides a configurable convolution array. The calculation performance of a convolutional neural network and the hardware resources it occupies are the two aspects that must be traded off when the network is realized on a coarse-grained reconfigurable architecture. The design objective of a convolutional neural network based on a reconfigurable processing array is to make full use of the calculation and storage resources the array provides while meeting the application's performance requirements. By exploiting an input-image data-reuse structure, the high reuse rate inherent in cyclic convolution and the configurability of the coarse-grained reconfigurable array, the convolution calculation is completed under limited data-read bandwidth and limited calculation resources, reaching a good compromise.
Technical scheme: a system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems, comprising a master controller and link control module, an input data reuse module, a convolution loop calculation array, and a data transmission path.
The master controller and link control module receives external convolution requests, loads the configuration information of the calculation array, returns calculation results, monitors the loop running state, and controls data transmission between the external memory and the input data reuse module.
The input data reuse module is the data-reuse module connected between the external input-data memory and the cyclic convolution calculation array, and it carries out the reuse of the input data. The upper half of the module consists of as many FIFOs as the image matrix is wide, and the lower half of as many shift registers. The FIFOs continuously load input data from the external memory, each corresponding to one column of the convolution calculation. Each time the shift registers advance by the convolution stride, the FIFOs replace one column in the shift registers, completing one convolution step and achieving data reuse. The shift registers hold the neighbourhood data, which the FIFOs in the upper half provide and update. Because the shift registers use ring addressing, data from the FIFOs always replace the oldest data in the ring shift registers, after which the data are transferred to the calculation array to complete the convolution.
The module operates as follows: S 32-bit data words (1 ≤ S < maximum image matrix width) are input to the FIFOs at a time. Once the data in a register have been consumed by the convolution, the FIFO transfers its own data to the shift register; the shift register needs to update only one column of K 32-bit words (1 ≤ K < maximum image matrix width, K being the width of the convolution kernel matrix of this calculation), which together with the original K-1 columns forms the K*K data block transferred to the convolution calculation matrix. The window then continues to move backward by the stride, again updating one column at a time, thereby realizing input data reuse.
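Under the stated scheme (a full K*K window fetched once, then only one column of K words per stride step), the bandwidth saving over one output row can be quantified with a small counting sketch. The function name and the naive baseline are illustrative assumptions, not part of the patent:

```python
def words_fetched_per_row(image_width, K, stride=1):
    """Count 32-bit words fetched for one output row.
    naive:  re-read the full K*K window at every window position.
    reused: read the first K*K window once, then only `stride` new
            columns of K words per step (the other K-1 columns stay
            in the ring shift registers)."""
    n_windows = (image_width - K) // stride + 1
    naive = n_windows * K * K
    reused = K * K + (n_windows - 1) * stride * K
    return naive, reused
```

For a 16-wide image and a 3x3 kernel at stride 1 this gives 126 words fetched naively versus 48 with column reuse.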
The cyclic convolution calculation array obtains the required input data from the input data reuse module, completes the convolution calculation, and sends the data out once the calculation is finished.
The data transmission path is the data channel between the master controller and link control module, the cyclic convolution calculation array and the input data reuse module.
Further, the master controller and link control module comprises a main controller and a link controller. The link controller performs prefetch judgement and data-reuse configuration control. The prefetch judgement checks whether the data required by the upcoming convolution are ready in place: if so, the cyclic convolution calculation array executes the convolution loop; if not, it waits for the data. Data are read from external memory into the cache, the present invention using direct memory access for the reads. When external data input is needed, the master controller sends a read command to the external memory and afterwards no longer controls the reads; the link controller sends a halt signal to the master controller, which relinquishes the address bus, the data bus and the relevant control bus, so that whenever the data in the input data reuse module need updating, the link controller reads the data in the external memory directly.
The cyclic convolution calculation array comprises an array configuration module, storage processing units and calculation processing units. Working together with the data reuse module, the array configuration module configures the calculation array according to the convolution scale and stride, making use of the array's available calculation resources; after each calculation the array is reconfigured, and the calculation processing units are adjusted to the calculation scale for the next convolution.
After the configuration controller of the convolution calculation array loads the configuration information from the link control module, the calculation array is arranged according to the loop scale and stride of the cyclic convolution. The convolved image matrix size can vary from 1 up to the maximum image matrix width, and the calculation array can be reconfigured for each convolution; even when the convolution kernel is small, the array can still use the whole convolution calculation matrix, thereby shortening the total convolution time.
The storage processing units are tightly associated with the store instructions and the data reuse module. Driven by the loop control component, a storage processing unit takes an address from the address queue, or computes one directly in its address generation unit, and sends a read request to the data reuse module; the returned data are written into the data queue, and under the control of the loop-end component the data in the shift registers are read out.
The calculation processing units realize the calculation and selection functions in the data flow. The loop subscript continuously fetches data from the register file and passes them to the calculation processing unit array, which operates according to a fixed interconnection and stores its results at the specified locations.
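Per result point, a calculation processing unit of this kind reduces to a multiply-accumulate over the K*K window. A minimal software stand-in (illustrative only, not the hardware unit of Fig. 4; the function name is an assumption):

```python
def mac_unit(window, weights):
    """Multiply paired window/weight values and accumulate them,
    as the internal multiplier and adder of a calculation
    processing unit would. Inputs are flat sequences of K*K values."""
    acc = 0
    for x, w in zip(window, weights):
        acc += x * w
    return acc
```

A 3x3 window would be passed as nine flattened values together with nine weights.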
The cyclic convolution calculation array operates as a continuous pipeline. The loops are mapped onto the array by the array configuration module, which configures the initial value, final value and step value of the loop control variables; the execution of the loop program then needs no external control, the calculation array units form pipeline links among themselves, and the cyclic convolution is scheduled on the pipeline.
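The configured loop control described here, an (initial value, final value, step value) triple loaded once, after which the loop runs without external control, can be modelled as a simple generator (names are illustrative assumptions):

```python
def loop_control(initial, final, step):
    """Yield each loop index from the configured triple; once the
    array configuration module has loaded (initial, final, step),
    no external control is needed."""
    i = initial
    while i < final:
        yield i
        i += step
```

For example, `list(loop_control(0, 14, 1))` enumerates the 14 window positions of a stride-1 pass over a 16-wide row with a 3-wide kernel.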
Brief description of the drawings
Fig. 1 is a structural diagram of the coarse-grained reconfigurable array system for convolution calculation in an embodiment of the present invention;
Fig. 2 is a hardware diagram of the data round-robin scheduling of the input data reuse module in an embodiment of the present invention;
Fig. 3 is a structural block diagram of a storage processing unit in the coarse-grained reconfigurable convolution calculation array in an embodiment of the present invention;
Fig. 4 is a structural block diagram of a calculation processing unit of the coarse-grained reconfigurable convolution calculation array in an embodiment of the present invention;
Fig. 5 is a flow chart of cyclic convolution execution on the reconfigurable array in an embodiment of the present invention.
Detailed description of the invention
The present invention is further illustrated below in conjunction with specific embodiments. It should be understood that these embodiments merely illustrate the present invention and do not limit its scope; after reading the present invention, modifications of various equivalent forms by those skilled in the art all fall within the scope defined by the appended claims.
The system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems comprises a master controller and link control module, an input data reuse module, a convolution loop calculation array and a data transmission path.
The master controller and link control module receives external convolution requests, loads the configuration information of the calculation array, returns calculation results, monitors the loop running state, and controls data transmission between the external memory and the input data reuse module.
The input data reuse module is connected between the external input-data memory and the cyclic convolution calculation array; its upper half consists of as many FIFOs as the image matrix is wide, and its lower half of as many shift registers.
The cyclic convolution calculation array obtains the required input data from the input data reuse module, completes the convolution calculation, and sends the data out once the calculation is finished.
The data transmission path is the data channel between the master controller and link control module, the cyclic convolution calculation array and the input data reuse module.
The master controller and link control module comprises a main controller and a link controller. The link controller performs prefetch judgement and data-reuse configuration control. The prefetch judgement checks whether the data required by the upcoming convolution are ready in place: if so, the cyclic convolution calculation array executes the convolution loop; if not, it waits for the data. Data are read from external memory into the cache, the present invention using direct memory access for the reads. When external data input is needed, the master controller sends a read command to the external memory and afterwards no longer controls the reads; the link controller sends a halt signal to the master controller, which relinquishes the address bus, the data bus and the relevant control bus, so that whenever the data in the input data reuse module need updating, the link controller reads the data in the external memory directly.
Fig. 1 shows the concrete calculation array and the data flow of the coarse-grained reconfigurable array. The configurable PE units occupy the main part of the figure, because the reconfigurable array carries out the concrete part of the convolution calculation; the remaining parts mainly handle the transfer of the start and end instructions. As can be seen in Fig. 1, the storage processing units in the configurable array are directly connected to the input data reuse module (Fig. 2). According to the stride and the kernel size, the input data reuse module streams the data required by the convolution to the calculation processing units, and the router distributes the configured data stream through the interconnection network to each calculation processing unit. Meanwhile the link controller registers when a convolution calculation is complete and transfers the data and messages out, after which the calculation processing units are reconfigured to start a new round of computation.
The data round-robin scheduling hardware of the input data reuse module is shown in Fig. 2. Taking a convolution kernel of size K*K as an example (K being the kernel width), FIFOs are inserted between the external memory and the shift registers. S 32-bit data words are input to the FIFOs at a time; once the data in a register have been consumed by the convolution, the FIFO transfers its data to the shift register, which needs to update only one column of K 32-bit words. Together with the original K-1 columns, the shift register delivers the K*K data block to the convolution calculation matrix. The input image data are thereby reused, providing the support for efficient convolution.
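The ring-addressed column replacement of Fig. 2 can be mimicked in software with a bounded deque: each step evicts the oldest column and appends the freshly fetched one, so only one column crosses the memory interface per step. A behavioural sketch under the assumptions above (stride 1, column-wise fetch; names are illustrative):

```python
from collections import deque

def windows_with_column_reuse(columns, K):
    """columns: a list of columns, each a list of K pixel values
    (what the FIFOs would deliver). Yields each K*K window as a
    list of K columns, fetching only one new column per step; the
    deque's maxlen plays the role of the ring shift registers."""
    window = deque(maxlen=K)
    for col in columns[:K]:      # fill the first full window
        window.append(col)
    yield list(window)
    for col in columns[K:]:      # one new column per step;
        window.append(col)       # maxlen evicts the oldest column
        yield list(window)
```

Each yielded window reuses K-1 of the previous window's columns, matching the column-update rule described for the module.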
Fig. 3 shows the structural block diagram of a storage processing unit. When an input channel receives an address signal, which corresponds to a storage processing unit position in the array, that storage processing unit generates the address of the corresponding data; with the generated address, the data in the input-image data reuse module can be used, and the data are then output to the calculation processing units. The loop control governs the generation of the addresses of the operation data and the end of the convolution, and the completed information is transferred synchronously to the external memory. When the loop judgement finds that the data have not arrived or are insufficient, the current operation is terminated, the information is passed to the external memory, and the data are updated.
Fig. 4 shows the structure of a calculation processing unit. On receiving input data, a calculation processing unit uses its internal multiplier and adder to complete the convolution operation. After each operation, the calculation processing units required for the next operation are reconfigured according to the configuration controller, achieving configurable control: even when the outer loop size or the stride changes, the computation can still be completed well.
With reference to Fig. 1 and Fig. 2, the concrete steps of the convolution loop calculation are shown in Fig. 5 and comprise the following steps:
1) When the coarse-grained reconfigurable array system is needed to complete a large number of convolutions, a request is first sent to this convolution control system; on receiving the request, the main processor sends an instruction to the link processing unit;
2) The link processing unit first judges whether the data required in the input data reuse module are in place; if not, it sends a wait signal and meanwhile transfers data to the buffers by direct memory access;
3) Once the data are continuous, the waiting operation instruction is notified and the loop is started. The configuration control unit in the convolution loop calculation array configures the array, the memory-access configuration module in the calculation array locates the position of the data to be calculated, and the calculation array then performs the convolution calculation on the data at that position, with the pipeline proceeding backwards in sequence;
4) The Y FIFO caches (Y being the maximum image matrix width) continuously update the already-used data in the registers by direct memory reads, so that when a position is revisited its data have already been updated and the computation proceeds without interruption, with no need to access external memory for every convolution operation;
5) The link controller controls the completion of the loop; when the calculation is complete, the final data are output to the external memory, and the concrete convolution array has finished.
When carrying out convolutions with large loop counts under limited computation resources, applying the data-reuse method together with a configurable reconfigurable array, and completing the convolution in a pipeline, improves operation efficiency and speed. A comparative test was set up with two systems: contrast verification system A, a traditional reconfigurable system supporting neither array configuration nor reuse, and contrast verification system B, the reconfigurable system with data prefetch and reuse proposed by the present invention. A 16x16 input data matrix, a 3x3 convolution matrix and a stride of 1 were chosen, with 10 input data sets and 10 convolution weight matrices convolved simultaneously. The test results show that contrast verification system B achieves an average performance improvement of 1.76 times over contrast verification system A.
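The scale of this test configuration can be checked with a short calculation. The multiply-accumulate total is a derived figure, not one stated in the patent, and it assumes every input is convolved with every weight matrix:

```python
def embodiment_workload(width=16, K=3, stride=1, n_inputs=10, n_kernels=10):
    """Output size per image and total multiply-accumulate count for
    the comparative test configuration (16x16 inputs, 3x3 kernels,
    stride 1, 10 inputs x 10 weight matrices)."""
    out = (width - K) // stride + 1          # output is out x out
    macs_per_pair = out * out * K * K        # MACs per input/kernel pair
    return out, macs_per_pair * n_inputs * n_kernels
```

For the stated configuration this gives a 14x14 output per image and 176,400 multiply-accumulates in total.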
Claims (5)
1. A system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems, characterized in that it comprises a master controller and link control module, an input data reuse module, a convolution loop calculation array and a data transmission path;
said master controller and link control module receives external convolution requests, loads the configuration information of the calculation array, returns calculation results, monitors the loop running state, and controls data transmission between the external memory and the input data reuse module;
said input data reuse module is the data-reuse module connected between the external input-data memory and the cyclic convolution calculation array, wherein the upper half of the module consists of as many FIFOs as the image matrix is wide and the lower half of as many shift registers;
said cyclic convolution calculation array obtains the required input data from the input data reuse module, completes the convolution calculation, and sends the data out once the calculation is finished.
2. The system as claimed in claim 1, characterized in that said data transmission path is the data channel between the master controller and link control module, the cyclic convolution calculation array and the input data reuse module.
3. The system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems as claimed in claim 1, characterized in that: the master controller and link control module comprises a main controller and a link controller; the link controller performs prefetch judgement and data-reuse configuration control, the prefetch judgement checking whether the data required by the upcoming convolution are ready in place: if so, the cyclic convolution calculation array executes the convolution loop, and if not, it waits for the data; data are read from external memory into the cache by direct memory access; when external data input is needed, the master controller sends a read command to the external memory and afterwards no longer controls the reads, the link controller sends a halt signal to the master controller, and the master controller relinquishes the address bus, the data bus and the relevant control bus, so that whenever the data in the input data reuse module need updating, the link controller reads the data in the external memory directly.
4. The system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems as claimed in claim 1, characterized in that: the cyclic convolution calculation array comprises an array configuration module, storage processing units and calculation processing units; working together with the input data reuse module, the array configuration module configures the calculation array according to the convolution scale and stride, making use of the array's available calculation resources; after each calculation the array is reconfigured and the calculation processing units are adjusted to the calculation scale for the next convolution; the cyclic convolution calculation array operates as a continuous pipeline, the loops being mapped onto the array by the array configuration module, which configures the initial value, final value and step value of the loop control variables; the execution of the loop program needs no external control, the calculation array units form pipeline links among themselves, and the cyclic convolution is scheduled on the pipeline.
5. The system for data reuse in the cyclic convolution calculation of convolutional neural networks oriented to coarse-grained reconfigurable systems as claimed in claim 1, characterized in that said input data reuse module operates as follows: S 32-bit data words are input to the FIFOs at a time; once the data in a register have been consumed by the convolution, the FIFO transfers its own data to the shift register; the shift register updates one column of K 32-bit words, which together with the original K-1 columns forms the K*K data block transferred to the convolution calculation matrix; the window then continues to move backward by the stride, again updating one column at a time, thereby realizing input data reuse.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610633040.9A CN106250103A (en) | 2016-08-04 | 2016-08-04 | A kind of convolutional neural networks cyclic convolution calculates the system of data reusing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610633040.9A CN106250103A (en) | 2016-08-04 | 2016-08-04 | A kind of convolutional neural networks cyclic convolution calculates the system of data reusing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106250103A true CN106250103A (en) | 2016-12-21 |
Family
ID=58079364
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610633040.9A Pending CN106250103A (en) | 2016-08-04 | 2016-08-04 | A kind of convolutional neural networks cyclic convolution calculates the system of data reusing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106250103A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001090927A1 (en) * | 2000-05-19 | 2001-11-29 | Philipson Lars H G | Method and device in a convolution process |
CN102208005A (en) * | 2011-05-30 | 2011-10-05 | 华中科技大学 | 2-dimensional (2-D) convolver |
CN104077233A (en) * | 2014-06-18 | 2014-10-01 | 百度在线网络技术(北京)有限公司 | Single-channel convolution layer and multi-channel convolution layer handling method and device |
CN105681628A (en) * | 2016-01-05 | 2016-06-15 | 西安交通大学 | Convolution network arithmetic unit, reconfigurable convolution neural network processor and image de-noising method of reconfigurable convolution neural network processor |
- 2016-08-04: Application CN201610633040.9A filed in China; published as CN106250103A; legal status: Pending
Non-Patent Citations (2)
Title |
---|
DOU Yong et al.: "Coarse-Grained Reconfigurable Array Architecture Supporting Automatic Loop Pipelining", Science in China Series E: Information Sciences *
LU Zhijian: "Research on Parallel Structures of FPGA-Based Convolutional Neural Networks", China Doctoral Dissertations Full-Text Database, Information Science and Technology *
Cited By (92)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106844294B (en) * | 2016-12-29 | 2019-05-03 | 华为机器有限公司 | Convolution algorithm chip and communication equipment |
CN106844294A (en) * | 2016-12-29 | 2017-06-13 | 华为机器有限公司 | Convolution algorithm chip and communication equipment |
CN106775599A (en) * | 2017-01-09 | 2017-05-31 | 南京工业大学 | Many computing unit coarseness reconfigurable systems and method of recurrent neural network |
WO2018137177A1 (en) * | 2017-01-25 | 2018-08-02 | 北京大学 | Method for convolution operation based on nor flash array |
US11309026B2 (en) | 2017-01-25 | 2022-04-19 | Peking University | Convolution operation method based on NOR flash array |
CN110325963A (en) * | 2017-02-28 | 2019-10-11 | 微软技术许可有限责任公司 | The multi-functional unit for programmable hardware node for Processing with Neural Network |
CN110383237A (en) * | 2017-02-28 | 2019-10-25 | 德克萨斯仪器股份有限公司 | Reconfigurable matrix multiplier system and method |
US11663450B2 (en) | 2017-02-28 | 2023-05-30 | Microsoft Technology Licensing, Llc | Neural network processing with chained instructions |
CN110383237B (en) * | 2017-02-28 | 2023-05-26 | 德克萨斯仪器股份有限公司 | Reconfigurable matrix multiplier system and method |
CN110325963B (en) * | 2017-02-28 | 2023-05-23 | 微软技术许可有限责任公司 | Multifunctional unit for programmable hardware nodes for neural network processing |
CN107229598A (en) * | 2017-04-21 | 2017-10-03 | 东南大学 | A kind of low power consumption voltage towards convolutional neural networks is adjustable convolution computing module |
CN107103754A (en) * | 2017-05-10 | 2017-08-29 | 华南师范大学 | A kind of road traffic condition Forecasting Methodology and system |
WO2018232615A1 (en) * | 2017-06-21 | 2018-12-27 | 华为技术有限公司 | Signal processing method and device |
CN111176727A (en) * | 2017-07-20 | 2020-05-19 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
CN111176727B (en) * | 2017-07-20 | 2022-05-31 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
CN111221578A (en) * | 2017-07-20 | 2020-06-02 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
CN111221578B (en) * | 2017-07-20 | 2022-07-15 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
CN111095242A (en) * | 2017-07-24 | 2020-05-01 | 特斯拉公司 | Vector calculation unit |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
CN111095242B (en) * | 2017-07-24 | 2024-03-22 | 特斯拉公司 | Vector calculation unit |
US10928456B2 (en) | 2017-08-17 | 2021-02-23 | Samsung Electronics Co., Ltd. | Method and apparatus for estimating state of battery |
CN107590085B (en) * | 2017-08-18 | 2018-05-29 | 浙江大学 | A kind of dynamic reconfigurable array data path and its control method with multi-level buffer |
CN107590085A (en) * | 2017-08-18 | 2018-01-16 | 浙江大学 | A kind of dynamic reconfigurable array data path and its control method with multi-level buffer |
CN107832262A (en) * | 2017-10-19 | 2018-03-23 | 珠海格力电器股份有限公司 | Convolution algorithm method and device |
CN107635138A (en) * | 2017-10-19 | 2018-01-26 | 珠海格力电器股份有限公司 | Image processing apparatus |
CN111291880A (en) * | 2017-10-30 | 2020-06-16 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
CN111291880B (en) * | 2017-10-30 | 2024-05-14 | 上海寒武纪信息科技有限公司 | Computing device and computing method |
US11537857B2 (en) | 2017-11-01 | 2022-12-27 | Tencent Technology (Shenzhen) Company Limited | Pooling processing method and system applied to convolutional neural network |
US11734554B2 (en) | 2017-11-01 | 2023-08-22 | Tencent Technology (Shenzhen) Company Limited | Pooling processing method and system applied to convolutional neural network |
CN109754359A (en) * | 2017-11-01 | 2019-05-14 | 腾讯科技(深圳)有限公司 | A kind of method and system that the pondization applied to convolutional neural networks is handled |
CN107862650B (en) * | 2017-11-29 | 2021-07-06 | 中科亿海微电子科技(苏州)有限公司 | Method for accelerating calculation of CNN convolution of two-dimensional image |
CN107862650A (en) * | 2017-11-29 | 2018-03-30 | 中科亿海微电子科技(苏州)有限公司 | The method of speed-up computation two dimensional image CNN convolution |
CN108701015A (en) * | 2017-11-30 | 2018-10-23 | 深圳市大疆创新科技有限公司 | For the arithmetic unit of neural network, chip, equipment and correlation technique |
CN111465924A (en) * | 2017-12-12 | 2020-07-28 | 特斯拉公司 | System and method for converting matrix input to vectorized input for a matrix processor |
CN111465924B (en) * | 2017-12-12 | 2023-11-17 | 特斯拉公司 | System and method for converting matrix input into vectorized input for matrix processor |
CN108009126A (en) * | 2017-12-15 | 2018-05-08 | 北京中科寒武纪科技有限公司 | A kind of computational methods and Related product |
CN108198125B (en) * | 2017-12-29 | 2021-10-08 | 深圳云天励飞技术有限公司 | Image processing method and device |
CN109992541A (en) * | 2017-12-29 | 2019-07-09 | 深圳云天励飞技术有限公司 | A kind of data method for carrying, Related product and computer storage medium |
CN108198125A (en) * | 2017-12-29 | 2018-06-22 | 深圳云天励飞技术有限公司 | A kind of image processing method and device |
CN108182471A (en) * | 2018-01-24 | 2018-06-19 | 上海岳芯电子科技有限公司 | A kind of convolutional neural networks reasoning accelerator and method |
CN108182471B (en) * | 2018-01-24 | 2022-02-15 | 上海岳芯电子科技有限公司 | Convolutional neural network reasoning accelerator and method |
CN108241890B (en) * | 2018-01-29 | 2021-11-23 | 清华大学 | Reconfigurable neural network acceleration method and architecture |
CN108241890A (en) * | 2018-01-29 | 2018-07-03 | 清华大学 | A kind of restructural neural network accelerated method and framework |
CN108596331A (en) * | 2018-04-16 | 2018-09-28 | 浙江大学 | A kind of optimization method of cell neural network hardware structure |
CN108564524A (en) * | 2018-04-24 | 2018-09-21 | 开放智能机器(上海)有限公司 | A kind of convolutional calculation optimization method of visual pattern |
CN110413561A (en) * | 2018-04-28 | 2019-11-05 | 北京中科寒武纪科技有限公司 | Data accelerate processing system |
CN110413561B (en) * | 2018-04-28 | 2021-03-30 | 中科寒武纪科技股份有限公司 | Data acceleration processing system |
CN108595379A (en) * | 2018-05-08 | 2018-09-28 | 济南浪潮高新科技投资发展有限公司 | A kind of parallelization convolution algorithm method and system based on multi-level buffer |
CN108665063B (en) * | 2018-05-18 | 2022-03-18 | 南京大学 | Bidirectional parallel processing convolution acceleration system for BNN hardware accelerator |
CN108665063A (en) * | 2018-05-18 | 2018-10-16 | 南京大学 | Two-way simultaneous for BNN hardware accelerators handles convolution acceleration system |
WO2019231254A1 (en) * | 2018-05-30 | 2019-12-05 | Samsung Electronics Co., Ltd. | Processor, electronics apparatus and control method thereof |
US11244027B2 (en) | 2018-05-30 | 2022-02-08 | Samsung Electronics Co., Ltd. | Processor, electronics apparatus and control method thereof |
CN108717571B (en) * | 2018-06-01 | 2020-09-15 | 阿依瓦(北京)技术有限公司 | Acceleration method and device for artificial intelligence |
CN108764182B (en) * | 2018-06-01 | 2020-12-08 | 阿依瓦(北京)技术有限公司 | Optimized acceleration method and device for artificial intelligence |
CN108717571A (en) * | 2018-06-01 | 2018-10-30 | 阿依瓦(北京)技术有限公司 | A kind of acceleration method and device for artificial intelligence |
CN108764182A (en) * | 2018-06-01 | 2018-11-06 | 阿依瓦(北京)技术有限公司 | A kind of acceleration method and device for artificial intelligence of optimization |
CN109272112B (en) * | 2018-07-03 | 2021-08-27 | 北京中科睿芯科技集团有限公司 | Data reuse instruction mapping method, system and device for neural network |
CN109272112A (en) * | 2018-07-03 | 2019-01-25 | 北京中科睿芯科技有限公司 | A kind of data reusing command mappings method, system and device towards neural network |
CN108681984A (en) * | 2018-07-26 | 2018-10-19 | 珠海市微半导体有限公司 | A kind of accelerating circuit of 3*3 convolution algorithms |
CN108681984B (en) * | 2018-07-26 | 2023-08-15 | 珠海一微半导体股份有限公司 | Acceleration circuit of 3*3 convolution algorithm |
US11694074B2 (en) | 2018-09-07 | 2023-07-04 | Samsung Electronics Co., Ltd. | Integrated circuit that extracts data, neural network processor including the integrated circuit, and neural network device |
CN109460813A (en) * | 2018-09-10 | 2019-03-12 | 中国科学院深圳先进技术研究院 | Accelerated method, device, equipment and the storage medium that convolutional neural networks calculate |
WO2020051751A1 (en) * | 2018-09-10 | 2020-03-19 | 中国科学院深圳先进技术研究院 | Convolution neural network computing acceleration method and apparatus, device, and storage medium |
CN109284475A (en) * | 2018-09-20 | 2019-01-29 | 郑州云海信息技术有限公司 | A kind of matrix convolution computing module and matrix convolution calculation method |
CN109284475B (en) * | 2018-09-20 | 2021-10-29 | 郑州云海信息技术有限公司 | Matrix convolution calculating device and matrix convolution calculating method |
US10733742B2 (en) | 2018-09-26 | 2020-08-04 | International Business Machines Corporation | Image labeling |
US11176427B2 (en) | 2018-09-26 | 2021-11-16 | International Business Machines Corporation | Overlapping CNN cache reuse in high resolution and streaming-based deep learning inference engines |
CN109375952B (en) * | 2018-09-29 | 2021-01-26 | 北京字节跳动网络技术有限公司 | Method and apparatus for storing data |
CN109375952A (en) * | 2018-09-29 | 2019-02-22 | 北京字节跳动网络技术有限公司 | Method and apparatus for storing data |
CN111045958A (en) * | 2018-10-11 | 2020-04-21 | 展讯通信(上海)有限公司 | Acceleration engine and processor |
CN111045958B (en) * | 2018-10-11 | 2022-09-16 | 展讯通信(上海)有限公司 | Acceleration engine and processor |
WO2020077565A1 (en) * | 2018-10-17 | 2020-04-23 | 北京比特大陆科技有限公司 | Data processing method and apparatus, electronic device, and computer readable storage medium |
CN109800867B (en) * | 2018-12-17 | 2020-09-29 | 北京理工大学 | Data calling method based on FPGA off-chip memory |
CN109816093B (en) * | 2018-12-17 | 2020-12-04 | 北京理工大学 | Single-path convolution implementation method |
CN109816093A (en) * | 2018-12-17 | 2019-05-28 | 北京理工大学 | A kind of one-way convolution implementation method |
CN109711533A (en) * | 2018-12-20 | 2019-05-03 | 西安电子科技大学 | Convolutional neural networks module based on FPGA |
CN109711533B (en) * | 2018-12-20 | 2023-04-28 | 西安电子科技大学 | Convolutional neural network acceleration system based on FPGA |
CN110069444A (en) * | 2019-06-03 | 2019-07-30 | 南京宁麒智能计算芯片研究院有限公司 | A kind of computing unit, array, module, hardware system and implementation method |
WO2021007037A1 (en) * | 2019-07-09 | 2021-01-14 | MemryX Inc. | Matrix data reuse techniques in processing systems |
US11537535B2 (en) | 2019-07-09 | 2022-12-27 | Memryx Incorporated | Non-volatile memory based processors and dataflow techniques |
CN110377874B (en) * | 2019-07-23 | 2023-05-02 | 江苏鼎速网络科技有限公司 | Convolution operation method and system |
CN110377874A (en) * | 2019-07-23 | 2019-10-25 | 江苏鼎速网络科技有限公司 | Convolution algorithm method and system |
CN110705687A (en) * | 2019-09-05 | 2020-01-17 | 北京三快在线科技有限公司 | Convolution neural network hardware computing device and method |
CN111523642B (en) * | 2020-04-10 | 2023-03-28 | 星宸科技股份有限公司 | Data reuse method, operation method and device and chip for convolution operation |
CN111523642A (en) * | 2020-04-10 | 2020-08-11 | 厦门星宸科技有限公司 | Data reuse method, operation method and device and chip for convolution operation |
CN111859797A (en) * | 2020-07-14 | 2020-10-30 | Oppo广东移动通信有限公司 | Data processing method and device and storage medium |
WO2022179075A1 (en) * | 2021-02-26 | 2022-09-01 | 成都商汤科技有限公司 | Data processing method and apparatus, computer device and storage medium |
CN112992248A (en) * | 2021-03-12 | 2021-06-18 | 西安交通大学深圳研究院 | PE (provider edge) calculation unit structure of FIFO (first in first out) -based variable-length cyclic shift register |
CN114780910A (en) * | 2022-06-16 | 2022-07-22 | 千芯半导体科技(北京)有限公司 | Hardware system and calculation method for sparse convolution calculation |
CN114780910B (en) * | 2022-06-16 | 2022-09-06 | 千芯半导体科技(北京)有限公司 | Hardware system and calculation method for sparse convolution calculation |
CN116842307A (en) * | 2023-08-28 | 2023-10-03 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment, chip and storage medium |
CN116842307B (en) * | 2023-08-28 | 2023-11-28 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment, chip and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106250103A (en) | A kind of convolutional neural networks cyclic convolution calculates the system of data reusing | |
JP7430203B2 (en) | System and method for matrix multiplication instructions using floating point operations with specified bias | |
CN111291880B (en) | Computing device and computing method | |
CN108268943B (en) | Hardware accelerator engine | |
CN109376861B (en) | Apparatus and method for performing full connectivity layer neural network training | |
CN104899182B (en) | A kind of Matrix Multiplication accelerated method for supporting variable partitioned blocks | |
JP6960700B2 (en) | Multicast Network On-Chip Convolutional Neural Network Hardware Accelerator and Its Behavior | |
CN108108809B (en) | Hardware architecture for reasoning and accelerating convolutional neural network and working method thereof | |
CN108416436B (en) | Method and system for neural network partitioning using multi-core processing module | |
CN104054108B (en) | Can dynamic configuration streamline preprocessor | |
CN103221918B (en) | IC cluster processing equipments with separate data/address bus and messaging bus | |
CA3051990A1 (en) | Accelerated deep learning | |
US11544525B2 (en) | Systems and methods for artificial intelligence with a flexible hardware processing framework | |
CN109740748B (en) | Convolutional neural network accelerator based on FPGA | |
CN105468568B (en) | Efficient coarseness restructurable computing system | |
CN109711533A (en) | Convolutional neural networks module based on FPGA | |
CN106294278B (en) | Adaptive hardware for dynamic reconfigurable array computing system is pre-configured controller | |
CN105912501A (en) | SM4-128 encryption algorithm implementation method and system based on large-scale coarseness reconfigurable processor | |
WO2018057294A1 (en) | Combined world-space pipeline shader stages | |
CN115136123A (en) | Tile subsystem and method for automated data flow and data processing within an integrated circuit architecture | |
CN110991619A (en) | Neural network processor, chip and electronic equipment | |
CN109657794A (en) | A kind of distributed deep neural network performance modelling method of queue based on instruction | |
CN102446342B (en) | Reconfigurable binary arithmetical unit, reconfigurable binary image processing system and basic morphological algorithm implementation method thereof | |
CN115860066A (en) | Neural network reasoning pipeline multiplexing method based on batch processing | |
CN110503179A (en) | Calculation method and Related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20161221 |