CN109800867A - Data calling method based on FPGA off-chip memory - Google Patents

Data calling method based on FPGA off-chip memory

Info

Publication number
CN109800867A
Authority
CN
China
Prior art keywords
fifo
data
feature map
group
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811545237.2A
Other languages
Chinese (zh)
Other versions
CN109800867B (en)
Inventor
龙腾
魏鑫
陈禾
陈磊
陈亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN201811545237.2A priority Critical patent/CN109800867B/en
Publication of CN109800867A publication Critical patent/CN109800867A/en
Application granted granted Critical
Publication of CN109800867B publication Critical patent/CN109800867B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The present invention provides a data calling method based on an FPGA off-chip memory. The data of a feature map are stored row by row, in order, into a group of FIFOs. In each read-write operation, the first M FIFOs each output their currently first stored data word; each of the last M FIFOs writes its currently first stored data word back to the tail of the FIFO whose number is L-M smaller than its own; at the same time, the first data word of row L+1 of the feature map is written to the tail of FIFO L-1, and the first data word of row L+2 is written to the tail of FIFO L. Thus, while the FIFOs continuously output data in order to the outside of the FIFO group, the remaining data of the feature map are written in order into the group to wait to be read, until the entire feature map has been traversed. The method therefore never calls the data of the FPGA off-chip memory directly, avoids complicated address jumps, and greatly improves the efficiency of calling FPGA off-chip memory data.

Description

Data calling method based on FPGA off-chip memory
Technical field
The invention belongs to the technical field of image classification and recognition, and in particular relates to a data calling method based on an FPGA off-chip memory.
Background art
Over the past five years, convolutional neural networks (CNNs) have achieved good results in fields such as image feature extraction, classification, and recognition. Because CNN architectures are flexible and variable, current CNNs are mainly implemented on software platforms such as CPUs and GPUs. In present engineering applications, however, the demands for real-time performance and low power consumption are increasingly prominent, so using a hardware platform to accelerate CNN computation and thereby reduce system power consumption has become a research hotspot for CNNs in engineering applications. The field-programmable gate array (FPGA) is one of the promising solutions. However, the on-chip storage resources of an FPGA can hardly hold the image data, parameters, and intermediate results of a CNN; therefore, when an FPGA is used to accelerate CNN computation, storage resources outside the FPGA chip must be called to meet the storage demand of the system. How to call the off-chip storage of the FPGA reasonably has thus become a focus of current research.
A reasonable calling scheme for the FPGA off-chip storage allows the convolutional computation units of an FPGA-based CNN design to fully exploit the parallelism inherent in CNN algorithms, accelerating the convolution computation to the greatest extent and improving system throughput. The optimization of off-chip storage calling has therefore become one of the important research directions for accelerating CNN computation on FPGAs.
Existing optimization methods for calling FPGA off-chip storage are mainly based on the structure of the off-chip memory, in which data are stored in separate banks. The main idea of current methods is to store the different input feature maps of the CNN in different banks of the off-chip memory as far as possible. Such methods, however, must access the off-chip memory with frequent address jumps, so their read-write efficiency is low; for large-scale CNN computations in particular, they cannot meet the efficiency requirements of data calling.
Summary of the invention
To solve the above problems, the present invention provides a data calling method based on an FPGA off-chip memory that does not call the data of the off-chip memory directly, avoids complicated address jumps, and greatly improves the efficiency of calling FPGA off-chip memory data.
A data calling method based on an FPGA off-chip memory, applied to convolutional neural networks, comprising the following steps:
S1: set up a FIFO group in the FPGA on-chip memory, where the FIFO group contains L FIFOs; number the FIFOs consecutively from 1 to L, and determine the number M of FIFOs that must output data to the outside of the group simultaneously, specifically:
L = 2 × kernel + Stride × (N - 2)
M = kernel + Stride × (N - 1)
where kernel is the preset convolution kernel size, Stride is the step length of the sliding window used in the convolution computation, and N is the number of groups of sliding-window data that must be generated simultaneously, N ≥ 2;
S2: store the first L rows of the feature map in the FPGA off-chip memory into the FIFO group row by row, where each FIFO stores one row of the feature map, and the depth of each FIFO is greater than the size of the feature map;
S3: perform a read-write operation on each FIFO in the FIFO group, where the read-write operation is specifically:
for the first M FIFOs counted from the front, each FIFO outputs its first stored data word to the outside of the FIFO group as sliding-window data for the convolutional neural network, and its second data word becomes the first; for the last M FIFOs counted from the back, each FIFO writes its first stored data word to the tail of the FIFO whose number is L-M smaller than its own; at the same time, the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L, completing the update of every FIFO in the group;
S4: repeat step S3, performing the read-write operation again on each FIFO in the updated FIFO group, until all data of the feature map have been traversed.
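As an illustrative sketch only (not part of the patent text; the function names are our own), the two S1 formulas translate directly into Python:

```python
def fifo_group_size(kernel, stride, n):
    """Number L of FIFOs in the group, per step S1."""
    return 2 * kernel + stride * (n - 2)

def fifos_read_per_step(kernel, stride, n):
    """Number M of FIFOs read simultaneously per operation, per step S1."""
    return kernel + stride * (n - 1)

# 3x3 kernel, stride 1, N = 2 sliding-window groups (the worked example below)
print(fifo_group_size(3, 1, 2), fifos_read_per_step(3, 1, 2))  # 6 4
```

With these values, a 3 × 3 kernel at stride 1 feeding two groups of sliding-window data needs a 6-FIFO group, 4 of which are read on every operation.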
Beneficial effects:
The present invention provides a data calling method based on an FPGA off-chip memory. The data of the feature map are stored row by row, in order, into FIFOs. In each read-write operation the first M FIFOs output their currently first stored data; each of the last M FIFOs writes its currently first stored data back to the tail of the FIFO whose number is L-M smaller than its own; at the same time, the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L. While the FIFOs continuously output data in order to the outside of the group, the remaining data of the feature map are written in order into the group to wait to be read, until the whole feature map has been traversed. The invention thus builds a FIFO group from FPGA on-chip FIFOs and, following the data order required by the convolution computation, outputs the data of the whole feature map stored in the off-chip memory, word by word, to the convolutional computation units outside the group. During this data calling from the off-chip memory to the on-chip memory, the data of the off-chip memory are never called directly; complicated address jumps are avoided, and the efficiency of calling FPGA off-chip memory data is greatly improved.
Brief description of the drawings
Fig. 1 is a flowchart of the data calling method based on an FPGA off-chip memory provided by the present invention;
Fig. 2 is a schematic diagram of the data stored in each FIFO of the FIFO group before any read-write operation is performed;
Fig. 3 is a schematic diagram of the data stored in each FIFO of the FIFO group after the first read-write operation;
Fig. 4 is a schematic diagram of the data stored in each FIFO of the FIFO group after the second read-write operation;
Fig. 5 is a schematic diagram of the data stored in each FIFO of the FIFO group after the third read-write operation.
Detailed description of the embodiments
To make those skilled in the art better understand the scheme of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the accompanying drawings.
Embodiment one
Referring to Fig. 1, a flowchart of the data calling method based on an FPGA off-chip memory provided in this embodiment. The method is applied to convolutional neural networks, in particular to the process in which a CNN extracts data from a feature map of size S × S by means of a sliding window during its computation, and comprises the following steps:
S1: set up a FIFO group in the FPGA on-chip memory, where the FIFO group contains L FIFOs (first input, first output), number the FIFOs consecutively from 1 to L, and determine the number M of FIFOs that must output data to the outside of the group simultaneously, specifically:
L = 2 × kernel + Stride × (N - 2) (1)
M = kernel + Stride × (N - 1) (2)
where kernel is the preset convolution kernel size, Stride is the step length of the sliding window used in the convolution computation, and N is the number of groups of sliding-window data that must be generated simultaneously, N ≥ 2.
It should be noted that, in a computer, first-in first-out is the traditional sequential execution order: the instruction that enters first is completed and retired first, and only then is the next instruction executed.
S2: store the first L rows of the feature map in the FPGA off-chip memory into the FIFO group row by row, where each FIFO stores one row of the feature map, and the depth of each FIFO is greater than the size S of the feature map.
S3: perform a read-write operation on each FIFO in the FIFO group, where the read-write operation is specifically:
for the first M FIFOs counted from the front, each FIFO outputs its first stored data word to the outside of the FIFO group as sliding-window data for the convolutional neural network; for the last M FIFOs counted from the back, each FIFO writes its first stored data word to the tail of the FIFO whose number is L-M smaller than its own; at the same time, the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L, completing the update of every FIFO in the group.
It should be noted that, for the last M FIFOs, "each FIFO writes its first stored data word to the tail of the FIFO whose number is L-M smaller" means that the first data word of FIFO L-M+1 is written back to the tail of FIFO 1, the first data word of FIFO L-M+2 is written back to the tail of FIFO 2, and so on, until the first data word of FIFO L is written back to the tail of FIFO M.
It should be noted that, in the physical storage of the actual FPGA on-chip memory, after the currently first data word of a FIFO is output from the FIFO group, the data stored in that FIFO each move forward one position, because a FIFO follows the first-in first-out storage strategy: the second data word becomes the first, the third becomes the second, and so on, until one position is vacated at the tail. Only then can each of the last M FIFOs write its first data word to the tail of the FIFO whose number is L-M smaller, while the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L.
S4: repeat step S3, performing the read-write operation again on each FIFO in the updated FIFO group, until all data of the feature map have been traversed.
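The read-write operation of step S3 can be modeled in software with Python deques standing in for the hardware FIFOs (a minimal sketch under our own naming; the patent describes hardware behavior, not code):

```python
from collections import deque

def read_write_step(fifos, L, M, word_row_l1, word_row_l2):
    """One read-write operation on the FIFO group (step S3).

    fifos       -- list of L deques; fifos[0] models the FIFO numbered 1
    word_row_l1 -- next word of feature-map row L+1 streamed from off-chip memory
    word_row_l2 -- next word of feature-map row L+2
    Returns the M words sent out of the group to the convolution units.
    """
    heads = [f.popleft() for f in fifos]       # every FIFO shifts forward one word
    for i in range(L - M, L):                  # last M heads are written back to the
        fifos[i - (L - M)].append(heads[i])    # FIFO numbered L-M smaller
    fifos[L - 2].append(word_row_l1)           # row L+1 word -> tail of FIFO L-1
    fifos[L - 1].append(word_row_l2)           # row L+2 word -> tail of FIFO L
    return heads[:M]                           # front M heads leave the group
```

Note the overlap when M > L-M: with L = 6 and M = 4, FIFOs 3 and 4 both output their head and write it back, which is exactly what the first read-write operation of embodiment two shows.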
Embodiment two
Based on the above embodiment, this embodiment describes the FPGA off-chip memory calling method in detail with a feature map of size 15 × 15, a convolution kernel of size 3 × 3, a sliding-window step length Stride of 1, and 2 on-chip convolutional computation units, i.e. the number of sliding windows to be processed simultaneously is N = 2.
Step 1: determine the number L of FIFOs in the FIFO group.
The number L of FIFOs in each FIFO group is determined by three parameters: the size of the convolution kernel (kernel), the step length of the sliding window in the feature-map convolution (Stride), and the number of on-chip convolutional computation units (N), satisfying the following formula:
L = 2 × 3 + 1 × (2 - 2) = 6
That is, there are 6 FIFOs in the FIFO group.
Step 2: determine the number M of FIFOs that must output data to the outside of the FIFO group simultaneously.
In the assumed case, the size of the convolution kernel (kernel) is 3, the step length of the sliding window (Stride) is 1, and the number of on-chip convolutional computation units (N) is 2, so the number M of FIFOs whose data the group must output simultaneously is determined by these three parameters, satisfying the following formula:
M = 3 + 1 × (2 - 1) = 4
That is, each FIFO read-write operation must output the data of 4 FIFOs to the outside of the FIFO group simultaneously.
Step 3: determine the depth of each FIFO.
Since the depth must satisfy depth ≥ S = 15, the depth of each FIFO is chosen as 16.
Step 4: store the first 6 rows of the feature map into the FIFO group row by row, each FIFO storing one row of the feature map.
Referring to Fig. 2, which shows the data stored in each FIFO of the FIFO group before any read-write operation is performed. The FIFOs in the group are numbered 1 to 6 from top to bottom. Assume the data of the first eight rows of the input feature map are numbered 1 to 120. Before the FIFO read-write operations begin, the FIFOs in the group are filled with rows 1 to 6 of the input feature map: the FIFO numbered 1 holds the data of the first row, the FIFO numbered 2 holds the data of the second row, and so on. The resulting data stored in each FIFO are as shown in Fig. 2.
Referring to Fig. 3, which shows the data stored in each FIFO of the FIFO group after the first read-write operation. The first data words stored in the 4 FIFOs numbered 1 to 4, i.e. the feature-map data numbered 1, 16, 31, and 46, are output simultaneously to the outside of the FIFO group and stored into the two on-chip convolutional computation units. Each of the 4 FIFOs numbered 3 to 6 writes its first stored data word to the tail of the FIFO numbered 2 smaller: the first data word 31 of the FIFO numbered 3 is written to the tail of the FIFO numbered 1, the first data word 46 of the FIFO numbered 4 to the tail of the FIFO numbered 2, the first data word 61 of the FIFO numbered 5 to the tail of the FIFO numbered 3, and the first data word 76 of the FIFO numbered 6 to the tail of the FIFO numbered 4. At the same time, the first data word 91 of row 7 of the feature map is written to the tail of the FIFO numbered 5, and the first data word 106 of row 8 is written to the tail of the FIFO numbered 6, completing the update of every FIFO in the group, as shown in Fig. 3.
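The Fig. 2 to Fig. 3 transition can be reproduced with a short self-contained simulation (our own sketch; the variable names are assumptions, and Python deques stand in for the hardware FIFOs):

```python
from collections import deque

S, L, M = 15, 6, 4
# 15 x 15 feature map numbered 1..225 row by row, as in this embodiment
feature_map = [[r * S + c + 1 for c in range(S)] for r in range(S)]

# Fig. 2 state: FIFOs 1..6 preloaded with feature-map rows 1..6
fifos = [deque(feature_map[r]) for r in range(L)]

# first read-write operation
heads = [f.popleft() for f in fifos]
to_conv_units = heads[:M]                       # data 1, 16, 31, 46 leave the group
for i in range(L - M, L):                       # FIFOs 3..6 write back two FIFOs up
    fifos[i - (L - M)].append(heads[i])
fifos[L - 2].append(feature_map[6][0])          # 91, first word of row 7 -> FIFO 5
fifos[L - 1].append(feature_map[7][0])          # 106, first word of row 8 -> FIFO 6

print(to_conv_units)                            # [1, 16, 31, 46]
print([f[-1] for f in fifos])                   # [31, 46, 61, 76, 91, 106]
```

The two printed lists match the outputs and the new FIFO tails described for Fig. 3.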
Referring to Fig. 4 and Fig. 5, which show the data stored in each FIFO of the FIFO group after the second and third read-write operations, respectively. The way data are stored in each FIFO is similar to that of the first read-write operation and is not repeated here.
It can be seen that in the FPGA off-chip memory calling method provided by this embodiment, the data of the feature map are stored row by row, in order, into FIFOs; in each read-write operation the first M FIFOs output their currently first stored data, each of the last M FIFOs writes its currently first stored data back to the tail of the FIFO whose number is L-M smaller, and at the same time the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L. Thus, while the FIFOs continuously output data in order to the outside of the group, the remaining data of the feature map are written in order into the group to wait to be read, until the whole feature map has been traversed. This embodiment therefore builds a FIFO group from on-chip FIFOs and, following the data order required by the convolution computation, outputs the data of the whole feature map word by word to the convolutional computation units outside the group (the convolutional computation units also belong to the FPGA on-chip memory). During this data calling from the off-chip memory to the on-chip memory, the data of the off-chip memory are never called directly; complicated address jumps are avoided, and the efficiency of calling FPGA off-chip memory data is greatly improved.
In addition, the existing optimization methods for calling FPGA off-chip memory are easily affected by the number of input feature maps in the convolution computation: when the number of input feature maps exceeds the number of banks in the off-chip memory, the problem of jumping address accesses arises again. The method of this embodiment is not affected by the number of input feature maps and can flexibly meet the computation needs of different CNN structures.
Furthermore, the existing optimization methods for calling FPGA off-chip memory can hardly meet the requirement, in CNN computation, of flexibly configuring the data input of the convolution computation for different convolution kernel sizes, different sliding-window step lengths, and different numbers of convolutional computation units. The method of this embodiment determines, through formulas (1) and (2), the number L of FIFOs in the FIFO group and the number M of FIFOs that must output data to the outside of the group simultaneously, so that flexible configuration is achieved by adjusting the number of FIFOs in each FIFO group.
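To make that configurability concrete (a sweep of our own choosing, not taken from the patent), formulas (1) and (2) give the group configuration for several kernel/stride/N combinations:

```python
def group_config(kernel, stride, n):
    """(L, M) per formulas (1) and (2)."""
    return 2 * kernel + stride * (n - 2), kernel + stride * (n - 1)

# arbitrary illustrative parameter combinations
for kernel, stride, n in [(3, 1, 2), (5, 1, 2), (3, 2, 3), (7, 2, 2)]:
    L, M = group_config(kernel, stride, n)
    print(f"kernel={kernel} stride={stride} N={n} -> L={L}, M={M}")
```

Changing the kernel size, step length, or number of simultaneously generated window groups only changes L and M, i.e. how many FIFOs are instantiated and how many are read per operation; the read-write operation itself is unchanged.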
Of course, the present invention may also have various other embodiments. Those skilled in the art can make various corresponding changes and modifications according to the present invention without departing from its spirit and essence, but all such corresponding changes and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (1)

1. A data calling method based on an FPGA off-chip memory, applied to convolutional neural networks, characterized by comprising the following steps:
S1: set up a FIFO group in the FPGA on-chip memory, where the FIFO group contains L FIFOs; number the FIFOs consecutively from 1 to L, and determine the number M of FIFOs that must output data to the outside of the group simultaneously, specifically:
L = 2 × kernel + Stride × (N - 2)
M = kernel + Stride × (N - 1)
where kernel is the preset convolution kernel size, Stride is the step length of the sliding window used in the convolution computation, and N is the number of groups of sliding-window data that must be generated simultaneously, N ≥ 2;
S2: store the first L rows of the feature map in the FPGA off-chip memory into the FIFO group row by row, where each FIFO stores one row of the feature map, and the depth of each FIFO is greater than the size of the feature map;
S3: perform a read-write operation on each FIFO in the FIFO group, where the read-write operation is specifically:
for the first M FIFOs counted from the front, each FIFO outputs its first stored data word to the outside of the FIFO group as sliding-window data for the convolutional neural network, and its second data word becomes the first; for the last M FIFOs counted from the back, each FIFO writes its first stored data word to the tail of the FIFO whose number is L-M smaller than its own; at the same time, the first data word of row L+1 of the feature map is written to the tail of FIFO L-1 and the first data word of row L+2 to the tail of FIFO L, completing the update of every FIFO in the group;
S4: repeat step S3, performing the read-write operation again on each FIFO in the updated FIFO group, until all data of the feature map have been traversed.
CN201811545237.2A 2018-12-17 2018-12-17 Data calling method based on FPGA off-chip memory Active CN109800867B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811545237.2A CN109800867B (en) 2018-12-17 2018-12-17 Data calling method based on FPGA off-chip memory


Publications (2)

Publication Number Publication Date
CN109800867A true CN109800867A (en) 2019-05-24
CN109800867B CN109800867B (en) 2020-09-29

Family

ID=66556986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811545237.2A Active CN109800867B (en) 2018-12-17 2018-12-17 Data calling method based on FPGA off-chip memory

Country Status (1)

Country Link
CN (1) CN109800867B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170236053A1 (en) * 2015-12-29 2017-08-17 Synopsys, Inc. Configurable and Programmable Multi-Core Architecture with a Specialized Instruction Set for Embedded Application Based on Neural Networks
CN107392309A (en) * 2017-09-11 2017-11-24 东南大学—无锡集成电路技术研究所 A kind of general fixed-point number neutral net convolution accelerator hardware structure based on FPGA
CN107862650A (en) * 2017-11-29 2018-03-30 中科亿海微电子科技(苏州)有限公司 The method of speed-up computation two dimensional image CNN convolution
CN108229645A (en) * 2017-04-28 2018-06-29 北京市商汤科技开发有限公司 Convolution accelerates and computation processing method, device, electronic equipment and storage medium
CN108681984A (en) * 2018-07-26 2018-10-19 珠海市微半导体有限公司 A kind of accelerating circuit of 3*3 convolution algorithms
CN108717571A (en) * 2018-06-01 2018-10-30 阿依瓦(北京)技术有限公司 A kind of acceleration method and device for artificial intelligence
CN108764182A (en) * 2018-06-01 2018-11-06 阿依瓦(北京)技术有限公司 A kind of acceleration method and device for artificial intelligence of optimization
US20180341621A1 (en) * 2017-05-23 2018-11-29 Korea University Research And Business Foundation Bi-directional fifo memory and convolution processing device using the same

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915322B (en) * 2015-06-09 2018-05-01 中国人民解放军国防科学技术大学 A kind of hardware-accelerated method of convolutional neural networks
CN106228240B (en) * 2016-07-30 2020-09-01 复旦大学 Deep convolution neural network implementation method based on FPGA
CN106250103A (en) * 2016-08-04 2016-12-21 东南大学 A kind of convolutional neural networks cyclic convolution calculates the system of data reusing


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021232843A1 (en) * 2020-05-22 2021-11-25 浪潮电子信息产业股份有限公司 Image data storage method, image data processing method and system, and related apparatus
EP4156079A4 (en) * 2020-05-22 2024-03-27 Inspur Electronic Information Industry Co., Ltd Image data storage method, image data processing method and system, and related apparatus
CN112488305A (en) * 2020-12-22 2021-03-12 西北工业大学 Neural network storage organization structure and configurable management method thereof
CN112488305B (en) * 2020-12-22 2023-04-18 西北工业大学 Neural network storage device and configurable management method thereof

Also Published As

Publication number Publication date
CN109800867B (en) 2020-09-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant