CN110084363B - Deep learning model acceleration method based on an FPGA platform
- Publication number
- CN110084363B (application CN201910400924.3A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- learning model
- fpga
- model
- hardware
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Neurology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
Description
Resource | DSP | BRAM | LUT | FF |
---|---|---|---|---|
Used | 2240 | 1024 | 186251 | 205704 |
Available | 2520 | 1824 | 274080 | 548160 |
Utilization | 88.9% | 56.1% | 68% | 37.5% |
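The utilization figures in the table can be rechecked from the used and available counts. The following is a minimal sketch, assuming utilization means used divided by available, expressed as a percentage and rounded to one decimal place. The names `resources` and `utilization_percent` are illustrative and do not come from the patent.

```python
# Resource counts copied from the table above as (used, available) pairs.
resources = {
    "DSP": (2240, 2520),
    "BRAM": (1024, 1824),
    "LUT": (186251, 274080),
    "FF": (205704, 548160),
}


def utilization_percent(used, available):
    """Return used/available as a percentage rounded to one decimal place."""
    return round(100.0 * used / available, 1)


# Recompute the utilization row for every resource type.
utilization = {
    name: utilization_percent(used, available)
    for name, (used, available) in resources.items()
}

for name, pct in utilization.items():
    print(f"{name}: {pct}%")
```

Under this assumption the recomputed values agree with the table, with DSP slices as the most heavily used resource.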
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910400924.3A CN110084363B (zh) | 2019-05-15 | 2019-05-15 | Deep learning model acceleration method based on an FPGA platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910400924.3A CN110084363B (zh) | 2019-05-15 | 2019-05-15 | Deep learning model acceleration method based on an FPGA platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110084363A (zh) | 2019-08-02 |
CN110084363B (zh) | 2023-04-25 |
Family
ID=67420182
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910400924.3A Active CN110084363B (zh) | Deep learning model acceleration method based on an FPGA platform | 2019-05-15 | 2019-05-15 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110084363B (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
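CN110516795B (zh) * | 2019-08-28 | 2022-05-10 | Beijing Dajia Internet Information Technology Co., Ltd. | Method, apparatus, and electronic device for allocating processors to model variables |
CN110516796A (zh) * | 2019-08-28 | 2019-11-29 | Northwestern Polytechnical University | Grouped convolution process optimization method for embedded platforms |
CN110738311A (zh) * | 2019-10-14 | 2020-01-31 | Harbin Institute of Technology | LSTM network acceleration method based on high-level synthesis |
CN112101537B (zh) * | 2020-09-17 | 2021-08-03 | Guangdong Gowin Semiconductor Technology Co., Ltd. | CNN accelerator and electronic device |
CN113780553B (zh) * | 2021-09-09 | 2023-11-07 | Sun Yat-sen University | Deep learning model optimization method and system based on a high-level synthesis tool |
CN114754801B (zh) * | 2022-06-16 | 2022-08-26 | Beijing Ligong Navigation Control Technology Co., Ltd. | Neural-network-based method, apparatus, and storage medium for temperature compensation of fiber-optic gyroscope zero bias |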
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
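CN106228240A (zh) * | 2016-07-30 | 2016-12-14 | Fudan University | FPGA-based deep convolutional neural network implementation method |
CN106228238A (zh) * | 2016-07-27 | 2016-12-14 | University of Science and Technology of China, Suzhou Institute | Method and system for accelerating deep learning algorithms on a field-programmable gate array platform |
CN107862374A (zh) * | 2017-10-30 | 2018-03-30 | Institute of Computing Technology, Chinese Academy of Sciences | Pipeline-based neural network processing system and processing method |
CN108520300A (zh) * | 2018-04-09 | 2018-09-11 | Zhengzhou Yunhai Information Technology Co., Ltd. | Implementation method and apparatus for a deep learning network |
CN109583006A (zh) * | 2018-10-16 | 2019-04-05 | Zhejiang University of Technology | Dynamic optimization method for field-programmable gate array convolutional layers based on loop tiling and rearrangement |
KR20190038318A (ko) * | 2017-09-29 | 2019-04-08 | Infineon Technologies AG | Accelerating convolutional neural network computation throughput |
CN109740731A (zh) * | 2018-12-15 | 2019-05-10 | South China University of Technology | Adaptive convolutional layer hardware accelerator design method |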
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10802992B2 (en) * | 2016-08-12 | 2020-10-13 | Xilinx Technology Beijing Limited | Combining CPU and special accelerator for implementing an artificial neural network |
US20180189641A1 (en) * | 2017-01-04 | 2018-07-05 | Stmicroelectronics S.R.L. | Hardware accelerator engine |
Also Published As
Publication number | Publication date |
---|---|
CN110084363A (zh) | 2019-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
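CN110084363B (zh) | | Deep learning model acceleration method based on an FPGA platform |
CN110378468B (zh) | | Neural network accelerator based on structured pruning and low-bit quantization |
CN111459877B (zh) | | Winograd YOLOv2 object detection model method based on FPGA acceleration |
CN111416743B (zh) | | Convolutional network accelerator, configuration method, and computer-readable storage medium |
KR102572757B1 (ko) | | Modifying machine learning models to improve locality |
CN111967468A (zh) | | Implementation method for an FPGA-based lightweight object detection neural network |
CN113220457A (zh) | | Model deployment method, model deployment apparatus, terminal device, and readable storage medium |
CN113222133B (zh) | | FPGA-based compressed LSTM accelerator and acceleration method |
CN113392973B (zh) | | FPGA-based AI chip neural network acceleration method |
CN114580636B (zh) | | Lightweight neural network deployment method based on three-objective joint optimization |
CN111563582A (zh) | | Method for implementing and optimizing accelerated convolutional neural networks on an FPGA |
CN116755876A (zh) | | Hybrid-parallel training acceleration method and system for large models |
CN115186806A (zh) | | Distributed graph neural network training method supporting cross-node automatic differentiation |
CN112200310B (zh) | | Intelligent processor, data processing method, and storage medium |
CN110648768B (zh) | | POM ocean model optimization method and apparatus |
CN115130672B (zh) | | Method and apparatus for hardware-software co-optimization of convolutional neural network computation |
CN109271344B (zh) | | Data preprocessing method based on parallel file reading on the Sunway chip architecture |
KR102508635B1 (ko) | | Deep learning model training method and trainer |
CN109992413A (zh) | | Acceleration apparatus, method, and storage medium for breadth-first search algorithms |
CN109756908B (zh) | | Optimization method/system, storage medium, and device for wireless network caching policies |
CN116188239B (zh) | | Multi-request concurrent GPU graph random walk optimization implementation method and system |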
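CN113469327B (zh) | | Integrated circuit device that performs numeric conversion advancement |
CN111861860B (zh) | | Image acceleration processing system for AI SoC chips |
CN113469328B (zh) | | Apparatus, board card, method, and readable storage medium for performing numeric conversion pass-through |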
CN110362399B (zh) | | Plant root system optimization method suited to cloud storage replica placement |
Legal Events
Code | Title | Date | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |
TR01 | Transfer of patent right | | Effective date of registration: 2024-05-27. Patentee after: Aegis Defense Technology (Chengdu) Co.,Ltd. Address after: Room 24, Floor 2, Unit 1, Building 1, No. 73, Section 2, Second Ring Road West, Qingyang District, Chengdu, 610000, Sichuan. Country or region after: China. Patentee before: Electric Coreda (Chengdu) Technology Co.,Ltd. Address before: 610041 floor 5, building 1, No. 21, Gaopeng Avenue, high tech Zone, Chengdu, Sichuan. Country or region before: China |