CN110084363A - A deep learning model acceleration method based on an FPGA platform - Google Patents
- Publication number
- CN110084363A (application CN201910400924.3A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- learning model
- fpga
- hardware
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Neurology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
Description
Resource | DSP | BRAM | LUT | FF |
---|---|---|---|---|
Used | 2240 | 1024 | 186251 | 205704 |
Available | 2520 | 1824 | 274080 | 548160 |
Utilization | 88.9% | 56.1% | 68% | 37.5% |
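The utilization row follows directly from the Used and Available rows (used divided by available, as a percentage). A minimal sketch that recomputes it is shown below; the dictionary layout and the helper name `utilization_percent` are illustrative assumptions, not part of the patent record, and only the numeric values come from the table.

```python
# Resource counts copied from the utilization table: (used, available).
# The structure and names here are illustrative, not taken from the patent.
resources = {
    "DSP": (2240, 2520),
    "BRAM": (1024, 1824),
    "LUT": (186251, 274080),
    "FF": (205704, 548160),
}


def utilization_percent(used: int, available: int) -> float:
    """Return used/available as a percentage rounded to one decimal place."""
    return round(100.0 * used / available, 1)


# Recompute the utilization figure for every resource type.
report = {name: utilization_percent(u, a) for name, (u, a) in resources.items()}

for name, pct in report.items():
    print(f"{name}: {pct}%")
```

The recomputed values agree with the table once rounded to one decimal place; the table lists the LUT figure as a whole number.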
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910400924.3A CN110084363B (zh) | 2019-05-15 | 2019-05-15 | A deep learning model acceleration method based on an FPGA platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110084363A (zh) | 2019-08-02 |
CN110084363B (zh) | 2023-04-25 |
Family
ID=67420182
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910400924.3A Active CN110084363B (zh) | A deep learning model acceleration method based on an FPGA platform | 2019-05-15 | 2019-05-15 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110084363B (zh) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106228240A (zh) * | 2016-07-30 | 2016-12-14 | 复旦大学 | FPGA-based deep convolutional neural network implementation method |
CN106228238A (zh) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | Method and system for accelerating deep learning algorithms on a field-programmable gate array platform |
US20180046913A1 (en) * | 2016-08-12 | 2018-02-15 | DeePhi Technology Co., Ltd. | Combining cpu and special accelerator for implementing an artificial neural network |
CN107862374A (zh) * | 2017-10-30 | 2018-03-30 | 中国科学院计算技术研究所 | Pipeline-based neural network processing system and processing method |
US20180189641A1 (en) * | 2017-01-04 | 2018-07-05 | Stmicroelectronics S.R.L. | Hardware accelerator engine |
CN108520300A (zh) * | 2018-04-09 | 2018-09-11 | 郑州云海信息技术有限公司 | Implementation method and apparatus for a deep learning network |
CN109583006A (zh) * | 2018-10-16 | 2019-04-05 | 浙江工业大学 | Dynamic optimization method for field-programmable gate array convolutional layers based on loop tiling and reordering |
KR20190038318A (ko) * | 2017-09-29 | 2019-04-08 | 인피니온 테크놀로지스 아게 | Accelerating convolutional neural network computation throughput |
CN109740731A (zh) * | 2018-12-15 | 2019-05-10 | 华南理工大学 | Design method for an adaptive convolutional layer hardware accelerator |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
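CN110516796A (zh) * | 2019-08-28 | 2019-11-29 | 西北工业大学 | A grouped convolution process optimization method for embedded platforms |
CN110516795A (zh) * | 2019-08-28 | 2019-11-29 | 北京达佳互联信息技术有限公司 | Method, apparatus, and electronic device for allocating processors to model variables |
CN110738311A (zh) * | 2019-10-14 | 2020-01-31 | 哈尔滨工业大学 | LSTM network acceleration method based on high-level synthesis |
CN112101537A (zh) * | 2020-09-17 | 2020-12-18 | 广东高云半导体科技股份有限公司 | CNN accelerator and electronic device |
CN112101537B (zh) * | 2020-09-17 | 2021-08-03 | 广东高云半导体科技股份有限公司 | CNN accelerator and electronic device |
CN113780553A (zh) * | 2021-09-09 | 2021-12-10 | 中山大学 | A deep learning model optimization method and system based on a high-level synthesis tool |
CN113780553B (zh) * | 2021-09-09 | 2023-11-07 | 中山大学 | A deep learning model optimization method and system based on a high-level synthesis tool |
CN114754801A (zh) * | 2022-06-16 | 2022-07-15 | 北京理工导航控制科技股份有限公司 | Neural-network-based method, apparatus, and storage medium for temperature compensation of fiber-optic gyroscope zero bias |
CN114754801B (zh) * | 2022-06-16 | 2022-08-26 | 北京理工导航控制科技股份有限公司 | Neural-network-based method, apparatus, and storage medium for temperature compensation of fiber-optic gyroscope zero bias |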
Also Published As
Publication number | Publication date |
---|---|
CN110084363B (zh) | 2023-04-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
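CN110084363A (zh) | A deep learning model acceleration method based on an FPGA platform | |
CN110378468B (zh) | A neural network accelerator based on structured pruning and low-bit quantization | |
CN109948029B (zh) | Deep hashing image search method based on neural network adaptation | |
CN111242289B (zh) | A scalable convolutional neural network acceleration system and method | |
CN111967468A (zh) | An FPGA-based implementation method for a lightweight object detection neural network | |
CN110515303A (zh) | A DDQN-based adaptive dynamic path planning method | |
CN106201651A (zh) | Simulator for neuromorphic chips | |
CN111831254A (zh) | Image processing acceleration method, image processing model storage method, and corresponding apparatus | |
CN112101525A (zh) | Method, apparatus, and system for designing neural networks via NAS | |
CN113220630B (zh) | Reconfigurable array optimization method and automatic tuning method for a hardware accelerator | |
CN109934336A (zh) | Design method for a neural network dynamic acceleration platform based on optimal structure search, and the neural network dynamic acceleration platform | |
CN110188880A (zh) | Quantization method and apparatus for deep neural networks | |
CN114580636B (zh) | Lightweight neural network deployment method based on three-objective joint optimization | |
CN116401502B (zh) | Method and apparatus for optimizing Winograd convolution based on NUMA system characteristics | |
CN116384157B (zh) | Land use change simulation method | |
CN110750560B (zh) | System and method for optimizing network multi-connection | |
CN112598129A (zh) | Tunable hardware-aware pruning and mapping framework based on a ReRAM neural network accelerator | |
CN116755876A (zh) | Hybrid parallel training acceleration method and system for large models | |
CN114218736A (zh) | Many-core optimization method for the ROMS ocean model | |
CN117193988A (zh) | Task scheduling method and medium for a wafer-scale architecture AI acceleration chip | |
CN115600637A (zh) | Automatic architecture optimization method for dataflow neural network accelerator design | |
CN113986816B (zh) | Reconfigurable computing chip | |
KR102508635B1 (ko) | Deep learning model training method and trainer | |
CN109597619A (zh) | An adaptive compilation framework for heterogeneous multi-core architectures | |
CN112434817B (zh) | Method, apparatus, and computer storage medium for building a communication algorithm database |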
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-05-27; Address after: Room 24, Floor 2, Unit 1, Building 1, No. 73, Section 2, Second Ring Road West, Qingyang District, Chengdu, 610000, Sichuan; Patentee after: Aegis Defense Technology (Chengdu) Co.,Ltd.; Country or region after: China; Address before: 610041 floor 5, building 1, No. 21, Gaopeng Avenue, high tech Zone, Chengdu, Sichuan; Patentee before: Electric Coreda (Chengdu) Technology Co.,Ltd.; Country or region before: China |