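CN111226231A - Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions - Google Patents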

Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions

Info

Publication number
CN111226231A
Authority
CN
China
Prior art keywords
neural network
layer
instruction
per
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201880067687.7A
Other languages
English (en)
Chinese (zh)
Inventor
A·吴
E·德拉耶
E·盖塞米
滕晓
J·泽杰达
吴永军
S·塞特勒
A·西拉萨奥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xilinx Inc
Original Assignee
Xilinx Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xilinx Inc
Publication of CN111226231A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
  • Image Analysis (AREA)
  • Complex Calculations (AREA)
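CN201880067687.7A 2017-10-17 2018-10-16 Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions Pending CN111226231A (zh)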

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15/785,800 2017-10-17
US15/785,800 US11620490B2 (en) 2017-10-17 2017-10-17 Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions
PCT/US2018/056112 WO2019079319A1 (en) 2017-10-17 2018-10-16 MULTI-LAYER NEURAL NETWORK PROCESSING BY A NEURAL NETWORK ACCELERATOR USING HOST COMMUNICATED MERGED WEIGHTS AND A PACKAGE OF PER-LAYER INSTRUCTIONS

Publications (1)

Publication Number Publication Date
CN111226231A (zh) 2020-06-02

Family

ID=64110172

Family Applications (1)

Application Number Title Priority Date Filing Date
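CN201880067687.7A Pending CN111226231A (zh) 2017-10-17 2018-10-16 Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions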

Country Status (6)

Country Link
US (1) US11620490B2 (en)
EP (1) EP3698296B1 (en)
JP (1) JP7196167B2 (en)
KR (1) KR102578508B1 (en)
CN (1) CN111226231A (en)
WO (1) WO2019079319A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
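CN113326479A (zh) * 2021-05-28 2021-08-31 Harbin University of Science and Technology Implementation method of an FPGA-based k-means algorithm
CN114595077A (zh) * 2020-12-07 2022-06-07 Nvidia Corporation Application programming interface for neural network computation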
US12072834B1 (en) 2023-04-06 2024-08-27 Moffett International Co., Limited Hierarchical networks on chip (NoC) for neural network accelerator

Families Citing this family (18)

Publication number Priority date Publication date Assignee Title
US11037330B2 (en) * 2017-04-08 2021-06-15 Intel Corporation Low rank matrix compression
US11386644B2 (en) * 2017-10-17 2022-07-12 Xilinx, Inc. Image preprocessing for generalized image processing
US10565285B2 (en) * 2017-12-18 2020-02-18 International Business Machines Corporation Processor and memory transparent convolutional lowering and auto zero padding for deep neural network implementations
US11250107B2 (en) * 2019-07-15 2022-02-15 International Business Machines Corporation Method for interfacing with hardware accelerators
US11573828B2 (en) * 2019-09-16 2023-02-07 Nec Corporation Efficient and scalable enclave protection for machine learning programs
US11501145B1 (en) * 2019-09-17 2022-11-15 Amazon Technologies, Inc. Memory operation for systolic array
KR102463123B1 (ko) * 2019-11-29 2022-11-04 Korea Electronics Technology Institute Method for efficient control, monitoring and software debugging of a neural network accelerator
US20200134417A1 (en) * 2019-12-24 2020-04-30 Intel Corporation Configurable processor element arrays for implementing convolutional neural networks
US11132594B2 (en) 2020-01-03 2021-09-28 Capital One Services, Llc Systems and methods for producing non-standard shaped cards
US11182159B2 (en) * 2020-02-26 2021-11-23 Google Llc Vector reductions using shared scratchpad memory
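CN111461316A (zh) * 2020-03-31 2020-07-28 Cambricon Technologies Corporation Limited Method, device, board card and computer-readable storage medium for computing a neural network
CN111461315A (zh) * 2020-03-31 2020-07-28 Cambricon Technologies Corporation Limited Method, device, board card and computer-readable storage medium for computing a neural network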
US11783163B2 (en) * 2020-06-15 2023-10-10 Arm Limited Hardware accelerator for IM2COL operation
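KR102860333B1 (ko) * 2020-06-22 2025-09-16 Samsung Electronics Co., Ltd. Accelerator, method of operating the accelerator, and accelerator system including the same
KR102859455B1 (ko) * 2020-08-31 2025-09-12 Samsung Electronics Co., Ltd. Accelerator, method of operating the accelerator, and electronic device including the same
CN113485762B (zh) * 2020-09-19 2024-07-26 Guangdong Gowin Semiconductor Technology Co., Ltd. Method and apparatus for offloading computing tasks with configurable devices to improve system performance
CN112613605A (zh) * 2020-12-07 2021-04-06 DeepBlue Artificial Intelligence (Shenzhen) Co., Ltd. Neural network acceleration control method and apparatus, electronic device, and storage medium
CN112580787B (zh) 2020-12-25 2023-11-17 Beijing Baidu Netcom Science and Technology Co., Ltd. Data processing method, apparatus, device and storage medium for a neural network accelerator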

Citations (5)

Publication number Priority date Publication date Assignee Title
US20160350645A1 (en) * 2015-05-29 2016-12-01 Samsung Electronics Co., Ltd. Data-optimized neural network traversal
US20170124452A1 (en) * 2015-10-28 2017-05-04 Google Inc. Processing computational graphs
US9710265B1 (en) * 2016-10-27 2017-07-18 Google Inc. Neural network compute tile
US20170220352A1 (en) * 2016-02-03 2017-08-03 Google Inc. Accessing data in multi-dimensional tensors
US20190050717A1 (en) * 2017-08-11 2019-02-14 Google Llc Neural network accelerator with parameters resident on chip

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
US6346825B1 (en) 2000-10-06 2002-02-12 Xilinx, Inc. Block RAM with configurable data width and parity for use in a field programmable gate array
WO2014204615A2 (en) * 2013-05-22 2014-12-24 Neurala, Inc. Methods and apparatus for iterative nonspecific distributed runtime architecture and its application to cloud intelligence
US10339041B2 (en) * 2013-10-11 2019-07-02 Qualcomm Incorporated Shared memory architecture for a neural simulator
US11099918B2 (en) * 2015-05-11 2021-08-24 Xilinx, Inc. Accelerating algorithms and applications on FPGAs
US10083395B2 (en) * 2015-05-21 2018-09-25 Google Llc Batch processing in a neural network processor
US10891538B2 (en) * 2016-08-11 2021-01-12 Nvidia Corporation Sparse convolutional neural network accelerator
US10802992B2 (en) * 2016-08-12 2020-10-13 Xilinx Technology Beijing Limited Combining CPU and special accelerator for implementing an artificial neural network
CN107239823A (zh) * 2016-08-12 2017-10-10 Beijing Deephi Technology Co., Ltd. Device and method for implementing a sparse neural network
US10489702B2 (en) * 2016-10-14 2019-11-26 Intel Corporation Hybrid compression scheme for efficient storage of synaptic weights in hardware neuromorphic cores
US10949736B2 (en) * 2016-11-03 2021-03-16 Intel Corporation Flexible neural network accelerator and methods therefor
KR102224510B1 (ko) * 2016-12-09 2021-03-05 Beijing Horizon Information Technology Co., Ltd. Systems and methods for data management

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20160350645A1 (en) * 2015-05-29 2016-12-01 Samsung Electronics Co., Ltd. Data-optimized neural network traversal
US20170124452A1 (en) * 2015-10-28 2017-05-04 Google Inc. Processing computational graphs
US20170220352A1 (en) * 2016-02-03 2017-08-03 Google Inc. Accessing data in multi-dimensional tensors
US9710265B1 (en) * 2016-10-27 2017-07-18 Google Inc. Neural network compute tile
US20190050717A1 (en) * 2017-08-11 2019-02-14 Google Llc Neural network accelerator with parameters resident on chip

Non-Patent Citations (3)

Title
JIANTAO QIU ET AL.: "Going Deeper with Embedded FPGA Platform for Convolutional Neural Network", 《PROCEEDINGS OF THE 2016 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, FPGA '16》, pages 6 - 8 *
SRIMAT CHAKRADHAR ET AL.: "A Dynamically Configurable Coprocessor for Convolutional Neural Networks", 《PROCEEDINGS OF THE 37TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE》 *
ZHEN LI ET AL.: "A survey of neural network accelerators", 《FRONTIERS OF COMPUTER SCIENCE, SPRINGER BERLIN HEIDELBERG, BERLIN/HEIDELBERG》, vol. 11, no. 5, pages 29 - 37 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
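CN114595077A (zh) * 2020-12-07 2022-06-07 Nvidia Corporation Application programming interface for neural network computation
CN113326479A (zh) * 2021-05-28 2021-08-31 Harbin University of Science and Technology Implementation method of an FPGA-based k-means algorithm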
US12072834B1 (en) 2023-04-06 2024-08-27 Moffett International Co., Limited Hierarchical networks on chip (NoC) for neural network accelerator
WO2024207309A1 (en) * 2023-04-06 2024-10-10 Moffett International Co., Limited Hierarchical networks on chip (noc) for neural network accelerator
US12292852B2 (en) 2023-04-06 2025-05-06 Moffett International Co., Limited Hierarchical networks on chip (NoC) for neural network accelerator

Also Published As

Publication number Publication date
KR20200069338A (ko) 2020-06-16
WO2019079319A1 (en) 2019-04-25
US20190114529A1 (en) 2019-04-18
KR102578508B1 (ko) 2023-09-13
JP2020537785A (ja) 2020-12-24
EP3698296B1 (en) 2024-07-17
JP7196167B2 (ja) 2022-12-26
EP3698296A1 (en) 2020-08-26
US11620490B2 (en) 2023-04-04

Similar Documents

Publication Publication Date Title
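CN111226231A (zh) Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions
KR102697368B1 (ko) Image preprocessing for generalized image processing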
EP3698293B1 (en) Neural network processing system having multiple processors and a neural network accelerator
US11429848B2 (en) Host-directed multi-layer neural network processing via per-layer work requests
CN111771215B (zh) Static block scheduling in massively parallel software-defined hardware systems
US10984500B1 (en) Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit
US10354733B1 (en) Software-defined memory bandwidth reduction by hierarchical stream buffering for general matrix multiplication in a programmable IC
US10515135B1 (en) Data format suitable for fast massively parallel general matrix multiplication in a programmable IC
JP7434146B2 (ja) Architecture-optimized training of neural networks
US11568218B2 (en) Neural network processing system having host controlled kernel acclerators
CN107346351B (zh) Method and system for designing an FPGA based on hardware requirements defined in source code
US11204747B1 (en) Re-targetable interface for data exchange between heterogeneous systems and accelerator abstraction into software instructions
US10943039B1 (en) Software-driven design optimization for fixed-point multiply-accumulate circuitry

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200602