CN114127689B - Method for interfacing with hardware accelerators - Google Patents
Method for interfacing with hardware accelerators
- Publication number
- CN114127689B (application CN202080051285.5A)
- Authority
- CN
- China
- Prior art keywords
- operations
- routine
- operation set
- hardware accelerator
- computing task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/065—Analogue means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Neurology (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Complex Calculations (AREA)
- Advance Control (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/511,689 US11250107B2 (en) | 2019-07-15 | 2019-07-15 | Method for interfacing with hardware accelerators |
| US16/511,689 | 2019-07-15 | | |
| PCT/EP2020/068377 WO2021008868A1 (en) | 2019-07-15 | 2020-06-30 | A method for interfacing with hardware accelerators |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114127689A (zh) | 2022-03-01 |
| CN114127689B (zh) | 2025-06-06 |
Family
ID=71409414
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202080051285.5A Active CN114127689B (zh) | Method for interfacing with hardware accelerators | 2019-07-15 | 2020-06-30 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11250107B2 |
| EP (1) | EP3999957B1 |
| JP (1) | JP7361192B2 |
| CN (1) | CN114127689B |
| WO (1) | WO2021008868A1 |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3933709A1 (en) * | 2020-06-30 | 2022-01-05 | Upstride | Graph processing method and system |
| EP4285286A1 (en) * | 2021-02-01 | 2023-12-06 | Microsoft Technology Licensing, LLC | Semi-programmable and reconfigurable co-accelerator for a deep neural network with normalization or non-linearity |
| CN115904696A (zh) * | 2021-09-30 | 2023-04-04 | Imagination Technologies Limited | Method and apparatus for configuring a neural network accelerator with a configurable pipeline |
| TR2021020689A2 (tr) * | 2021-12-22 | 2023-07-21 | Havelsan Hava Elektronik Sanayi Ve Ticaret Anonim Sirketi | Ensemble learning with parallel artificial neural networks in embedded and integrated systems |
| US20240161222A1 (en) * | 2022-11-16 | 2024-05-16 | Nvidia Corporation | Application programming interface to indicate image-to-column transformation |
| US12455900B1 (en) | 2023-03-07 | 2025-10-28 | QEngine LLC | Method for executing a query in a multi-dimensional data space using vectorization and a related system |
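| KR102740239B1 (ko) * | 2023-03-24 | 2024-12-10 | Korea Advanced Institute of Science and Technology | Scalable vector-array heterogeneous accelerator architecture and scheduling technique for multi-neural-network acceleration |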
Family Cites Families (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6081890A (en) * | 1998-11-30 | 2000-06-27 | Intel Corporation | Method of communication between firmware written for different instruction set architectures |
| US7219085B2 (en) * | 2003-12-09 | 2007-05-15 | Microsoft Corporation | System and method for accelerating and optimizing the processing of machine learning techniques using a graphics processing unit |
| US8250578B2 (en) | 2008-02-22 | 2012-08-21 | International Business Machines Corporation | Pipelining hardware accelerators to computer systems |
| US7984267B2 (en) * | 2008-09-04 | 2011-07-19 | International Business Machines Corporation | Message passing module in hybrid computing system starting and sending operation information to service program for accelerator to execute application program |
| US8751556B2 (en) | 2010-06-11 | 2014-06-10 | Massachusetts Institute Of Technology | Processor for large graph algorithm computations and matrix operations |
| US8752064B2 (en) * | 2010-12-14 | 2014-06-10 | Advanced Micro Devices, Inc. | Optimizing communication of system call requests |
| JP2012256150A (ja) * | 2011-06-08 | 2012-12-27 | Renesas Electronics Corp | Compiling device, compiling method, and program |
| US9411853B1 (en) | 2012-08-03 | 2016-08-09 | Healthstudio, LLC | In-memory aggregation system and method of multidimensional data processing for enhancing speed and scalability |
| US9471388B2 (en) * | 2013-03-14 | 2016-10-18 | Altera Corporation | Mapping network applications to a hybrid programmable many-core device |
| US20150006341A1 (en) * | 2013-06-27 | 2015-01-01 | Metratech Corp. | Billing transaction scheduling |
| US10540588B2 (en) | 2015-06-29 | 2020-01-21 | Microsoft Technology Licensing, Llc | Deep neural network processing on hardware accelerators with stacked memory |
| JP2018173672A (ja) * | 2015-09-03 | 2018-11-08 | Preferred Networks, Inc. | Implementation device |
| JP6658033B2 (ja) * | 2016-02-05 | 2020-03-04 | Fujitsu Limited | Arithmetic processing circuit and information processing device |
| US9646243B1 (en) | 2016-09-12 | 2017-05-09 | International Business Machines Corporation | Convolutional neural networks using resistive processing unit array |
| JP6724869B2 (ja) * | 2017-06-19 | 2020-07-15 | Denso Corporation | Method for adjusting the output level of neurons in a multilayer neural network |
| US11620490B2 (en) * | 2017-10-17 | 2023-04-04 | Xilinx, Inc. | Multi-layer neural network processing by a neural network accelerator using host communicated merged weights and a package of per-layer instructions |
| US10698766B2 (en) * | 2018-04-18 | 2020-06-30 | EMC IP Holding Company LLC | Optimization of checkpoint operations for deep learning computing |
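| CN108876702A (zh) * | 2018-06-21 | 2018-11-23 | Beijing University of Posts and Telecommunications | Training method and device for accelerating a distributed deep neural network |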
| US10620951B2 (en) * | 2018-06-22 | 2020-04-14 | Intel Corporation | Matrix multiplication acceleration of sparse matrices using column folding and squeezing |
2019
- 2019-07-15: US application US16/511,689, patent US11250107B2 (en), status: Active
2020
- 2020-06-30: CN application CN202080051285.5A, patent CN114127689B (zh), status: Active
- 2020-06-30: WO application PCT/EP2020/068377, publication WO2021008868A1 (en), status: Ceased
- 2020-06-30: EP application EP20735563.7A, patent EP3999957B1 (en), status: Active
- 2020-06-30: JP application JP2022500757A, patent JP7361192B2 (ja), status: Active
Non-Patent Citations (1)
| Title |
|---|
| Mixed-precision architecture based on computational memory for training deep neural networks; S. R. Nandakumar; 2018 IEEE International Symposium on Circuits and Systems (ISCAS); 2018-04-26; abstract and Sections II-III * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2022541144A (ja) | 2022-09-22 |
| WO2021008868A1 (en) | 2021-01-21 |
| US20210019362A1 (en) | 2021-01-21 |
| JP7361192B2 (ja) | 2023-10-13 |
| US11250107B2 (en) | 2022-02-15 |
| EP3999957A1 (en) | 2022-05-25 |
| CN114127689A (zh) | 2022-03-01 |
| EP3999957B1 (en) | 2025-09-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114127689B (zh) | | Method for interfacing with hardware accelerators |
| US10691996B2 (en) | | Hardware accelerator for compressed LSTM |
| US10839292B2 (en) | | Accelerated neural network training using a pipelined resistive processing unit architecture |
| US10553207B2 (en) | | Systems and methods for employing predication in computational models |
| US20230297819A1 (en) | | Processor array for processing sparse binary neural networks |
| EP3298547A1 (en) | | Batch processing in a neural network processor |
| KR102396447B1 (ko) | | Operation acceleration device for artificial neural networks having a pipeline structure |
| JP2022541144A5 | | |
| US12530170B2 (en) | | Vector operation acceleration with convolution computation unit |
| US11586895B1 (en) | | Recursive neural network using random access memory |
| TWI842584B (zh) | | Computer-implemented method and computer-readable storage medium |
| WO2024187039A1 (en) | | Implementing and training computational efficient neural network architectures utilizing layer-skip logic |
| EP4202774A1 (en) | | Runtime predictors for neural network computation reduction |
| EP4612616A1 (en) | | Hardware-aware generation of machine learning models |
| US11734225B2 (en) | | Matrix tiling to accelerate computing in redundant matrices |
| US20210142153A1 (en) | | Resistive processing unit scalable execution |
| US20260056710A1 (en) | | Quantization and Low Precision AI Processor |
| US20250130771A1 (en) | | In-memory processing based on multiple weight sets |
| KR20240096459A (ko) | | Per-embedding-group activation quantization |
| JP2024110210A (ja) | | Reservoir computer and equipment state detection system |
| WO2023129491A1 (en) | | Compute element processing using control word templates |
| CN110929846A (zh) | | Inter-layer pipeline processing method for multilayer perceptron deep neural networks |
| Khamitov et al. | | Optimization of neural networks training with vector-free heuristic on Apache Spark |
| JPH02287862A (ja) | | Neural network computing device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |