JP7401513B2 - Sparse matrix multiplication in hardware - Google Patents
Sparse matrix multiplication in hardware
- Publication number
- JP7401513B2
- Authority
- JP
- Japan
- Prior art keywords
- shard
- sparse
- input
- vector
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8046—Systolic arrays
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
- G06F7/487—Multiplying; Dividing
- G06F7/4876—Multiplying
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
- G06F7/501—Half or full adders, i.e. basic adder cells for one denomination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/76—Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
- G06F7/78—Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0495—Quantised networks; Sparse networks; Compressed networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Mathematical Optimization (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Algebra (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Nonlinear Science (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Complex Calculations (AREA)
- Neurology (AREA)
- Advance Control (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023206881A JP7793585B2 (ja) | 2021-05-25 | 2023-12-07 | Sparse matrix multiplication in hardware |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/329,259 US12189710B2 (en) | 2021-05-25 | 2021-05-25 | Sparse matrix multiplication in hardware |
| US17/329,259 | 2021-05-25 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023206881A Division JP7793585B2 (ja) | 2021-05-25 | 2023-12-07 | Sparse matrix multiplication in hardware |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2022181161A (ja) | 2022-12-07 |
| JP7401513B2 (ja) | 2023-12-19 |
Family
ID=80222142
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021207147A Active JP7401513B2 (ja) | 2021-05-25 | 2021-12-21 | Sparse matrix multiplication in hardware |
| JP2023206881A Active JP7793585B2 (ja) | 2021-05-25 | 2023-12-07 | Sparse matrix multiplication in hardware |
Family Applications After (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023206881A Active JP7793585B2 (ja) | 2021-05-25 | 2023-12-07 | Sparse matrix multiplication in hardware |
Country Status (5)
| Country | Link |
|---|---|
| US (2) | US12189710B2 |
| EP (1) | EP4095719A1 |
| JP (2) | JP7401513B2 |
| KR (2) | KR102601034B1 |
| CN (1) | CN114329329B |
Families Citing this family (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220012304A1 (en) * | 2020-07-07 | 2022-01-13 | Sudarshan Kumar | Fast matrix multiplication |
| US11940907B2 (en) * | 2021-06-25 | 2024-03-26 | Intel Corporation | Methods and apparatus for sparse tensor storage for neural network accelerators |
| US20220012012A1 (en) * | 2021-09-24 | 2022-01-13 | Martin Langhammer | Systems and Methods for Sparsity Operations in a Specialized Processing Block |
| US20230267169A1 (en) * | 2022-02-24 | 2023-08-24 | Xilinx, Inc. | Sparse matrix dense vector multiplication circuitry |
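| CN115470450B (zh) * | 2022-08-30 | 2025-09-05 | 无锡江南计算技术研究所 | Matrix multiplication device and low-overhead anomaly localization method thereof |
| WO2024108584A1 (zh) * | 2022-11-25 | 2024-05-30 | 华为技术有限公司 | Sparse operator processing method and apparatus |
| KR20240081961A (ko) * | 2022-12-01 | 2024-06-10 | 삼성전자주식회사 | Electronic device for converting the compressed storage format of a sparse matrix and operating method thereof |
| KR102745798B1 (ko) * | 2023-12-26 | 2024-12-23 | 리벨리온 주식회사 | Data operation method and data operation apparatus supporting the same |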
| US20260111173A1 (en) * | 2024-10-17 | 2026-04-23 | Edgecortix Inc. | On-chip non-zero value unpacking and distribution |
| CN119602947B (zh) * | 2024-11-14 | 2025-10-03 | 西安交通大学 | Binary polynomial multiplier and encryption method for the BIKE cryptographic algorithm |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9697176B2 (en) | 2014-11-14 | 2017-07-04 | Advanced Micro Devices, Inc. | Efficient sparse matrix-vector multiplication on parallel processors |
| US9760538B2 (en) | 2014-12-22 | 2017-09-12 | Palo Alto Research Center Incorporated | Computer-implemented system and method for efficient sparse matrix representation and processing |
| US10528321B2 (en) | 2016-12-07 | 2020-01-07 | Microsoft Technology Licensing, Llc | Block floating point for neural network implementations |
| US10489063B2 (en) | 2016-12-19 | 2019-11-26 | Intel Corporation | Memory-to-memory instructions to accelerate sparse-matrix by dense-vector and sparse-vector by dense-vector multiplication |
| US11216722B2 (en) | 2016-12-31 | 2022-01-04 | Intel Corporation | Hardware accelerator template and design framework for implementing recurrent neural networks |
| US10180928B2 (en) | 2016-12-31 | 2019-01-15 | Intel Corporation | Heterogeneous hardware accelerator architecture for processing sparse matrix data with skewed non-zero distributions |
| WO2018134740A2 (en) * | 2017-01-22 | 2018-07-26 | Gsi Technology Inc. | Sparse matrix multiplication in associative memory device |
| US10572568B2 (en) * | 2018-03-28 | 2020-02-25 | Intel Corporation | Accelerator for sparse-dense matrix multiplication |
| US10726096B2 (en) | 2018-10-12 | 2020-07-28 | Hewlett Packard Enterprise Development Lp | Sparse matrix vector multiplication with a matrix vector multiplication unit |
| KR102838677B1 (ko) * | 2019-03-15 | 2025-07-25 | 인텔 코포레이션 | Sparse optimizations for a matrix accelerator architecture |
| US11188618B2 (en) * | 2019-09-05 | 2021-11-30 | Intel Corporation | Sparse matrix multiplication acceleration mechanism |
| US11663746B2 (en) * | 2019-11-15 | 2023-05-30 | Intel Corporation | Systolic arithmetic on sparse data |
| US12141229B2 (en) * | 2021-05-19 | 2024-11-12 | Nvidia Corporation | Techniques for accelerating matrix multiplication computations using hierarchical representations of sparse matrices |
2021
- 2021-05-25: US application US17/329,259, published as US12189710B2 (Active)
- 2021-12-21: JP application JP2021207147A, published as JP7401513B2 (Active)
- 2021-12-30: CN application CN202111665133.7A, published as CN114329329B (Active)
2022
- 2022-02-07: EP application EP22155308.4A, published as EP4095719A1 (Pending)
- 2022-02-09: KR application KR1020220016772A, published as KR102601034B1 (Active)
2023
- 2023-11-06: KR application KR1020230151637A, published as KR20230155417A (Pending)
- 2023-12-07: JP application JP2023206881A, published as JP7793585B2 (Active)
2024
- 2024-11-12: US application US18/944,274, published as US20250068694A1 (Pending)
Non-Patent Citations (3)
| Title |
|---|
| HE XIN ET AL, Sparse-TPU adapting systolic arrays for sparse matrices, PROCEEDINGS OF THE 34TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACM, USA, June 29, 2020, pages 1-12, [retrieved January 5, 2023], Internet <URL:https://tnm.engin.umich.edu/wp-content/uploads/sites/353/2020/08/2020.6.sparse-tpu_ics2020.pdf> |
| QIN ERIC ET AL, SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training, 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), USA, IEEE, February 22, 2020, pages 58-70, [online], [retrieved January 5, 2023], Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9065523> |
| YAN MINGYU ET AL, HyGCN: A GCN Accelerator with Hybrid Architecture, 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), USA, IEEE, February 22, 2020, pages 15-29, [retrieved January 5, 2023], Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9065592> |
Also Published As
| Publication number | Publication date |
|---|---|
| KR20220159257A (ko) | 2022-12-02 |
| US20220382829A1 (en) | 2022-12-01 |
| KR102601034B1 (ko) | 2023-11-09 |
| US12189710B2 (en) | 2025-01-07 |
| JP7793585B2 (ja) | 2026-01-05 |
| EP4095719A1 (en) | 2022-11-30 |
| US20250068694A1 (en) | 2025-02-27 |
| CN114329329B (zh) | 2026-02-17 |
| JP2024028901A (ja) | 2024-03-05 |
| KR20230155417A (ko) | 2023-11-10 |
| CN114329329A (zh) | 2022-04-12 |
| JP2022181161A (ja) | 2022-12-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7401513B2 (ja) | | Sparse matrix multiplication in hardware |
| Kung | | Systolic algorithms for the CMU WARP processor |
| CN109902804B (zh) | | Pooling operation method and apparatus |
| EP2017743B1 (en) | | High speed and efficient matrix multiplication hardware module |
| EP4579445A2 (en) | | Accessing data in multi-dimensional tensors using adders |
| WO2022037257A1 (zh) | | Convolution computation engine, artificial intelligence chip, and data processing method |
| US11983616B2 (en) | | Methods and apparatus for constructing digital circuits for performing matrix operations |
| Li et al. | | VBSF: a new storage format for SIMD sparse matrix–vector multiplication on modern processors |
| JP2022541721A (ja) | | Systems and methods supporting alternative number formats for efficient multiplication |
| CN109074516A (zh) | | Computation processing device and computation processing method |
| US20180373677A1 (en) | | Apparatus and Methods of Providing Efficient Data Parallelization for Multi-Dimensional FFTs |
| TW202020654A (zh) | | Digital circuit with compressed carry |
| CN112446007A (zh) | | Matrix operation method, operation device, and processor |
| CN115485656A (zh) | | In-memory processing method for convolution operations |
| EP4128064A1 (en) | | Power reduction for machine learning accelerator |
| KR102372869B1 (ko) | | Matrix operator and matrix operation method for artificial neural networks |
| CN109598335B (zh) | | Two-dimensional convolution systolic array structure and implementation method |
| CN1020170C (zh) | | High-speed digital processor |
| WO2023006170A1 (en) | | Devices and methods for providing computationally efficient neural networks |
| Wang et al. | | An FPGA-based reconfigurable CNN training accelerator using decomposable Winograd |
| US6718465B1 (en) | | Reconfigurable inner product processor architecture implementing square recursive decomposition of partial product matrices |
| CN112766473A (zh) | | Computing device and related products |
| CN117786285A (zh) | | Data processing method, apparatus, device, and storage medium |
| HK40064573A (zh) | | Sparse matrix multiplication in hardware |
| CN117077734B (zh) | | Convolution input transformation method, hardware accelerator, and accelerator structure determination method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2022-04-06 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2022-04-06 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2023-06-06 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2023-08-16 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| | TRDD | Decision of grant or rejection written | |
| 2023-11-14 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| 2023-12-07 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7401513; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |