JP7401513B2 - Sparse matrix multiplication in hardware - Google Patents

Sparse matrix multiplication in hardware

Info

Publication number
JP7401513B2
Authority
JP
Japan
Prior art keywords
shard
sparse
input
vector
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2021207147A
Other languages
English (en)
Japanese (ja)
Other versions
JP2022181161A (ja)
Inventor
Reiner Alwin Pope (レイナー・アルウィン・ポープ)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of JP2022181161A
Priority to JP2023206881A (JP7793585B2)
Application granted
Publication of JP7401513B2
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8046 Systolic arrays
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F7/487 Multiplying; Dividing
    • G06F7/4876 Multiplying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/501 Half or full adders, i.e. basic adder cells for one denomination
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G06F7/78 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Nonlinear Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
JP2021207147A 2021-05-25 2021-12-21 Sparse matrix multiplication in hardware Active JP7401513B2 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2023206881A JP7793585B2 (ja) 2021-05-25 2023-12-07 Sparse matrix multiplication in hardware

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/329,259 US12189710B2 (en) 2021-05-25 2021-05-25 Sparse matrix multiplication in hardware
US17/329,259 2021-05-25

Related Child Applications (1)

Application Number Priority Date Filing Date Title
JP2023206881A Division JP7793585B2 (ja) 2021-05-25 2023-12-07 Sparse matrix multiplication in hardware

Publications (2)

Publication Number Publication Date
JP2022181161A (ja) 2022-12-07
JP7401513B2 (ja) 2023-12-19

Family

ID=80222142

Family Applications (2)

Application Number Priority Date Filing Date Title
JP2021207147A Active JP7401513B2 (ja) 2021-05-25 2021-12-21 Sparse matrix multiplication in hardware
JP2023206881A Active JP7793585B2 (ja) 2021-05-25 2023-12-07 Sparse matrix multiplication in hardware

Family Applications After (1)

Application Number Priority Date Filing Date Title
JP2023206881A Active JP7793585B2 (ja) 2021-05-25 2023-12-07 Sparse matrix multiplication in hardware

Country Status (5)

Country Link
US (2) US12189710B2
EP (1) EP4095719A1
JP (2) JP7401513B2
KR (2) KR102601034B1
CN (1) CN114329329B

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220012304A1 (en) * 2020-07-07 2022-01-13 Sudarshan Kumar Fast matrix multiplication
US11940907B2 (en) * 2021-06-25 2024-03-26 Intel Corporation Methods and apparatus for sparse tensor storage for neural network accelerators
US20220012012A1 (en) * 2021-09-24 2022-01-13 Martin Langhammer Systems and Methods for Sparsity Operations in a Specialized Processing Block
US20230267169A1 (en) * 2022-02-24 2023-08-24 Xilinx, Inc. Sparse matrix dense vector multliplication circuitry
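CN115470450B (zh) * 2022-08-30 2025-09-05 Wuxi Jiangnan Institute of Computing Technology Matrix multiplication operation device and low-overhead anomaly localization method thereof
WO2024108584A1 (zh) * 2022-11-25 2024-05-30 Huawei Technologies Co., Ltd. Sparse operator processing method and apparatus
KR20240081961A (ko) * 2022-12-01 2024-06-10 Samsung Electronics Co., Ltd. Electronic device for converting the compressed storage format of a sparse matrix and operating method thereof
KR102745798B1 (ko) * 2023-12-26 2024-12-23 Rebellions Inc. Data operation method and data operation device supporting the same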
US20260111173A1 (en) * 2024-10-17 2026-04-23 Edgecortix Inc. On-chip non-zero value unpacking and distribution
CN119602947B (zh) * 2024-11-14 2025-10-03 Xi'an Jiaotong University Binary polynomial multiplier for the BIKE cryptographic algorithm and encryption method

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9697176B2 (en) 2014-11-14 2017-07-04 Advanced Micro Devices, Inc. Efficient sparse matrix-vector multiplication on parallel processors
US9760538B2 (en) 2014-12-22 2017-09-12 Palo Alto Research Center Incorporated Computer-implemented system and method for efficient sparse matrix representation and processing
US10528321B2 (en) 2016-12-07 2020-01-07 Microsoft Technology Licensing, Llc Block floating point for neural network implementations
US10489063B2 (en) 2016-12-19 2019-11-26 Intel Corporation Memory-to-memory instructions to accelerate sparse-matrix by dense-vector and sparse-vector by dense-vector multiplication
US11216722B2 (en) 2016-12-31 2022-01-04 Intel Corporation Hardware accelerator template and design framework for implementing recurrent neural networks
US10180928B2 (en) 2016-12-31 2019-01-15 Intel Corporation Heterogeneous hardware accelerator architecture for processing sparse matrix data with skewed non-zero distributions
WO2018134740A2 (en) * 2017-01-22 2018-07-26 Gsi Technology Inc. Sparse matrix multiplication in associative memory device
US10572568B2 (en) * 2018-03-28 2020-02-25 Intel Corporation Accelerator for sparse-dense matrix multiplication
US10726096B2 (en) 2018-10-12 2020-07-28 Hewlett Packard Enterprise Development Lp Sparse matrix vector multiplication with a matrix vector multiplication unit
KR102838677B1 (ko) * 2019-03-15 2025-07-25 Intel Corporation Sparse optimizations for a matrix accelerator architecture
US11188618B2 (en) * 2019-09-05 2021-11-30 Intel Corporation Sparse matrix multiplication acceleration mechanism
US11663746B2 (en) * 2019-11-15 2023-05-30 Intel Corporation Systolic arithmetic on sparse data
US12141229B2 (en) * 2021-05-19 2024-11-12 Nvidia Corporation Techniques for accelerating matrix multiplication computations using hierarchical representations of sparse matrices

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HE XIN ET AL, Sparse-TPU adapting systolic arrays for sparse matrices, PROCEEDINGS OF THE 34TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACMPUB27, USA, 2020-06-29, pages 1-12, [retrieved January 5, Heisei 5], Internet <URL:https://tnm.engin.umich.edu/wp-content/uploads/sites/353/2020/08/2020.6.sparse-tpu_ics2020.pdf>
QIN ERIC ET AL, SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training, 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), USA, IEEE, 2020-02-22, pages 58-70, [online], [retrieved January 5, Heisei 5], Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9065523>
YAN MINGYU ET AL, HyGCN: A GCN Accelerator with Hybrid Architecture, 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), USA, IEEE, 2020-02-22, pages 15-29, [retrieved January 5, Heisei 5], Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9065592>

Also Published As

Publication number Publication date
KR20220159257A (ko) 2022-12-02
US20220382829A1 (en) 2022-12-01
KR102601034B1 (ko) 2023-11-09
US12189710B2 (en) 2025-01-07
JP7793585B2 (ja) 2026-01-05
EP4095719A1 (en) 2022-11-30
US20250068694A1 (en) 2025-02-27
CN114329329B (zh) 2026-02-17
JP2024028901A (ja) 2024-03-05
KR20230155417A (ko) 2023-11-10
CN114329329A (zh) 2022-04-12
JP2022181161A (ja) 2022-12-07

Similar Documents

Publication Publication Date Title
JP7401513B2 (ja) Sparse matrix multiplication in hardware
Kung Systolic algorithms for the CMU WARP processor
CN109902804B (zh) Pooling operation method and apparatus
EP2017743B1 (en) High speed and efficient matrix multiplication hardware module
EP4579445A2 (en) Accessing data in multi-dimensional tensors using adders
WO2022037257A1 (zh) Convolution computing engine, artificial intelligence chip, and data processing method
US11983616B2 (en) Methods and apparatus for constructing digital circuits for performing matrix operations
Li et al. VBSF: a new storage format for SIMD sparse matrix–vector multiplication on modern processors
JP2022541721A (ja) Systems and methods for supporting alternate number formats for efficient multiplication
CN109074516A (zh) Calculation processing apparatus and calculation processing method
US20180373677A1 (en) Apparatus and Methods of Providing Efficient Data Parallelization for Multi-Dimensional FFTs
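TW202020654A (zh) Digital circuit with compressed carry
CN112446007A (zh) Matrix operation method, operation device, and processor
CN115485656A (zh) In-memory processing method for convolution operations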
EP4128064A1 (en) Power reduction for machine learning accelerator
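KR102372869B1 (ko) Matrix operator and matrix operation method for artificial neural networks
CN109598335B (zh) Two-dimensional convolution systolic array structure and implementation method
CN1020170C (zh) High-speed digital processor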
WO2023006170A1 (en) Devices and methods for providing computationally efficient neural networks
Wang et al. An FPGA-based reconfigurable CNN training accelerator using decomposable Winograd
US6718465B1 (en) Reconfigurable inner product processor architecture implementing square recursive decomposition of partial product matrices
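CN112766473A (zh) Computing device and related products
CN117786285A (zh) Data processing method, apparatus, device, and storage medium
HK40064573A (zh) Sparse matrix multiplication in hardware
CN117077734B (zh) Convolution input transformation method, hardware accelerator, and accelerator structure determination method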

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20220406

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20220406

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20230606

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20230816

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20231114

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20231207

R150 Certificate of patent or registration of utility model

Ref document number: 7401513

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150