CN114329329B - Sparse matrix multiplication in hardware - Google Patents

Sparse matrix multiplication in hardware

Info

Publication number
CN114329329B
Authority
CN
China
Prior art keywords
sparse
input
vector
matrix
values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111665133.7A
Other languages
English (en)
Chinese (zh)
Other versions
CN114329329A (zh)
Inventor
赖纳·阿尔文·波普
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of CN114329329A
Application granted
Publication of CN114329329B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8046 Systolic arrays
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F7/487 Multiplying; Dividing
    • G06F7/4876 Multiplying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/501 Half or full adders, i.e. basic adder cells for one denomination
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G06F7/78 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Nonlinear Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
CN202111665133.7A 2021-05-25 2021-12-30 Sparse matrix multiplication in hardware Active CN114329329B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/329,259 US12189710B2 (en) 2021-05-25 2021-05-25 Sparse matrix multiplication in hardware
US17/329,259 2021-05-25

Publications (2)

Publication Number Publication Date
CN114329329A (zh) 2022-04-12
CN114329329B (zh) 2026-02-17

Family

ID=80222142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111665133.7A Active CN114329329B (zh) 2021-05-25 2021-12-30 Sparse matrix multiplication in hardware

Country Status (5)

Country Link
US (2) US12189710B2
EP (1) EP4095719A1
JP (2) JP7401513B2
KR (2) KR102601034B1
CN (1) CN114329329B

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220012304A1 (en) * 2020-07-07 2022-01-13 Sudarshan Kumar Fast matrix multiplication
US11940907B2 (en) * 2021-06-25 2024-03-26 Intel Corporation Methods and apparatus for sparse tensor storage for neural network accelerators
US20220012012A1 (en) * 2021-09-24 2022-01-13 Martin Langhammer Systems and Methods for Sparsity Operations in a Specialized Processing Block
US20230267169A1 (en) * 2022-02-24 2023-08-24 Xilinx, Inc. Sparse matrix dense vector multliplication circuitry
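CN115470450B (zh) * 2022-08-30 2025-09-05 Wuxi Jiangnan Institute of Computing Technology Matrix multiplication device and low-overhead anomaly localization method thereof
WO2024108584A1 (zh) * 2022-11-25 2024-05-30 Huawei Technologies Co., Ltd. Sparse operator processing method and apparatus
KR20240081961A (ko) * 2022-12-01 2024-06-10 Samsung Electronics Co., Ltd. Electronic device for converting the compressed storage format of a sparse matrix and operating method thereof
KR102745798B1 (ko) * 2023-12-26 2024-12-23 Rebellions Inc. Data operation method and data operation device supporting the same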
US20260111173A1 (en) * 2024-10-17 2026-04-23 Edgecortix Inc. On-chip non-zero value unpacking and distribution
CN119602947B (zh) * 2024-11-14 2025-10-03 Xi'an Jiaotong University Binary polynomial multiplier for the cryptographic algorithm BIKE and encryption method

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9697176B2 (en) 2014-11-14 2017-07-04 Advanced Micro Devices, Inc. Efficient sparse matrix-vector multiplication on parallel processors
US9760538B2 (en) 2014-12-22 2017-09-12 Palo Alto Research Center Incorporated Computer-implemented system and method for efficient sparse matrix representation and processing
US10528321B2 (en) 2016-12-07 2020-01-07 Microsoft Technology Licensing, Llc Block floating point for neural network implementations
US10489063B2 (en) 2016-12-19 2019-11-26 Intel Corporation Memory-to-memory instructions to accelerate sparse-matrix by dense-vector and sparse-vector by dense-vector multiplication
US11216722B2 (en) 2016-12-31 2022-01-04 Intel Corporation Hardware accelerator template and design framework for implementing recurrent neural networks
US10180928B2 (en) 2016-12-31 2019-01-15 Intel Corporation Heterogeneous hardware accelerator architecture for processing sparse matrix data with skewed non-zero distributions
WO2018134740A2 (en) * 2017-01-22 2018-07-26 Gsi Technology Inc. Sparse matrix multiplication in associative memory device
US10572568B2 (en) * 2018-03-28 2020-02-25 Intel Corporation Accelerator for sparse-dense matrix multiplication
US10726096B2 (en) 2018-10-12 2020-07-28 Hewlett Packard Enterprise Development Lp Sparse matrix vector multiplication with a matrix vector multiplication unit
KR102838677B1 (ko) * 2019-03-15 2025-07-25 Intel Corporation Sparse optimizations for a matrix accelerator architecture
US11188618B2 (en) * 2019-09-05 2021-11-30 Intel Corporation Sparse matrix multiplication acceleration mechanism
US11663746B2 (en) * 2019-11-15 2023-05-30 Intel Corporation Systolic arithmetic on sparse data
US12141229B2 (en) * 2021-05-19 2024-11-12 Nvidia Corporation Techniques for accelerating matrix multiplication computations using hierarchical representations of sparse matrices

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HyGCN: A GCN Accelerator with Hybrid Architecture; YAN MINGYU et al.; 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA); 20200413; pp. 15-29 *

Also Published As

Publication number Publication date
KR20220159257A (ko) 2022-12-02
US20220382829A1 (en) 2022-12-01
KR102601034B1 (ko) 2023-11-09
US12189710B2 (en) 2025-01-07
JP7793585B2 (ja) 2026-01-05
EP4095719A1 (en) 2022-11-30
US20250068694A1 (en) 2025-02-27
JP2024028901A (ja) 2024-03-05
KR20230155417A (ko) 2023-11-10
CN114329329A (zh) 2022-04-12
JP7401513B2 (ja) 2023-12-19
JP2022181161A (ja) 2022-12-07

Similar Documents

Publication Publication Date Title
CN114329329B (zh) Sparse matrix multiplication in hardware
Lu et al. SpWA: An efficient sparse winograd convolutional neural networks accelerator on FPGAs
KR102511911B1 (ko) GEMM dataflow accelerator semiconductor circuit
US10726096B2 (en) Sparse matrix vector multiplication with a matrix vector multiplication unit
CN109657782B (zh) Operation method, device, and related products
EP3179415B1 (en) Systems and methods for a multi-core optimized recurrent neural network
US20250173398A1 (en) Matrix computing method and apparatus
CN108170639A (zh) Tensor CP decomposition implementation method based on a distributed environment
KR20190107766A (ko) Computing device and method
Hunter et al. Two sparsities are better than one: unlocking the performance benefits of sparse–sparse networks
Conte et al. GPU-acceleration of waveform relaxation methods for large differential systems
US20240086719A1 (en) Sparse encoding and decoding at mixture-of-experts layer
KR102869716B1 (ko) Power reduction for machine learning acceleration
CN112446007A (zh) Matrix operation method, operation device, and processor
Zhuzhunashvili et al. Preconditioned spectral clustering for stochastic block partition streaming graph challenge (preliminary version at arxiv.)
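CN118069315A (zh) Method and device for solving sparse triangular matrices based on a distributed platform
JP2010122850A (ja) Matrix equation calculation device and matrix equation calculation method
HK40064573A (zh) Sparse matrix multiplication in hardware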
US20240160906A1 (en) Collective communication phases at mixture-of-experts layer
Wang et al. A parallel sparse approximate inverse preconditioning algorithm based on MPI and CUDA
CN115576895B (zh) 计算装置、计算方法及计算机可读存储介质
Venieris et al. Towards heterogeneous solvers for large-scale linear systems
Al Na'mneh et al. An efficient bit reversal permutation algorithm
Sergiyenko et al. Genetic Programming of Discrete Cosine Transform Processors
Tomii et al. A Hardware Solver for Simultaneous Linear Equations with Multistage Interconnection Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40064573; Country of ref document: HK)
GR01 Patent grant