KR102601034B1 - Sparse matrix multiplication in hardware - Google Patents

Sparse matrix multiplication in hardware

Info

Publication number
KR102601034B1
Authority
KR
South Korea
Prior art keywords
shard
sparse
input
vector
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR1020220016772A
Other languages
English (en)
Korean (ko)
Other versions
KR20220159257A (ko)
Inventor
Reiner Alwin Pope
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of KR20220159257A
Priority to KR1020230151637A
Application granted
Publication of KR102601034B1
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8046 Systolic arrays
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F7/487 Multiplying; Dividing
    • G06F7/4876 Multiplying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/501 Half or full adders, i.e. basic adder cells for one denomination
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G06F7/78 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Nonlinear Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
KR1020220016772A 2021-05-25 2022-02-09 Sparse matrix multiplication in hardware Active KR102601034B1 (ko)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020230151637A KR20230155417A (ko) 2021-05-25 2023-11-06 Sparse matrix multiplication in hardware

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/329,259 US12189710B2 (en) 2021-05-25 2021-05-25 Sparse matrix multiplication in hardware
US17/329,259 2021-05-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
KR1020230151637A Division KR20230155417A (ko) Sparse matrix multiplication in hardware

Publications (2)

Publication Number Publication Date
KR20220159257A KR20220159257A (ko) 2022-12-02
KR102601034B1 KR102601034B1 (ko) 2023-11-09

Family

ID=80222142

Family Applications (2)

Application Number Title Priority Date Filing Date
KR1020220016772A Active KR102601034B1 (ko) 2021-05-25 2022-02-09 Sparse matrix multiplication in hardware
KR1020230151637A Pending KR20230155417A (ko) 2021-05-25 2023-11-06 Sparse matrix multiplication in hardware

Family Applications After (1)

Application Number Title Priority Date Filing Date
KR1020230151637A Pending KR20230155417A (ko) 2021-05-25 2023-11-06 Sparse matrix multiplication in hardware

Country Status (5)

Country Link
US (2) US12189710B2
EP (1) EP4095719A1
JP (2) JP7401513B2
KR (2) KR102601034B1
CN (1) CN114329329B

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220012304A1 (en) * 2020-07-07 2022-01-13 Sudarshan Kumar Fast matrix multiplication
US11940907B2 (en) * 2021-06-25 2024-03-26 Intel Corporation Methods and apparatus for sparse tensor storage for neural network accelerators
US20220012012A1 (en) * 2021-09-24 2022-01-13 Martin Langhammer Systems and Methods for Sparsity Operations in a Specialized Processing Block
US20230267169A1 (en) * 2022-02-24 2023-08-24 Xilinx, Inc. Sparse matrix dense vector multliplication circuitry
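CN115470450B (zh) * 2022-08-30 2025-09-05 无锡江南计算技术研究所 Matrix multiplication operation device and low-overhead anomaly localization method thereof
WO2024108584A1 (zh) * 2022-11-25 2024-05-30 华为技术有限公司 Sparse operator processing method and apparatus
KR20240081961A (ko) * 2022-12-01 2024-06-10 삼성전자주식회사 Electronic device for converting the compressed storage format of a sparse matrix and operating method thereof
KR102745798B1 (ko) * 2023-12-26 2024-12-23 리벨리온 주식회사 Data computation method and data computation device supporting the same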
US20260111173A1 (en) * 2024-10-17 2026-04-23 Edgecortix Inc. On-chip non-zero value unpacking and distribution
CN119602947B (zh) * 2024-11-14 2025-10-03 西安交通大学 Binary polynomial multiplier for the BIKE cryptographic algorithm and encryption method

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9697176B2 (en) 2014-11-14 2017-07-04 Advanced Micro Devices, Inc. Efficient sparse matrix-vector multiplication on parallel processors
US9760538B2 (en) 2014-12-22 2017-09-12 Palo Alto Research Center Incorporated Computer-implemented system and method for efficient sparse matrix representation and processing
US10528321B2 (en) 2016-12-07 2020-01-07 Microsoft Technology Licensing, Llc Block floating point for neural network implementations
US10489063B2 (en) 2016-12-19 2019-11-26 Intel Corporation Memory-to-memory instructions to accelerate sparse-matrix by dense-vector and sparse-vector by dense-vector multiplication
US11216722B2 (en) 2016-12-31 2022-01-04 Intel Corporation Hardware accelerator template and design framework for implementing recurrent neural networks
US10180928B2 (en) 2016-12-31 2019-01-15 Intel Corporation Heterogeneous hardware accelerator architecture for processing sparse matrix data with skewed non-zero distributions
WO2018134740A2 (en) * 2017-01-22 2018-07-26 Gsi Technology Inc. Sparse matrix multiplication in associative memory device
US10572568B2 (en) * 2018-03-28 2020-02-25 Intel Corporation Accelerator for sparse-dense matrix multiplication
US10726096B2 (en) 2018-10-12 2020-07-28 Hewlett Packard Enterprise Development Lp Sparse matrix vector multiplication with a matrix vector multiplication unit
KR102838677B1 (ko) * 2019-03-15 2025-07-25 Intel Corporation Sparse optimizations for a matrix accelerator architecture
US11188618B2 (en) * 2019-09-05 2021-11-30 Intel Corporation Sparse matrix multiplication acceleration mechanism
US11663746B2 (en) * 2019-11-15 2023-05-30 Intel Corporation Systolic arithmetic on sparse data
US12141229B2 (en) * 2021-05-19 2024-11-12 Nvidia Corporation Techniques for accelerating matrix multiplication computations using hierarchical representations of sparse matrices

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HE XIN et al., "Sparse-TPU: adapting systolic arrays for sparse matrices", PROCEEDINGS OF THE 34TH ACM INT. CONF. ON SUPERCOMPUTING, ACMPUB27, (2020.6.29.)
QIN ERIC et al., "SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training", 2020 IEEE INT. SYM. ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), IEEE, (2020.2.22.)
YAN MINGYU et al., "HyGCN: A GCN Accelerator with Hybrid Architecture", 2020 IEEE INT. SYM. ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), IEEE, (2020.2.22.)

Also Published As

Publication number Publication date
KR20220159257A (ko) 2022-12-02
US20220382829A1 (en) 2022-12-01
US12189710B2 (en) 2025-01-07
JP7793585B2 (ja) 2026-01-05
EP4095719A1 (en) 2022-11-30
US20250068694A1 (en) 2025-02-27
CN114329329B (zh) 2026-02-17
JP2024028901A (ja) 2024-03-05
KR20230155417A (ko) 2023-11-10
CN114329329A (zh) 2022-04-12
JP7401513B2 (ja) 2023-12-19
JP2022181161A (ja) 2022-12-07

Similar Documents

Publication Publication Date Title
KR102601034B1 (ko) Sparse matrix multiplication in hardware
Kung Systolic algorithms for the CMU WARP processor
KR102859456B1 (ko) Neural network accelerator
KR102511911B1 (ko) GEMM dataflow accelerator semiconductor circuit
Lu et al. SpWA: An efficient sparse winograd convolutional neural networks accelerator on FPGAs
Peterka et al. A configurable algorithm for parallel image-compositing applications
AU2008202591B2 (en) High speed and efficient matrix multiplication hardware module
JP6715900B2 (ja) Method and apparatus for adapting parameters of a neural network
EP3651077B1 (en) Computation device and method
JP2022540550A (ja) Systems and methods for reading and writing sparse data in a neural network accelerator
US11983616B2 (en) Methods and apparatus for constructing digital circuits for performing matrix operations
EP3789892B1 (en) Method and apparatus for processing data
Hunter et al. Two sparsities are better than one: unlocking the performance benefits of sparse–sparse networks
CN108170639A (zh) Method for implementing tensor CP decomposition in a distributed environment
US20240086719A1 (en) Sparse encoding and decoding at mixture-of-experts layer
TW202020654A (zh) Digital circuit with compressed carry
US20240169463A1 (en) Mixture-of-experts layer with dynamic gating
Arredondo-Velazquez et al. A streaming architecture for Convolutional Neural Networks based on layer operations chaining
CN1020170C (zh) High-speed digital processor
WO2023006170A1 (en) Devices and methods for providing computationally efficient neural networks
US20240160906A1 (en) Collective communication phases at mixture-of-experts layer
HK40064573A (zh) Sparse matrix multiplication in hardware
CN115576895B (zh) 计算装置、计算方法及计算机可读存储介质
Gerdt et al. Some algorithms for calculating unitary matrices for quantum circuits
Tomii et al. A Hardware Solver for Simultaneous Linear Equations with Multistage Interconnection Network

Legal Events

Date Code Title Description
PA0109 Patent application

Patent event code: PA01091R01D

Comment text: Patent Application

Patent event date: 20220209

PA0201 Request for examination

Patent event code: PA02012R01D

Patent event date: 20220210

Comment text: Request for Examination of Application

Patent event code: PA02011R01I

Patent event date: 20220209

Comment text: Patent Application

PG1501 Laying open of application
E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

Comment text: Notification of reason for refusal

Patent event date: 20230626

Patent event code: PE09021S01D

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

Patent event code: PE07011S01D

Comment text: Decision to Grant Registration

Patent event date: 20230829

A107 Divisional application of patent
PA0107 Divisional application

Comment text: Divisional Application of Patent

Patent event date: 20231106

Patent event code: PA01071R01D

GRNT Written decision to grant
PR0701 Registration of establishment

Comment text: Registration of Establishment

Patent event date: 20231107

Patent event code: PR07011E01D

PR1002 Payment of registration fee

Payment date: 20231107

End annual number: 3

Start annual number: 1

PG1601 Publication of registration