PL3547114T3 - Accelerator for sparse-dense matrix multiplication - Google Patents

Accelerator for sparse-dense matrix multiplication

Info

Publication number
PL3547114T3
Authority
PL
Poland
Prior art keywords
sparse
accelerator
matrix multiplication
dense matrix
dense
Prior art date
Application number
PL19157044.9T
Other languages
English (en)
Inventor
Srinivasan Narayanamoorthy
Nadathur Rajagopalan Satish
Alexey Suprun
Kenneth J. Janik
Original Assignee
Intel Corporation
Priority date
Filing date
Publication date
Application filed by Intel Corporation
Publication of PL3547114T3

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G06F9/30014 Arithmetic instructions with variable precision
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032 Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30043 LOAD or STORE instructions; Clear instruction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/3016 Decoding the operand specifier, e.g. specifier format
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824 Operand accessing
    • G06F9/383 Operand prefetching
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3888 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Complex Calculations (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)
PL19157044.9T 2018-03-28 2019-02-13 Accelerator for sparse-dense matrix multiplication PL3547114T3 (pl)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US15/938,924 US10572568B2 (en) 2018-03-28 2018-03-28 Accelerator for sparse-dense matrix multiplication

Publications (1)

Publication Number Publication Date
PL3547114T3 (pl) 2025-06-09

Family

ID=65231633

Family Applications (2)

Application Number Title Priority Date Filing Date
PL20199012.4T PL3779681T3 (pl) 2018-03-28 2019-02-13 Accelerator for sparse-dense matrix multiplication
PL19157044.9T PL3547114T3 (pl) 2018-03-28 2019-02-13 Accelerator for sparse-dense matrix multiplication

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PL20199012.4T PL3779681T3 (pl) 2018-03-28 2019-02-13 Accelerator for sparse-dense matrix multiplication

Country Status (7)

Country Link
US (7) US10572568B2 (pl)
EP (4) EP4462250A3 (pl)
CN (4) CN112069459B (pl)
DK (1) DK3779681T3 (pl)
ES (2) ES3019657T3 (pl)
FI (1) FI3779681T3 (pl)
PL (2) PL3779681T3 (pl)

Families Citing this family (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018174936A1 (en) 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for tile matrix multiplication and accumulation
US10409614B2 (en) 2017-04-24 2019-09-10 Intel Corporation Instructions having support for floating point and integer data types in the same register
US10474458B2 (en) 2017-04-28 2019-11-12 Intel Corporation Instructions and logic to perform floating-point and integer operations for machine learning
US11275588B2 (en) 2017-07-01 2022-03-15 Intel Corporation Context save with variable save state size
WO2019090325A1 (en) 2017-11-06 2019-05-09 Neuralmagic, Inc. Methods and systems for improved transforms in convolutional neural networks
US11715287B2 (en) 2017-11-18 2023-08-01 Neuralmagic Inc. Systems and methods for exchange of data in distributed training of machine learning algorithms
US10691772B2 (en) * 2018-04-20 2020-06-23 Advanced Micro Devices, Inc. High-performance sparse triangular solve on graphics processing units
GB2574060B (en) * 2018-05-25 2022-11-23 Myrtle Software Ltd Processing matrix vector multiplication
US11449363B2 (en) 2018-05-31 2022-09-20 Neuralmagic Inc. Systems and methods for improved neural network execution
US10963787B2 (en) * 2018-05-31 2021-03-30 Neuralmagic Inc. Systems and methods for generation of sparse code for convolutional neural networks
US10832133B2 (en) 2018-05-31 2020-11-10 Neuralmagic Inc. System and method of executing neural networks
US11216732B2 (en) 2018-05-31 2022-01-04 Neuralmagic Inc. Systems and methods for generation of sparse code for convolutional neural networks
US10620951B2 (en) * 2018-06-22 2020-04-14 Intel Corporation Matrix multiplication acceleration of sparse matrices using column folding and squeezing
WO2020046859A1 (en) 2018-08-27 2020-03-05 Neuralmagic Inc. Systems and methods for neural network convolutional layer matrix multiplication using cache memory
US10719323B2 (en) 2018-09-27 2020-07-21 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US11636343B2 (en) 2018-10-01 2023-04-25 Neuralmagic Inc. Systems and methods for neural network pruning with accuracy preservation
US12008475B2 (en) * 2018-11-14 2024-06-11 Nvidia Corporation Transposed sparse matrix multiply by dense matrix for neural network training
US11663001B2 (en) * 2018-11-19 2023-05-30 Advanced Micro Devices, Inc. Family of lossy sparse load SIMD instructions
US20200210517A1 (en) 2018-12-27 2020-07-02 Intel Corporation Systems and methods to accelerate multiplication of sparse matrices
US11544559B2 (en) 2019-01-08 2023-01-03 Neuralmagic Inc. System and method for executing convolution in a neural network
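CN113424148A (zh) 2019-03-15 2021-09-21 英特尔公司 Multi-tile memory management for detecting cross-tile access, providing multi-tile inference scaling, and providing optimal page migration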
WO2020190796A1 (en) 2019-03-15 2020-09-24 Intel Corporation Systems and methods for cache optimization
CN112905241B (zh) 2019-03-15 2024-03-29 英特尔公司 Sparse optimizations for a matrix accelerator architecture
US11934342B2 (en) 2019-03-15 2024-03-19 Intel Corporation Assistance for hardware prefetch in cache access
US11392376B2 (en) 2019-04-11 2022-07-19 Arm Limited Processor for sparse matrix computation
US11127167B2 (en) * 2019-04-29 2021-09-21 Nvidia Corporation Efficient matrix format suitable for neural networks
US11379556B2 (en) * 2019-05-21 2022-07-05 Arm Limited Apparatus and method for matrix operations
US11403097B2 (en) 2019-06-26 2022-08-02 Intel Corporation Systems and methods to skip inconsequential matrix operations
US12353846B2 (en) * 2019-07-09 2025-07-08 MemryX Matrix data reuse techniques in multiply and accumulate units of processing system
US11195095B2 (en) 2019-08-08 2021-12-07 Neuralmagic Inc. System and method of accelerating execution of a neural network
WO2021040921A1 (en) * 2019-08-29 2021-03-04 Alibaba Group Holding Limited Systems and methods for providing vector-wise sparsity in a neural network
WO2021058578A1 (en) * 2019-09-25 2021-04-01 Deepmind Technologies Limited Fast sparse neural networks
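KR20210045224A (ko) * 2019-10-16 2021-04-26 삼성전자주식회사 Method and apparatus for processing data
CN110766136B (zh) * 2019-10-16 2022-09-09 北京航空航天大学 Compression method for sparse matrices and vectors
CN110889259B (zh) * 2019-11-06 2021-07-09 北京中科胜芯科技有限公司 Sparse matrix-vector multiplication computing unit for permuted block-diagonal weight matrices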
US11861761B2 (en) 2019-11-15 2024-01-02 Intel Corporation Graphics processing unit processing and caching improvements
US11537859B2 (en) * 2019-12-06 2022-12-27 International Business Machines Corporation Flexible precision neural inference processing unit
US11372644B2 (en) * 2019-12-09 2022-06-28 Meta Platforms, Inc. Matrix processing instruction with optional up/down sampling of matrix
CN113094099A (zh) * 2019-12-23 2021-07-09 超威半导体(上海)有限公司 Matrix data broadcast architecture
KR102788804B1 (ko) * 2019-12-27 2025-03-31 삼성전자주식회사 Electronic device and control method thereof
US11829439B2 (en) * 2019-12-30 2023-11-28 Qualcomm Incorporated Methods and apparatus to perform matrix multiplication in a streaming processor
CN111240743B (zh) * 2020-01-03 2022-06-03 格兰菲智能科技有限公司 Artificial intelligence integrated circuit
US11586601B2 (en) * 2020-02-05 2023-02-21 Alibaba Group Holding Limited Apparatus and method for representation of a sparse matrix in a neural network
US11226816B2 (en) 2020-02-12 2022-01-18 Samsung Electronics Co., Ltd. Systems and methods for data placement for in-memory-compute
US11281554B2 (en) 2020-03-17 2022-03-22 Samsung Electronics Co., Ltd. System and method for in-memory computation
DE102020131666A1 (de) 2020-05-05 2021-11-11 Intel Corporation Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs
US11204977B2 (en) * 2020-05-05 2021-12-21 Intel Corporation Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs
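CN115989505A (zh) 2020-07-21 2023-04-18 多伦多大学管理委员会 System and method for accelerating deep learning networks using sparsity
CN112199636B (zh) * 2020-10-15 2022-10-28 清华大学 Fast convolution method and apparatus suitable for microprocessors
JP7566931B2 (ja) * 2020-11-19 2024-10-15 グーグル エルエルシー Systolic array cells with output post-processing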
US11556757B1 (en) 2020-12-10 2023-01-17 Neuralmagic Ltd. System and method of executing deep tensor columns in neural networks
US12493779B2 (en) * 2020-12-15 2025-12-09 The George Washington University SGCNAX: a scalable graph convolutional neural network accelerator with workload balancing
US20220197595A1 (en) * 2020-12-21 2022-06-23 Intel Corporation Efficient multiply and accumulate instruction when an operand is equal to or near a power of two
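CN114692074A (zh) * 2020-12-25 2022-07-01 安徽寒武纪信息科技有限公司 Matrix multiplication circuit, method, and related products
CN119149890A (zh) * 2020-12-30 2024-12-17 华为技术有限公司 Matrix computing apparatus, method, system, circuit, chip, and device
CN115885479A (zh) * 2021-01-20 2023-03-31 辉达公司 Techniques for performing decoding of wireless communication signal data
CN112835552A (zh) * 2021-01-26 2021-05-25 算筹信息科技有限公司 Method for computing the inner product of a sparse matrix and a dense matrix by outer-product accumulation
CN112799635B (zh) * 2021-02-08 2022-11-15 算筹(深圳)信息科技有限公司 Novel method for computing the inner product of a dense matrix and a sparse matrix by outer-product accumulation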
US12141438B2 (en) * 2021-02-25 2024-11-12 Alibaba Group Holding Limited Zero skipping techniques for reducing data movement
US12400120B2 (en) * 2021-03-04 2025-08-26 Samsung Electronics Co., Ltd. Method and apparatus with neural network operation using sparsification
US20220318013A1 (en) * 2021-03-25 2022-10-06 Intel Corporation Supporting 8-bit floating point format operands in a computing architecture
CN115461759A (zh) * 2021-04-09 2022-12-09 辉达公司 Increasing the sparsity of data sets
TWI847030B (zh) 2021-05-05 2024-07-01 創鑫智慧股份有限公司 Matrix multiplier and operating method thereof
US20220366007A1 (en) * 2021-05-13 2022-11-17 Nvidia Corporation Performing matrix value indication
US12189710B2 (en) * 2021-05-25 2025-01-07 Google Llc Sparse matrix multiplication in hardware
CN113377534A (zh) * 2021-06-08 2021-09-10 东南大学 High-performance sparse matrix-vector multiplication method based on the CSR format
US20230008777A1 (en) * 2021-07-09 2023-01-12 Waymo Llc Accelerating convolutions for sparse inputs
WO2023003737A2 (en) * 2021-07-23 2023-01-26 Cryptography Research, Inc. Multi-lane cryptographic engine and operations thereof
US11443014B1 (en) * 2021-08-23 2022-09-13 SambaNova Systems, Inc. Sparse matrix multiplier in hardware and a reconfigurable data processor including same
US20230102279A1 (en) * 2021-09-25 2023-03-30 Intel Corporation Apparatuses, methods, and systems for instructions for structured-sparse tile matrix fma
CN113870918B (zh) * 2021-09-30 2023-03-28 华中科技大学 In-memory sparse matrix multiplication method, equation solving method, and solver
US11960982B1 (en) 2021-10-21 2024-04-16 Neuralmagic, Inc. System and method of determining and executing deep tensor columns in neural networks
US20230133305A1 (en) * 2021-10-28 2023-05-04 Kwai Inc. Methods and devices for accelerating a transformer with a sparse attention pattern
US11941248B2 (en) * 2021-12-13 2024-03-26 Xilinx, Inc. Compression of sparse tensors
CN119156618A (zh) * 2022-05-18 2024-12-17 谷歌有限责任公司 Exploiting data sparsity at a machine learning hardware accelerator
US12417100B2 (en) * 2022-08-03 2025-09-16 Intel Corporation Instructions for structured-sparse tile matrix FMA
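CN115310037A (zh) * 2022-08-17 2022-11-08 平头哥(杭州)半导体有限公司 Matrix multiplication computing unit, acceleration unit, computing system, and related method
CN115481364B (zh) * 2022-09-19 2025-06-10 浙江大学 Parallel computing method for GPU-accelerated large-scale elliptic-curve multi-scalar multiplication
CN115578243B (zh) * 2022-10-09 2024-01-05 北京中科通量科技有限公司 Expansion processing method for sparse matrices
TWI819937B (zh) * 2022-12-28 2023-10-21 國立成功大學 In-memory computing accelerator for neural networks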
WO2024243796A1 (en) * 2023-05-30 2024-12-05 Intel Corporation Methods and apparatus for matrix multiplication with reinforcement learning
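CN119337040A (zh) * 2023-07-21 2025-01-21 华为技术有限公司 Computing apparatus, method, device, chip, and system
CN116821576B (zh) * 2023-08-28 2023-12-26 英特尔(中国)研究中心有限公司 Method and apparatus for accelerating N:M sparse networks based on RISC-V
CN117931131B (zh) * 2024-03-22 2024-07-26 中国人民解放军国防科技大学 Method and system for implementing a sparse matrix multiplication instruction
CN119646369B (zh) * 2024-11-28 2025-11-25 西安交通大学 Method for accelerating arbitrary-precision sparse matrix multiply-add operations based on tensor cores
CN119808860B (zh) * 2025-03-17 2025-07-08 上海燧原科技股份有限公司 Optimization method, apparatus, device, medium, and program for a mixture-of-experts model
CN120336834B (zh) * 2025-03-20 2025-10-31 上海期智研究院 Acceleration method, apparatus, electronic device, and medium based on an unstructured sparse model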

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5673407A (en) * 1994-03-08 1997-09-30 Texas Instruments Incorporated Data processor having capability to perform both floating point operations and memory access in response to a single instruction
JP3790307B2 (ja) * 1996-10-16 2006-06-28 株式会社ルネサステクノロジ Data processor and data processing system
US7725521B2 (en) * 2001-10-29 2010-05-25 Intel Corporation Method and apparatus for computing matrix transformations
US8775495B2 (en) * 2006-02-13 2014-07-08 Indiana University Research And Technology Compression system and method for accelerating sparse matrix computations
US8577948B2 (en) 2010-09-20 2013-11-05 Intel Corporation Split path multiply accumulate unit
US20150277904A1 (en) * 2014-03-28 2015-10-01 Roger Espasa Method and apparatus for performing a plurality of multiplication operations
US10275247B2 (en) * 2015-03-28 2019-04-30 Intel Corporation Apparatuses and methods to accelerate vector multiplication of vector elements having matching indices
US20160378465A1 (en) * 2015-06-23 2016-12-29 Intel Corporation Efficient sparse array handling in a processor
US9558156B1 (en) * 2015-11-24 2017-01-31 International Business Machines Corporation Sparse matrix multiplication using a single field programmable gate array module
US20170337156A1 (en) * 2016-04-26 2017-11-23 Onnivation Llc Computing machine architecture for matrix and array processing
US10891538B2 (en) * 2016-08-11 2021-01-12 Nvidia Corporation Sparse convolutional neural network accelerator
CN107239823A (zh) * 2016-08-12 2017-10-10 北京深鉴科技有限公司 Apparatus and method for implementing a sparse neural network
US10360163B2 (en) * 2016-10-27 2019-07-23 Google Llc Exploiting input data sparsity in neural network compute units
US11003985B2 (en) * 2016-11-07 2021-05-11 Electronics And Telecommunications Research Institute Convolutional neural network system and operation method thereof
KR102499396B1 (ko) * 2017-03-03 2023-02-13 삼성전자 주식회사 Neural network device and operating method of the neural network device
US10331762B1 (en) * 2017-12-07 2019-06-25 International Business Machines Corporation Stream processing for LU decomposition
US20190278600A1 (en) * 2018-03-09 2019-09-12 Nvidia Corporation Tiled compressed sparse matrix format

Also Published As

Publication number Publication date
EP3779681A3 (en) 2021-02-24
EP4462250A2 (en) 2024-11-13
US10984074B2 (en) 2021-04-20
US11829440B2 (en) 2023-11-28
CN121030150A (zh) 2025-11-28
US10572568B2 (en) 2020-02-25
US20240070226A1 (en) 2024-02-29
EP4592827A2 (en) 2025-07-30
US20250272354A1 (en) 2025-08-28
US20190042542A1 (en) 2019-02-07
CN112069459A (zh) 2020-12-11
EP3779681B1 (en) 2024-04-10
EP3779681A2 (en) 2021-02-17
EP4592827A3 (en) 2025-10-15
CN119377541A (zh) 2025-01-28
CN112069459B (zh) 2024-06-04
CN110321525A (zh) 2019-10-11
ES2982493T3 (es) 2024-10-16
FI3779681T3 (fi) 2024-06-28
US20250272355A1 (en) 2025-08-28
DK3779681T3 (da) 2024-07-08
EP3547114B1 (en) 2025-01-15
PL3779681T3 (pl) 2024-07-29
ES3019657T3 (en) 2025-05-21
US20200334323A1 (en) 2020-10-22
EP3547114A1 (en) 2019-10-02
US20210342417A1 (en) 2021-11-04
US20200265107A1 (en) 2020-08-20
EP4462250A3 (en) 2025-03-05
US10867009B2 (en) 2020-12-15

Similar Documents

Publication Publication Date Title
PL3547114T3 (pl) Accelerator for sparse-dense matrix multiplication
SG11202000140QA (en) Operation accelerator
GB201710332D0 (en) Register-based matrix multiplication
GB2582094B (en) Matrix computation engine
EP3832499C0 (en) MATRIX CALCULATION DEVICE
EP3284092A4 (en) Crossbar arrays for calculating matrix multiplication
EP3262651A4 (en) Crossbar arrays for calculating matrix multiplication
GB201819748D0 (en) Afterburner system
GB2547235B (en) Haptic pedal
GB201818063D0 (en) An actuation system
SG11202102475RA (en) Reinforced film for biocontainers
GB201908691D0 (en) Workbench system
GB201908692D0 (en) Workbench system
GB201809704D0 (en) Hardware accelerator
GB201805845D0 (en) Hydraulic manifold
GB2585518B (en) Simulation system
SG11202007612WA (en) Simulation system
SG11202007045SA (en) Fuel system
GB201705539D0 (en) Powertrain components
PL3938409T3 (pl) Combination of accelerators
GB201909160D0 (en) Haptic system
GB201719928D0 (en) Accelerator
GB201910375D0 (en) Computer arrangement
GB201803002D0 (en) Gyrokinetic powertrain
GB201820494D0 (en) Quick weigh