PL3547114T3 - Accelerator for sparse-dense matrix multiplication - Google Patents
Accelerator for sparse-dense matrix multiplication
Info
- Publication number
- PL3547114T3
- Authority
- PL
- Poland
- Prior art keywords
- sparse
- accelerator
- matrix multiplication
- dense matrix
- dense
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30032—Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
- G06F9/30038—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G06F9/30043—LOAD or STORE instructions; Clear instruction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/3016—Decoding the operand specifier, e.g. specifier format
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Complex Calculations (AREA)
- Advance Control (AREA)
- Executing Machine-Instructions (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/938,924 US10572568B2 (en) | 2018-03-28 | 2018-03-28 | Accelerator for sparse-dense matrix multiplication |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| PL3547114T3 (pl) | 2025-06-09 |
Family
ID=65231633
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PL20199012.4T PL3779681T3 (pl) | 2018-03-28 | 2019-02-13 | Accelerator for sparse-dense matrix multiplication |
| PL19157044.9T PL3547114T3 (pl) | 2018-03-28 | 2019-02-13 | Accelerator for sparse-dense matrix multiplication |
Family Applications Before (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PL20199012.4T PL3779681T3 (pl) | 2018-03-28 | 2019-02-13 | Accelerator for sparse-dense matrix multiplication |
Country Status (7)
| Country | Link |
|---|---|
| US (7) | US10572568B2 (pl) |
| EP (4) | EP4462250A3 (pl) |
| CN (4) | CN112069459B (pl) |
| DK (1) | DK3779681T3 (pl) |
| ES (2) | ES3019657T3 (pl) |
| FI (1) | FI3779681T3 (pl) |
| PL (2) | PL3779681T3 (pl) |
Families Citing this family (87)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018174936A1 (en) | 2017-03-20 | 2018-09-27 | Intel Corporation | Systems, methods, and apparatuses for tile matrix multiplication and accumulation |
| US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
| US10474458B2 (en) | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
| US11275588B2 (en) | 2017-07-01 | 2022-03-15 | Intel Corporation | Context save with variable save state size |
| WO2019090325A1 (en) | 2017-11-06 | 2019-05-09 | Neuralmagic, Inc. | Methods and systems for improved transforms in convolutional neural networks |
| US11715287B2 (en) | 2017-11-18 | 2023-08-01 | Neuralmagic Inc. | Systems and methods for exchange of data in distributed training of machine learning algorithms |
| US10691772B2 (en) * | 2018-04-20 | 2020-06-23 | Advanced Micro Devices, Inc. | High-performance sparse triangular solve on graphics processing units |
| GB2574060B (en) * | 2018-05-25 | 2022-11-23 | Myrtle Software Ltd | Processing matrix vector multiplication |
| US11449363B2 (en) | 2018-05-31 | 2022-09-20 | Neuralmagic Inc. | Systems and methods for improved neural network execution |
| US10963787B2 (en) * | 2018-05-31 | 2021-03-30 | Neuralmagic Inc. | Systems and methods for generation of sparse code for convolutional neural networks |
| US10832133B2 (en) | 2018-05-31 | 2020-11-10 | Neuralmagic Inc. | System and method of executing neural networks |
| US11216732B2 (en) | 2018-05-31 | 2022-01-04 | Neuralmagic Inc. | Systems and methods for generation of sparse code for convolutional neural networks |
| US10620951B2 (en) * | 2018-06-22 | 2020-04-14 | Intel Corporation | Matrix multiplication acceleration of sparse matrices using column folding and squeezing |
| WO2020046859A1 (en) | 2018-08-27 | 2020-03-05 | Neuralmagic Inc. | Systems and methods for neural network convolutional layer matrix multiplication using cache memory |
| US10719323B2 (en) | 2018-09-27 | 2020-07-21 | Intel Corporation | Systems and methods for performing matrix compress and decompress instructions |
| US11636343B2 (en) | 2018-10-01 | 2023-04-25 | Neuralmagic Inc. | Systems and methods for neural network pruning with accuracy preservation |
| US12008475B2 (en) * | 2018-11-14 | 2024-06-11 | Nvidia Corporation | Transposed sparse matrix multiply by dense matrix for neural network training |
| US11663001B2 (en) * | 2018-11-19 | 2023-05-30 | Advanced Micro Devices, Inc. | Family of lossy sparse load SIMD instructions |
| US20200210517A1 (en) | 2018-12-27 | 2020-07-02 | Intel Corporation | Systems and methods to accelerate multiplication of sparse matrices |
| US11544559B2 (en) | 2019-01-08 | 2023-01-03 | Neuralmagic Inc. | System and method for executing convolution in a neural network |
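| CN113424148A (zh) | 2019-03-15 | 2021-09-21 | 英特尔公司 | Multi-tile memory management for detecting cross-tile access, providing multi-tile inference scaling, and providing optimal page migration |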
| WO2020190796A1 (en) | 2019-03-15 | 2020-09-24 | Intel Corporation | Systems and methods for cache optimization |
| CN112905241B (zh) | 2019-03-15 | 2024-03-29 | 英特尔公司 | Sparse optimizations for a matrix accelerator architecture |
| US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access |
| US11392376B2 (en) | 2019-04-11 | 2022-07-19 | Arm Limited | Processor for sparse matrix computation |
| US11127167B2 (en) * | 2019-04-29 | 2021-09-21 | Nvidia Corporation | Efficient matrix format suitable for neural networks |
| US11379556B2 (en) * | 2019-05-21 | 2022-07-05 | Arm Limited | Apparatus and method for matrix operations |
| US11403097B2 (en) | 2019-06-26 | 2022-08-02 | Intel Corporation | Systems and methods to skip inconsequential matrix operations |
| US12353846B2 (en) * | 2019-07-09 | 2025-07-08 | MemryX | Matrix data reuse techniques in multiply and accumulate units of processing system |
| US11195095B2 (en) | 2019-08-08 | 2021-12-07 | Neuralmagic Inc. | System and method of accelerating execution of a neural network |
| WO2021040921A1 (en) * | 2019-08-29 | 2021-03-04 | Alibaba Group Holding Limited | Systems and methods for providing vector-wise sparsity in a neural network |
| WO2021058578A1 (en) * | 2019-09-25 | 2021-04-01 | Deepmind Technologies Limited | Fast sparse neural networks |
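| KR20210045224A (ko) * | 2019-10-16 | 2021-04-26 | 삼성전자주식회사 | Method and apparatus for processing data |
| CN110766136B (zh) * | 2019-10-16 | 2022-09-09 | 北京航空航天大学 | Compression method for sparse matrices and vectors |
| CN110889259B (zh) * | 2019-11-06 | 2021-07-09 | 北京中科胜芯科技有限公司 | Sparse matrix-vector multiplication computing unit for permuted block-diagonal weight matrices |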
| US11861761B2 (en) | 2019-11-15 | 2024-01-02 | Intel Corporation | Graphics processing unit processing and caching improvements |
| US11537859B2 (en) * | 2019-12-06 | 2022-12-27 | International Business Machines Corporation | Flexible precision neural inference processing unit |
| US11372644B2 (en) * | 2019-12-09 | 2022-06-28 | Meta Platforms, Inc. | Matrix processing instruction with optional up/down sampling of matrix |
| CN113094099A (zh) * | 2019-12-23 | 2021-07-09 | 超威半导体(上海)有限公司 | Matrix data broadcast architecture |
| KR102788804B1 (ko) * | 2019-12-27 | 2025-03-31 | 삼성전자주식회사 | Electronic apparatus and control method thereof |
| US11829439B2 (en) * | 2019-12-30 | 2023-11-28 | Qualcomm Incorporated | Methods and apparatus to perform matrix multiplication in a streaming processor |
| CN111240743B (zh) * | 2020-01-03 | 2022-06-03 | 格兰菲智能科技有限公司 | Artificial intelligence integrated circuit |
| US11586601B2 (en) * | 2020-02-05 | 2023-02-21 | Alibaba Group Holding Limited | Apparatus and method for representation of a sparse matrix in a neural network |
| US11226816B2 (en) | 2020-02-12 | 2022-01-18 | Samsung Electronics Co., Ltd. | Systems and methods for data placement for in-memory-compute |
| US11281554B2 (en) | 2020-03-17 | 2022-03-22 | Samsung Electronics Co., Ltd. | System and method for in-memory computation |
| DE102020131666A1 (de) | 2020-05-05 | 2021-11-11 | Intel Corporation | Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs |
| US11204977B2 (en) * | 2020-05-05 | 2021-12-21 | Intel Corporation | Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs |
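| CN115989505A (zh) | 2020-07-21 | 2023-04-18 | 多伦多大学管理委员会 | System and method for accelerating deep learning networks using sparsity |
| CN112199636B (zh) * | 2020-10-15 | 2022-10-28 | 清华大学 | Fast convolution method and apparatus suitable for microprocessors |
| JP7566931B2 (ja) * | 2020-11-19 | 2024-10-15 | グーグル エルエルシー | Systolic array cells with output post-processing |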
| US11556757B1 (en) | 2020-12-10 | 2023-01-17 | Neuralmagic Ltd. | System and method of executing deep tensor columns in neural networks |
| US12493779B2 (en) * | 2020-12-15 | 2025-12-09 | The George Washington University | SGCNAX: a scalable graph convolutional neural network accelerator with workload balancing |
| US20220197595A1 (en) * | 2020-12-21 | 2022-06-23 | Intel Corporation | Efficient multiply and accumulate instruction when an operand is equal to or near a power of two |
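| CN114692074A (zh) * | 2020-12-25 | 2022-07-01 | 安徽寒武纪信息科技有限公司 | Matrix multiplication circuit, method, and related products |
| CN119149890A (zh) * | 2020-12-30 | 2024-12-17 | 华为技术有限公司 | Matrix computing apparatus, method, system, circuit, chip, and device |
| CN115885479A (zh) * | 2021-01-20 | 2023-03-31 | 辉达公司 | Techniques for performing decoding of wireless communication signal data |
| CN112835552A (zh) * | 2021-01-26 | 2021-05-25 | 算筹信息科技有限公司 | Method for computing the inner product of a sparse matrix and a dense matrix by outer-product accumulation |
| CN112799635B (zh) * | 2021-02-08 | 2022-11-15 | 算筹(深圳)信息科技有限公司 | Novel method for computing the inner product of a dense matrix and a sparse matrix by outer-product accumulation |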
| US12141438B2 (en) * | 2021-02-25 | 2024-11-12 | Alibaba Group Holding Limited | Zero skipping techniques for reducing data movement |
| US12400120B2 (en) * | 2021-03-04 | 2025-08-26 | Samsung Electronics Co., Ltd. | Method and apparatus with neural network operation using sparsification |
| US20220318013A1 (en) * | 2021-03-25 | 2022-10-06 | Intel Corporation | Supporting 8-bit floating point format operands in a computing architecture |
| CN115461759A (zh) * | 2021-04-09 | 2022-12-09 | 辉达公司 | Increasing the sparsity of data sets |
| TWI847030B (zh) | 2021-05-05 | 2024-07-01 | 創鑫智慧股份有限公司 | Matrix multiplier and operation method thereof |
| US20220366007A1 (en) * | 2021-05-13 | 2022-11-17 | Nvidia Corporation | Performing matrix value indication |
| US12189710B2 (en) * | 2021-05-25 | 2025-01-07 | Google Llc | Sparse matrix multiplication in hardware |
| CN113377534A (zh) * | 2021-06-08 | 2021-09-10 | 东南大学 | High-performance sparse matrix-vector multiplication method based on the CSR format |
| US20230008777A1 (en) * | 2021-07-09 | 2023-01-12 | Waymo Llc | Accelerating convolutions for sparse inputs |
| WO2023003737A2 (en) * | 2021-07-23 | 2023-01-26 | Cryptography Research, Inc. | Multi-lane cryptographic engine and operations thereof |
| US11443014B1 (en) * | 2021-08-23 | 2022-09-13 | SambaNova Systems, Inc. | Sparse matrix multiplier in hardware and a reconfigurable data processor including same |
| US20230102279A1 (en) * | 2021-09-25 | 2023-03-30 | Intel Corporation | Apparatuses, methods, and systems for instructions for structured-sparse tile matrix fma |
| CN113870918B (zh) * | 2021-09-30 | 2023-03-28 | 华中科技大学 | In-memory sparse matrix multiplication method, equation solving method, and solver |
| US11960982B1 (en) | 2021-10-21 | 2024-04-16 | Neuralmagic, Inc. | System and method of determining and executing deep tensor columns in neural networks |
| US20230133305A1 (en) * | 2021-10-28 | 2023-05-04 | Kwai Inc. | Methods and devices for accelerating a transformer with a sparse attention pattern |
| US11941248B2 (en) * | 2021-12-13 | 2024-03-26 | Xilinx, Inc. | Compression of sparse tensors |
| CN119156618A (zh) * | 2022-05-18 | 2024-12-17 | 谷歌有限责任公司 | Exploiting data sparsity at a machine learning hardware accelerator |
| US12417100B2 (en) * | 2022-08-03 | 2025-09-16 | Intel Corporation | Instructions for structured-sparse tile matrix FMA |
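| CN115310037A (zh) * | 2022-08-17 | 2022-11-08 | 平头哥(杭州)半导体有限公司 | Matrix multiplication computing unit, acceleration unit, computing system, and related method |
| CN115481364B (zh) * | 2022-09-19 | 2025-06-10 | 浙江大学 | GPU-accelerated parallel computing method for large-scale elliptic-curve multi-scalar multiplication |
| CN115578243B (zh) * | 2022-10-09 | 2024-01-05 | 北京中科通量科技有限公司 | Expansion processing method for sparse matrices |
| TWI819937B (zh) * | 2022-12-28 | 2023-10-21 | 國立成功大學 | Accelerator for in-memory computing applied to neural networks |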
| WO2024243796A1 (en) * | 2023-05-30 | 2024-12-05 | Intel Corporation | Methods and apparatus for matrix multiplication with reinforcement learning |
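| CN119337040A (zh) * | 2023-07-21 | 2025-01-21 | 华为技术有限公司 | Computing apparatus, method, device, chip, and system |
| CN116821576B (zh) * | 2023-08-28 | 2023-12-26 | 英特尔(中国)研究中心有限公司 | Method and apparatus for accelerating N:M sparse networks based on RISC-V |
| CN117931131B (zh) * | 2024-03-22 | 2024-07-26 | 中国人民解放军国防科技大学 | Method and system for implementing sparse matrix multiplication instructions |
| CN119646369B (zh) * | 2024-11-28 | 2025-11-25 | 西安交通大学 | Method for accelerating arbitrary-precision sparse matrix multiply-accumulate operations based on tensor cores |
| CN119808860B (zh) * | 2025-03-17 | 2025-07-08 | 上海燧原科技股份有限公司 | Optimization method, apparatus, device, medium, and program for mixture-of-experts models |
| CN120336834B (zh) * | 2025-03-20 | 2025-10-31 | 上海期智研究院 | Acceleration method, apparatus, electronic device, and medium based on unstructured sparse models |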
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5673407A (en) * | 1994-03-08 | 1997-09-30 | Texas Instruments Incorporated | Data processor having capability to perform both floating point operations and memory access in response to a single instruction |
| JP3790307B2 (ja) * | 1996-10-16 | 2006-06-28 | 株式会社ルネサステクノロジ | Data processor and data processing system |
| US7725521B2 (en) * | 2001-10-29 | 2010-05-25 | Intel Corporation | Method and apparatus for computing matrix transformations |
| US8775495B2 (en) * | 2006-02-13 | 2014-07-08 | Indiana University Research And Technology | Compression system and method for accelerating sparse matrix computations |
| US8577948B2 (en) | 2010-09-20 | 2013-11-05 | Intel Corporation | Split path multiply accumulate unit |
| US20150277904A1 (en) * | 2014-03-28 | 2015-10-01 | Roger Espasa | Method and apparatus for performing a plurality of multiplication operations |
| US10275247B2 (en) * | 2015-03-28 | 2019-04-30 | Intel Corporation | Apparatuses and methods to accelerate vector multiplication of vector elements having matching indices |
| US20160378465A1 (en) * | 2015-06-23 | 2016-12-29 | Intel Corporation | Efficient sparse array handling in a processor |
| US9558156B1 (en) * | 2015-11-24 | 2017-01-31 | International Business Machines Corporation | Sparse matrix multiplication using a single field programmable gate array module |
| US20170337156A1 (en) * | 2016-04-26 | 2017-11-23 | Onnivation Llc | Computing machine architecture for matrix and array processing |
| US10891538B2 (en) * | 2016-08-11 | 2021-01-12 | Nvidia Corporation | Sparse convolutional neural network accelerator |
| CN107239823A (zh) * | 2016-08-12 | 2017-10-10 | 北京深鉴科技有限公司 | Apparatus and method for implementing a sparse neural network |
| US10360163B2 (en) * | 2016-10-27 | 2019-07-23 | Google Llc | Exploiting input data sparsity in neural network compute units |
| US11003985B2 (en) * | 2016-11-07 | 2021-05-11 | Electronics And Telecommunications Research Institute | Convolutional neural network system and operation method thereof |
| KR102499396B1 (ko) * | 2017-03-03 | 2023-02-13 | 삼성전자 주식회사 | Neural network device and operating method of neural network device |
| US10331762B1 (en) * | 2017-12-07 | 2019-06-25 | International Business Machines Corporation | Stream processing for LU decomposition |
| US20190278600A1 (en) * | 2018-03-09 | 2019-09-12 | Nvidia Corporation | Tiled compressed sparse matrix format |
-
2018
- 2018-03-28 US US15/938,924 patent/US10572568B2/en active Active
-
2019
- 2019-02-13 EP EP24203575.6A patent/EP4462250A3/en active Pending
- 2019-02-13 ES ES19157044T patent/ES3019657T3/es active Active
- 2019-02-13 EP EP25183176.4A patent/EP4592827A3/en active Pending
- 2019-02-13 PL PL20199012.4T patent/PL3779681T3/pl unknown
- 2019-02-13 DK DK20199012.4T patent/DK3779681T3/da active
- 2019-02-13 ES ES20199012T patent/ES2982493T3/es active Active
- 2019-02-13 PL PL19157044.9T patent/PL3547114T3/pl unknown
- 2019-02-13 EP EP20199012.4A patent/EP3779681B1/en active Active
- 2019-02-13 EP EP19157044.9A patent/EP3547114B1/en active Active
- 2019-02-13 FI FIEP20199012.4T patent/FI3779681T3/fi active
- 2019-03-25 CN CN202010951887.8A patent/CN112069459B/zh active Active
- 2019-03-25 CN CN202411416040.4A patent/CN119377541A/zh active Pending
- 2019-03-25 CN CN201910227563.7A patent/CN110321525A/zh active Pending
- 2019-03-25 CN CN202511151775.3A patent/CN121030150A/zh active Pending
-
2020
- 2020-02-24 US US16/799,586 patent/US10984074B2/en active Active
- 2020-07-06 US US16/921,823 patent/US10867009B2/en active Active
-
2021
- 2021-04-13 US US17/229,550 patent/US11829440B2/en active Active
-
2023
- 2023-11-09 US US18/388,507 patent/US20240070226A1/en active Pending
-
2025
- 2025-05-14 US US19/207,914 patent/US20250272354A1/en active Pending
- 2025-05-14 US US19/207,972 patent/US20250272355A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| EP3779681A3 (en) | 2021-02-24 |
| EP4462250A2 (en) | 2024-11-13 |
| US10984074B2 (en) | 2021-04-20 |
| US11829440B2 (en) | 2023-11-28 |
| CN121030150A (zh) | 2025-11-28 |
| US10572568B2 (en) | 2020-02-25 |
| US20240070226A1 (en) | 2024-02-29 |
| EP4592827A2 (en) | 2025-07-30 |
| US20250272354A1 (en) | 2025-08-28 |
| US20190042542A1 (en) | 2019-02-07 |
| CN112069459A (zh) | 2020-12-11 |
| EP3779681B1 (en) | 2024-04-10 |
| EP3779681A2 (en) | 2021-02-17 |
| EP4592827A3 (en) | 2025-10-15 |
| CN119377541A (zh) | 2025-01-28 |
| CN112069459B (zh) | 2024-06-04 |
| CN110321525A (zh) | 2019-10-11 |
| ES2982493T3 (es) | 2024-10-16 |
| FI3779681T3 (fi) | 2024-06-28 |
| US20250272355A1 (en) | 2025-08-28 |
| DK3779681T3 (da) | 2024-07-08 |
| EP3547114B1 (en) | 2025-01-15 |
| PL3779681T3 (pl) | 2024-07-29 |
| ES3019657T3 (en) | 2025-05-21 |
| US20200334323A1 (en) | 2020-10-22 |
| EP3547114A1 (en) | 2019-10-02 |
| US20210342417A1 (en) | 2021-11-04 |
| US20200265107A1 (en) | 2020-08-20 |
| EP4462250A3 (en) | 2025-03-05 |
| US10867009B2 (en) | 2020-12-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| PL3547114T3 (pl) | | Accelerator for sparse-dense matrix multiplication |
| SG11202000140QA (en) | | Operation accelerator |
| GB201710332D0 (en) | | Register-based matrix multiplication |
| GB2582094B (en) | | Matrix computation engine |
| EP3832499C0 (en) | | MATRIX CALCULATION DEVICE |
| EP3284092A4 (en) | | Crossbar arrays for calculating matrix multiplication |
| EP3262651A4 (en) | | Crossbar arrays for calculating matrix multiplication |
| GB201819748D0 (en) | | Afterburner system |
| GB2547235B (en) | | Haptic pedal |
| GB201818063D0 (en) | | An actuation system |
| SG11202102475RA (en) | | Reinforced film for biocontainers |
| GB201908691D0 (en) | | Workbench system |
| GB201908692D0 (en) | | Workbench system |
| GB201809704D0 (en) | | Hardware accelerator |
| GB201805845D0 (en) | | Hydraulic manifold |
| GB2585518B (en) | | Simulation system |
| SG11202007612WA (en) | | Simulation system |
| SG11202007045SA (en) | | Fuel system |
| GB201705539D0 (en) | | Powertrain components |
| PL3938409T3 (pl) | | Accelerator combination |
| GB201909160D0 (en) | | Haptic system |
| GB201719928D0 (en) | | Accelerator |
| GB201910375D0 (en) | | Computer arrangement |
| GB201803002D0 (en) | | Gyrokinetic powertrain |
| GB201820494D0 (en) | | Quick weigh |