CN119278445A - Matrix multiplication performed using convolution engine which includes array of processing elements - Google Patents
- Publication number
- CN119278445A (application number CN202380043098.6A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- processor
- processing
- processing element
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
- G06F17/153—Multidimensional correlation or convolution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263336586P | 2022-04-29 | 2022-04-29 | |
| US63/336,586 | 2022-04-29 | | |
| PCT/US2023/020213 WO2023212203A1 (en) | 2022-04-29 | 2023-04-27 | Matrix multiplication performed using convolution engine which includes array of processing elements |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119278445A (zh) | 2025-01-07 |
Family
ID=86469086
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202380043098.6A (pending; published as CN119278445A (zh)) | Matrix multiplication performed using convolution engine which includes array of processing elements | 2022-04-29 | 2023-04-27 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20250284767A1 |
| EP (1) | EP4515426A1 |
| JP (1) | JP2025514088A |
| KR (1) | KR20250002449A |
| CN (1) | CN119278445A |
| WO (1) | WO2023212203A1 |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11157287B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system with variable latency memory access |
| US11157441B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
| US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
| US11256977B2 (en) * | 2017-12-29 | 2022-02-22 | Facebook, Inc. | Lowering hardware for neural networks |
| EP3674982A1 (en) * | 2018-12-27 | 2020-07-01 | IMEC vzw | Hardware accelerator architecture for convolutional neural network |
2023
- 2023-04-27: EP application EP23725527.8A, publication EP4515426A1 (en), active, pending
- 2023-04-27: CN application CN202380043098.6A, publication CN119278445A (zh), active, pending
- 2023-04-27: JP application JP2024562065A, publication JP2025514088A (ja), active, pending
- 2023-04-27: US application US18/859,039, publication US20250284767A1 (en), active, pending
- 2023-04-27: KR application KR1020247037544A, publication KR20250002449A (ko), active, pending
- 2023-04-27: WO application PCT/US2023/020213, publication WO2023212203A1 (en), not active, ceased
Also Published As
| Publication number | Publication date |
|---|---|
| US20250284767A1 (en) | 2025-09-11 |
| WO2023212203A1 (en) | 2023-11-02 |
| KR20250002449A (ko) | 2025-01-07 |
| JP2025514088A (ja) | 2025-05-02 |
| EP4515426A1 (en) | 2025-03-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11698773B2 (en) | | Accelerated mathematical engine |
| US12174910B2 (en) | | Methods and systems for implementing a convolution transpose layer of a neural network |
| KR20200081044A (ko) | | Method and apparatus for processing convolution operation of neural network |
| TW202123093A (zh) | | System and method for performing convolution operations |
| CN111758107A (zh) | | Systems and methods for hardware-based pooling |
| CN110050267A (zh) | | Systems and methods for data management |
| EP3093757B1 (en) | | Multi-dimensional sliding window operation for a vector processor |
| EP3796190A1 (en) | | Memory device and method |
| EP4345691A1 (en) | | Methods and systems for performing channel equalisation on a convolution layer in a neural network |
| CN112836793B (zh) | | Floating-point separable convolution computation acceleration device, system, and image processing method |
| EP4402611A1 (en) | | Eliminating memory bottlenecks for depthwise convolutions |
| CN119278445A (zh) | | Matrix multiplication performed using convolution engine which includes array of processing elements |
| Devendran et al. | | Optimization of the Convolution Operation to Accelerate Deep Neural Networks in FPGA. |
| US12579413B2 (en) | | Method and apparatus for performing convolution neural network operations |
| US20250209132A1 (en) | | Efficient multiply-accumulate units for convolutional neural network processing including max pooling |
| US20250307206A1 (en) | | Efficient selection of single instruction multiple data operations for neural processing units |
| US20250231742A1 (en) | | Transposing information using shadow latches and active latches for efficient die area in processing system |
| CN119091446B (zh) | | Image feature extraction method, device and system |
| EP4485281A1 (en) | | Activation accelerator for neural network accelerator |
| EP4361889A1 (en) | | Implementing a scatter function on a neural network accelerator |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |