JP7408671B2 - Architecture for block sparse operations on a systolic array - Google Patents
Architecture for block sparse operations on a systolic array
- Publication number
- JP7408671B2 (application JP2021547450A)
- Authority
- JP
- Japan
- Prior art keywords
- memory
- graphics
- processor
- matrix
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8046—Systolic arrays
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/3016—Decoding the operand specifier, e.g. specifier format
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
- G06F9/38885—Divergence aspects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0495—Quantised networks; Sparse networks; Compressed networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Computer Hardware Design (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Neurology (AREA)
- Image Generation (AREA)
- Image Processing (AREA)
- Complex Calculations (AREA)
- Advance Control (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Executing Machine-Instructions (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (9)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962819337P | 2019-03-15 | 2019-03-15 | |
| US201962819361P | 2019-03-15 | 2019-03-15 | |
| US201962819435P | 2019-03-15 | 2019-03-15 | |
| US62/819,361 | 2019-03-15 | | |
| US62/819,435 | 2019-03-15 | | |
| US62/819,337 | 2019-03-15 | | |
| US201962935670P | 2019-11-15 | 2019-11-15 | |
| US62/935,670 | 2019-11-15 | | |
| PCT/US2020/022847 WO2020190809A1 (en) | 2019-03-15 | 2020-03-14 | Architecture for block sparse operations on a systolic array |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2022523761A (ja) | 2022-04-26 |
| JPWO2020190809A5 | 2022-07-13 |
| JP7408671B2 (ja) | 2024-01-05 |
Family
ID=70285850
Family Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021547450A Active JP7408671B2 (ja) | 2019-03-15 | 2020-03-14 | Architecture for block sparse operations on a systolic array |
| JP2021547288A Active JP7494197B2 (ja) | 2019-03-15 | 2020-03-14 | Systolic disaggregation within a matrix accelerator architecture |
| JP2021547452A Active JP7423644B2 (ja) | 2019-03-15 | 2020-03-14 | Sparse optimizations for a matrix accelerator architecture |
| JP2024006026A Active JP7717863B2 (ja) | 2019-03-15 | 2024-01-18 | Sparse optimizations for a matrix accelerator architecture |
Family Applications After (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021547288A Active JP7494197B2 (ja) | 2019-03-15 | 2020-03-14 | Systolic disaggregation within a matrix accelerator architecture |
| JP2021547452A Active JP7423644B2 (ja) | 2019-03-15 | 2020-03-14 | Sparse optimizations for a matrix accelerator architecture |
| JP2024006026A Active JP7717863B2 (ja) | 2019-03-15 | 2024-01-18 | Sparse optimizations for a matrix accelerator architecture |
Country Status (14)
| Country | Link |
|---|---|
| US (8) | US11113784B2 |
| EP (4) | EP3938888A1 |
| JP (4) | JP7408671B2 |
| KR (3) | KR20210135999A |
| CN (5) | CN113383310A |
| AU (1) | AU2020241262B2 |
| BR (2) | BR112021016106A2 |
| DE (2) | DE112020000846T5 |
| DK (1) | DK3938890T3 |
| ES (1) | ES3041900T3 |
| FI (1) | FI3938890T3 |
| PL (1) | PL3938890T3 |
| SG (1) | SG11202107290QA |
| WO (3) | WO2020190808A1 |
Families Citing this family (94)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
| US10474458B2 (en) | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
| KR102559581B1 (ko) * | 2018-05-23 | 2023-07-25 | 삼성전자주식회사 | Storage device including reconfigurable logic and method of operating the storage device |
| US10719323B2 (en) | 2018-09-27 | 2020-07-21 | Intel Corporation | Systems and methods for performing matrix compress and decompress instructions |
| US20200210517A1 (en) | 2018-12-27 | 2020-07-02 | Intel Corporation | Systems and methods to accelerate multiplication of sparse matrices |
| US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access |
| WO2020190796A1 (en) | 2019-03-15 | 2020-09-24 | Intel Corporation | Systems and methods for cache optimization |
| EP4270201A3 (en) | 2019-03-15 | 2024-01-31 | INTEL Corporation | Memory controller management techniques |
| EP3938888A1 (en) * | 2019-03-15 | 2022-01-19 | INTEL Corporation | Systolic disaggregation within a matrix accelerator architecture |
| US11392376B2 (en) * | 2019-04-11 | 2022-07-19 | Arm Limited | Processor for sparse matrix computation |
| US11222092B2 (en) | 2019-07-16 | 2022-01-11 | Facebook Technologies, Llc | Optimization for deconvolution |
| KR102213258B1 (ko) * | 2019-07-29 | 2021-02-08 | 한국전자기술연구원 | Processing-in-memory control method for efficient instruction processing and computing device applying the same |
| US11663746B2 (en) | 2019-11-15 | 2023-05-30 | Intel Corporation | Systolic arithmetic on sparse data |
| CN112899740B (zh) * | 2019-11-15 | 2022-04-19 | 源秩科技(上海)有限公司 | Electrochemistry-based processing apparatus and method |
| US11861761B2 (en) | 2019-11-15 | 2024-01-02 | Intel Corporation | Graphics processing unit processing and caching improvements |
| CN111176582A (zh) * | 2019-12-31 | 2020-05-19 | 北京百度网讯科技有限公司 | Matrix storage method, matrix access method, apparatus, and electronic device |
| US11586601B2 (en) * | 2020-02-05 | 2023-02-21 | Alibaba Group Holding Limited | Apparatus and method for representation of a sparse matrix in a neural network |
| US12361266B2 (en) * | 2020-05-14 | 2025-07-15 | Samsung Electronics Co., Ltd. | Hierarchical weight preprocessing for neural network accelerator |
| US11615320B1 (en) | 2020-06-30 | 2023-03-28 | Cadence Design Systems, Inc. | Method, product, and apparatus for variable precision weight management for neural networks |
| US11823018B1 (en) | 2020-06-30 | 2023-11-21 | Cadence Design Systems, Inc. | Method, product, and apparatus for a machine learning process using weight sharing within a systolic array having reduced memory bandwidth |
| US11676068B1 (en) | 2020-06-30 | 2023-06-13 | Cadence Design Systems, Inc. | Method, product, and apparatus for a machine learning process leveraging input sparsity on a pixel by pixel basis |
| US11651283B1 (en) * | 2020-06-30 | 2023-05-16 | Cadence Design Systems, Inc. | Method, product, and apparatus for a machine learning process using dynamic rearrangement of sparse data and corresponding weights |
| US11687831B1 (en) | 2020-06-30 | 2023-06-27 | Cadence Design Systems, Inc. | Method, product, and apparatus for a multidimensional processing array for hardware acceleration of convolutional neural network inference |
| US11848980B2 (en) * | 2020-07-09 | 2023-12-19 | Boray Data Technology Co. Ltd. | Distributed pipeline configuration in a distributed computing system |
| IT202000016909A1 (it) * | 2020-07-13 | 2022-01-13 | St Microelectronics Srl | Computer-implemented data processing method, corresponding microcontroller system and computer program product |
| US11620818B2 (en) * | 2020-10-01 | 2023-04-04 | Intel Corporation | Spatially sparse neural network accelerator for multi-dimension visual analytics |
| US12443835B2 (en) | 2020-10-05 | 2025-10-14 | Numenta, Inc. | Hardware architecture for processing data in sparse neural network |
| US12437199B2 (en) * | 2020-11-24 | 2025-10-07 | Arm Limited | Activation compression method for deep learning acceleration |
| KR20220161485A (ko) * | 2020-11-30 | 2022-12-06 | 구글 엘엘씨 | Systolic array cells with multiple accumulators |
| US11977885B2 (en) * | 2020-11-30 | 2024-05-07 | Intel Corporation | Utilizing structured sparsity in systolic arrays |
| US12028094B2 (en) | 2020-12-23 | 2024-07-02 | Intel Corporation | Application programming interface for fine grained low latency decompression within processor core |
| US12106104B2 (en) * | 2020-12-23 | 2024-10-01 | Intel Corporation | Processor instructions for data compression and decompression |
| US12182018B2 (en) | 2020-12-23 | 2024-12-31 | Intel Corporation | Instruction and micro-architecture support for decompression on core |
| US20220222319A1 (en) * | 2021-01-14 | 2022-07-14 | Microsoft Technology Licensing, Llc | Compressed matrix with sparsity metadata |
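| CN115244507A (zh) * | 2021-02-25 | 2022-10-25 | 阿里巴巴集团控股有限公司 | Zero-skipping sparsity technique for reducing data movement |
| CN113705069B (zh) * | 2021-02-26 | 2025-08-26 | 腾讯科技(深圳)有限公司 | Method and apparatus for computation optimization of shallow-depth models based on a systolic array |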
| US12308072B2 (en) * | 2021-03-10 | 2025-05-20 | Invention And Collaboration Laboratory Pte. Ltd. | Integrated scaling and stretching platform for optimizing monolithic integration and/or heterogeneous integration in a single semiconductor die |
| US20220300816A1 (en) * | 2021-03-19 | 2022-09-22 | Rebellions Inc. | Neural processing device and method for pruning thereof |
| DE112021007476T5 (de) * | 2021-04-09 | 2024-01-25 | Nvidia Corporation | Increasing sparsity in data sets |
| KR20240006684A (ko) * | 2021-05-13 | 2024-01-15 | 어드밴스드 마이크로 디바이시즈, 인코포레이티드 | Flexible and scalable graph processing accelerator |
| CN113516172B (zh) * | 2021-05-19 | 2023-05-12 | 电子科技大学 | Image classification method based on error injection in stochastic-computing Bayesian neural networks |
| US12189710B2 (en) * | 2021-05-25 | 2025-01-07 | Google Llc | Sparse matrix multiplication in hardware |
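| CN113076521B (zh) * | 2021-06-03 | 2021-09-21 | 沐曦集成电路(上海)有限公司 | Method and computing system based on a GPGPU reconfigurable architecture |
| CN113268270B (zh) * | 2021-06-07 | 2022-10-21 | 中科计算技术西部研究院 | Acceleration method, system, and apparatus for paired hidden Markov models |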
| US11669331B2 (en) | 2021-06-17 | 2023-06-06 | International Business Machines Corporation | Neural network processing assist instruction |
| US12174783B2 (en) * | 2021-06-24 | 2024-12-24 | Intel Corporation | Systolic array of arbitrary physical and logical depth |
| US12190158B2 (en) * | 2021-06-25 | 2025-01-07 | Intel Corporation | Using sparsity metadata to reduce systolic array power consumption |
| US12399685B2 (en) * | 2021-06-25 | 2025-08-26 | Intel Corporation | Systolic array having support for output sparsity |
| US12346694B2 (en) | 2021-06-25 | 2025-07-01 | Intel Corporation | Register file for systolic array |
| US12189571B2 (en) * | 2021-06-25 | 2025-01-07 | Intel Corporation | Dual pipeline parallel systolic array |
| US12423379B2 (en) * | 2021-07-06 | 2025-09-23 | Google Llc | In situ sparse matrix expansion |
| US11941111B2 (en) | 2021-07-31 | 2024-03-26 | International Business Machines Corporation | Exploiting fine-grained structured weight sparsity in systolic arrays |
| US11429864B1 (en) * | 2021-08-16 | 2022-08-30 | Moffett International Co., Limited | System and method for bank-balanced sparse activation and joint-activation-weight-sparse training of neural networks |
| US12242851B2 (en) | 2021-09-09 | 2025-03-04 | Intel Corporation | Verifying compressed stream fused with copy or transform operations |
| US20230079975A1 (en) * | 2021-09-10 | 2023-03-16 | Arm Limited | Power management for system-on-chip |
| US20230102279A1 (en) * | 2021-09-25 | 2023-03-30 | Intel Corporation | Apparatuses, methods, and systems for instructions for structured-sparse tile matrix fma |
| US20230131961A1 (en) * | 2021-10-22 | 2023-04-27 | Nvidia Corporation | Application programming interface to configure processor partitioning |
| US11657260B2 (en) * | 2021-10-26 | 2023-05-23 | Edgecortix Pte. Ltd. | Neural network hardware accelerator data parallelism |
| CN114218152B (zh) * | 2021-12-06 | 2023-08-15 | 海飞科(南京)信息技术有限公司 | Stream processing method, processing circuit, and electronic device |
| US12417182B2 (en) | 2021-12-14 | 2025-09-16 | Intel Corporation | De-prioritizing speculative code lines in on-chip caches |
| US12360768B2 (en) | 2021-12-16 | 2025-07-15 | Intel Corporation | Throttling code fetch for speculative code paths |
| US20230205704A1 (en) * | 2021-12-23 | 2023-06-29 | Intel Corporation | Distributed compression/decompression system |
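| KR102859218B1 (ko) * | 2021-12-30 | 2025-09-15 | 서경대학교 산학협력단 | GPU having a SIMT structure effective for deep learning |
| TWI824392B (zh) | 2022-01-21 | 2023-12-01 | 財團法人國家實驗研究院 | On-demand composable shared data caching method, computer program, and computer-readable medium for distributed deep learning computing |
| CN114780236B (zh) * | 2022-04-11 | 2025-06-27 | 上海壁仞科技股份有限公司 | Data processing integrated circuit |
| CN115034198B (zh) * | 2022-05-16 | 2023-05-12 | 北京百度网讯科技有限公司 | Method for optimizing embedding module computation in a language model |
| JP2025516768A (ja) * | 2022-05-18 | 2025-05-30 | グーグル エルエルシー | Exploiting data sparsity in a machine learning hardware accelerator |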
| US20230409326A1 (en) * | 2022-06-15 | 2023-12-21 | Intel Corporation | Device, method and system for executing a tile load and expand instruction |
| CN115563443B (zh) * | 2022-09-23 | 2025-10-03 | 上海壁仞科技股份有限公司 | Convolution operation method and apparatus, convolution processing method, device, and storage medium |
| US12224774B2 (en) | 2022-11-16 | 2025-02-11 | Samsung Electronics Co., Ltd. | Runtime reconfigurable compression format conversion |
| US20240169469A1 (en) * | 2022-11-16 | 2024-05-23 | Nvidia Corporation | Application programming interface to transform information corresponding to a memory transaction |
| KR102864772B1 (ko) * | 2022-11-21 | 2025-09-24 | 연세대학교 산학협력단 | Heterogeneous multi-core processor and system using near-data processing |
| KR20240102798A (ko) | 2022-12-26 | 2024-07-03 | 리벨리온 주식회사 | Neural processor and instruction fetch method thereof |
| KR102548582B1 (ko) * | 2022-12-26 | 2023-06-29 | 리벨리온 주식회사 | Neural processor and instruction fetch method thereof |
| TWI831588B (zh) * | 2023-01-30 | 2024-02-01 | 創鑫智慧股份有限公司 | Neural network computing device and numerical conversion method in neural network computation |
| TWI830669B (zh) * | 2023-02-22 | 2024-01-21 | 旺宏電子股份有限公司 | Encoding method and encoding circuit |
| JPWO2024203190A1 * | 2023-03-30 | 2024-10-03 | | |
| US12353413B2 (en) | 2023-08-04 | 2025-07-08 | Optum, Inc. | Quality evaluation and augmentation of data provided by a federated query system |
| US12505246B2 (en) | 2023-08-04 | 2025-12-23 | Optum, Inc. | Attribute-level access control for federated queries |
| US12204538B1 (en) | 2023-09-06 | 2025-01-21 | Optum, Inc. | Dynamically tailored time intervals for federated query system |
| US12393593B2 (en) | 2023-09-12 | 2025-08-19 | Optum, Inc. | Priority-driven federated query-based data caching |
| CN119670639B (zh) * | 2023-09-20 | 2025-11-21 | 中国科学院深圳先进技术研究院 | Accelerator supporting an attention mechanism |
| CN117093816B (zh) * | 2023-10-19 | 2024-01-19 | 上海登临科技有限公司 | Matrix multiplication method, apparatus, and electronic device |
| US12045309B1 (en) * | 2023-11-29 | 2024-07-23 | Recogni Inc. | Systems and methods for performing matrix multiplication with a plurality of processing elements |
| US12504897B2 (en) | 2024-02-15 | 2025-12-23 | International Business Machines Corporation | Using an ensemble of data transformers to encode data before zero-value compression |
| US20250292358A1 (en) * | 2024-03-15 | 2025-09-18 | Intel Corporation | Hardware compression for sparse matrix content |
| US20250292354A1 (en) * | 2024-03-16 | 2025-09-18 | Intel Corporation | Systolic array matrix accelerator for graphics processing unit applications |
| CN118093143B (zh) * | 2024-04-12 | 2024-07-02 | 清华大学 | Data scheduling method and apparatus for the decoding stage of a large language model |
| KR20250160658A (ko) * | 2024-05-07 | 2025-11-14 | 리벨리온 주식회사 | Data computation apparatus and method |
| US20250370750A1 (en) * | 2024-05-31 | 2025-12-04 | Xilinx, Inc. | Unaligned load and store in a core |
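| CN119761434A (zh) * | 2024-12-10 | 2025-04-04 | 电子科技大学 | Neural network matrix multiplication accelerator and deployment method thereof |
| CN119537037B (zh) * | 2025-01-22 | 2025-05-02 | 山东浪潮科学研究院有限公司 | GPGPU-based multi-channel matrix data processing method, device, and medium |
| CN119537779B (zh) * | 2025-01-23 | 2025-04-18 | 山东浪潮科学研究院有限公司 | Sparse computing unit, method, device, and medium under a GPGPU architecture |
| CN120671862B (zh) * | 2025-08-21 | 2025-11-28 | 浙江大学 | Dedicated accelerator for a hierarchical greedy decoding algorithm |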
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018125250A1 (en) | 2016-12-31 | 2018-07-05 | Intel Corporation | Systems, methods, and apparatuses for heterogeneous computing |
| WO2018213636A1 (en) | 2017-05-17 | 2018-11-22 | Google Llc | Performing matrix multiplication in hardware |
Family Cites Families (445)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US3872442A (en) | 1972-12-14 | 1975-03-18 | Sperry Rand Corp | System for conversion between coded byte and floating point format |
| US4476523A (en) | 1981-06-11 | 1984-10-09 | Data General Corporation | Fixed point and floating point computation units using commonly shared control fields |
| US4823252A (en) | 1986-03-28 | 1989-04-18 | Tandem Computers Incorporated | Overlapped control store |
| US5182801A (en) | 1989-06-09 | 1993-01-26 | Digital Equipment Corporation | Apparatus and method for providing fast data transfer between multiple devices through dynamic reconfiguration of the memory space of the devices |
| JP2581236B2 (ja) | 1989-11-16 | 1997-02-12 | 三菱電機株式会社 | Data processing device |
| DE4036455C1 * | 1990-11-15 | 1992-04-02 | Siemens Ag, 8000 Muenchen, De | |
| JP2682232B2 (ja) | 1990-11-21 | 1997-11-26 | 松下電器産業株式会社 | Floating-point arithmetic processing device |
| US5381539A (en) | 1992-06-04 | 1995-01-10 | Emc Corporation | System and method for dynamically controlling cache management |
| GB9307359D0 (en) | 1993-04-08 | 1993-06-02 | Int Computers Ltd | Cache replacement mechanism |
| US5450607A (en) | 1993-05-17 | 1995-09-12 | Mips Technologies Inc. | Unified floating point and integer datapath for a RISC processor |
| US5574928A (en) | 1993-10-29 | 1996-11-12 | Advanced Micro Devices, Inc. | Mixed integer/floating point processor core for a superscalar microprocessor with a plurality of operand buses for transferring operand segments |
| US5623636A (en) | 1993-11-09 | 1997-04-22 | Motorola Inc. | Data processing system and method for providing memory access protection using transparent translation registers and default attribute bits |
| US5627985A (en) | 1994-01-04 | 1997-05-06 | Intel Corporation | Speculative and committed resource files in an out-of-order processor |
| CN1107597A (zh) | 1994-02-24 | 1995-08-30 | 吴乾弥 | Pipelined, systolic, and single-instruction multiple-data array processing architecture and method |
| US5673407A (en) | 1994-03-08 | 1997-09-30 | Texas Instruments Incorporated | Data processor having capability to perform both floating point operations and memory access in response to a single instruction |
| GB2306271B (en) | 1994-06-22 | 1997-07-16 | Microsoft Corp | Data analyser |
| US5805475A (en) | 1995-02-10 | 1998-09-08 | International Business Machines Corporation | Load-store unit and method of loading and storing single-precision floating-point registers in a double-precision architecture |
| US5777629A (en) | 1995-03-24 | 1998-07-07 | 3Dlabs Inc. Ltd. | Graphics subsystem with smart direct-memory-access operation |
| US5651137A (en) | 1995-04-12 | 1997-07-22 | Intel Corporation | Scalable cache attributes for an input/output bus |
| US5940311A (en) | 1996-04-30 | 1999-08-17 | Texas Instruments Incorporated | Immediate floating-point operand reformatting in a microprocessor |
| US5917741A (en) | 1996-08-29 | 1999-06-29 | Intel Corporation | Method and apparatus for performing floating-point rounding operations for multiple precisions using incrementers |
| JP3790307B2 (ja) | 1996-10-16 | 2006-06-28 | 株式会社ルネサステクノロジ | Data processor and data processing system |
| US6078940A (en) | 1997-01-24 | 2000-06-20 | Texas Instruments Incorporated | Microprocessor with an instruction for multiply and left shift with saturate |
| US5943687A | 1997-03-14 | 1999-08-24 | Telefonaktiebolaget Lm Ericsson | Penalty-based cache storage and replacement techniques |
| US5926406A (en) | 1997-04-30 | 1999-07-20 | Hewlett-Packard, Co. | System and method for calculating floating point exponential values in a geometry accelerator |
| US6092149A (en) | 1997-05-28 | 2000-07-18 | Western Digital Corporation | Disk drive cache system using a dynamic priority sequential stream of data segments continuously adapted according to prefetched sequential, random, and repeating types of accesses |
| AUPO793897A0 (en) * | 1997-07-15 | 1997-08-07 | Silverbrook Research Pty Ltd | Image processing method and apparatus (ART25) |
| SG120064A1 (en) * | 1997-07-15 | 2006-03-28 | Silverbrook Res Pty Ltd | Thermal actuator |
| US7102646B1 (en) | 1997-11-25 | 2006-09-05 | Nvidia U.S. Investment Company | Demand-based memory system for graphics applications |
| US6856320B1 (en) | 1997-11-25 | 2005-02-15 | Nvidia U.S. Investment Company | Demand-based memory system for graphics applications |
| US6253311B1 (en) | 1997-11-29 | 2001-06-26 | Jp First Llc | Instruction set for bi-directional conversion and transfer of integer and floating point data |
| US6049865A (en) | 1997-12-18 | 2000-04-11 | Motorola, Inc. | Method and apparatus for implementing floating point projection instructions |
| US6260008B1 (en) | 1998-01-08 | 2001-07-10 | Sharp Kabushiki Kaisha | Method of and system for disambiguating syntactic word multiples |
| US6513099B1 (en) | 1998-12-22 | 2003-01-28 | Silicon Graphics Incorporated | Enhanced graphics cache memory |
| US6480872B1 (en) | 1999-01-21 | 2002-11-12 | Sandcraft, Inc. | Floating-point and integer multiply-add and multiply-accumulate |
| US6529928B1 (en) | 1999-03-23 | 2003-03-04 | Silicon Graphics, Inc. | Floating-point adder performing floating-point and integer operations |
| US6788738B1 (en) | 1999-05-07 | 2004-09-07 | Xilinx, Inc. | Filter accelerator for a digital signal processor |
| US6631437B1 (en) | 2000-04-06 | 2003-10-07 | Hewlett-Packard Development Company, L.P. | Method and apparatus for promoting memory read commands |
| US6578102B1 (en) | 2000-04-18 | 2003-06-10 | International Business Machines Corporation | Tracking and control of prefetch data in a PCI bus system |
| US6412046B1 (en) | 2000-05-01 | 2002-06-25 | Hewlett Packard Company | Verification of cache prefetch mechanism |
| US7499053B2 (en) | 2000-06-19 | 2009-03-03 | Mental Images Gmbh | Real-time precision ray tracing |
| US8188997B2 (en) | 2000-06-19 | 2012-05-29 | Mental Images Gmbh | Accelerated ray tracing using shallow bounding volume hierarchies |
| US6678806B1 (en) | 2000-08-23 | 2004-01-13 | Chipwrights Design, Inc. | Apparatus and method for using tagged pointers for extract, insert and format operations |
| US20020152361A1 (en) | 2001-02-05 | 2002-10-17 | International Business Machines Corporation | Directed least recently used cache replacement method |
| US6792509B2 (en) | 2001-04-19 | 2004-09-14 | International Business Machines Corporation | Partitioned cache of multiple logical levels with adaptive reconfiguration based on multiple criteria |
| US6748495B2 (en) | 2001-05-15 | 2004-06-08 | Broadcom Corporation | Random generator |
| US6947049B2 (en) | 2001-06-01 | 2005-09-20 | Nvidia Corporation | Method and system for synchronizing updates of vertex data with a graphics processor that is fetching vertex data |
| US6963954B1 (en) | 2001-09-19 | 2005-11-08 | Cisco Technology, Inc. | Method and apparatus for optimizing prefetching based on memory addresses |
| US7127482B2 (en) | 2001-11-19 | 2006-10-24 | Intel Corporation | Performance optimized approach for efficient downsampling operations |
| US6598120B1 (en) | 2002-03-08 | 2003-07-22 | International Business Machines Corporation | Assignment of building block collector agent to receive acknowledgments from other building block agents |
| US20030204840A1 (en) | 2002-04-30 | 2003-10-30 | Youfeng Wu | Apparatus and method for one-pass profiling to concurrently generate a frequency profile and a stride profile to enable data prefetching in irregular programs |
| US7197605B2 (en) | 2002-12-30 | 2007-03-27 | Intel Corporation | Allocating cache lines |
| US7483031B2 (en) | 2003-04-17 | 2009-01-27 | Nvidia Corporation | Method for synchronizing graphics processing units |
| US7373369B2 (en) | 2003-06-05 | 2008-05-13 | International Business Machines Corporation | Advanced execution of extended floating-point add operations in a narrow dataflow |
| US7272624B2 (en) | 2003-09-30 | 2007-09-18 | International Business Machines Corporation | Fused booth encoder multiplexer |
| JP3807400B2 (ja) | 2003-10-30 | 2006-08-09 | ソニー株式会社 | Recording control apparatus and recording control method |
| US7567252B2 (en) | 2003-12-09 | 2009-07-28 | Microsoft Corporation | Optimizing performance of a graphics processing unit for efficient execution of general matrix operations |
| GB2409068A (en) | 2003-12-09 | 2005-06-15 | Advanced Risc Mach Ltd | Data element size control within parallel lanes of processing |
| KR100800468B1 (ko) | 2004-01-29 | 2008-02-01 | 삼성전자주식회사 | Hardware encryption/decryption apparatus for low-power, high-speed operation and method thereof |
| US8253750B1 (en) | 2004-02-14 | 2012-08-28 | Nvidia Corporation | Digital media processor |
| US7719540B2 (en) | 2004-03-31 | 2010-05-18 | Intel Corporation | Render-cache controller for multithreading, multi-core graphics processor |
| US7873812B1 (en) | 2004-04-05 | 2011-01-18 | Tibet MIMAR | Method and system for efficient matrix multiplication in a SIMD processor architecture |
| US7548892B2 (en) * | 2004-04-30 | 2009-06-16 | Microsoft Corporation | Processing machine learning techniques using a graphics processing unit |
| US7428566B2 (en) | 2004-11-10 | 2008-09-23 | Nvidia Corporation | Multipurpose functional unit with multiply-add and format conversion pipeline |
| US20060101244A1 (en) | 2004-11-10 | 2006-05-11 | Nvidia Corporation | Multipurpose functional unit with combined integer and floating-point multiply-add pipeline |
| US20060143396A1 (en) | 2004-12-29 | 2006-06-29 | Mason Cabot | Method for programmer-controlled cache line eviction policy |
| US20060179092A1 (en) | 2005-02-10 | 2006-08-10 | Schmookler Martin S | System and method for executing fixed point divide operations using a floating point multiply-add pipeline |
| US20060248279A1 (en) | 2005-05-02 | 2006-11-02 | Al-Sukhni Hassan F | Prefetching across a page boundary |
| US7346741B1 (en) | 2005-05-10 | 2008-03-18 | Sun Microsystems, Inc. | Memory latency of processors with configurable stride based pre-fetching technique |
| WO2006120664A2 (en) | 2005-05-13 | 2006-11-16 | Provost Fellows And Scholars Of The College Of The Holy And Undivided Trinity Of Queen Elizabeth Near Dublin | A data processing system and method |
| US8250348B2 (en) | 2005-05-19 | 2012-08-21 | International Business Machines Corporation | Methods and apparatus for dynamically switching processor mode |
| US7861055B2 (en) | 2005-06-07 | 2010-12-28 | Broadcom Corporation | Method and system for on-chip configurable data ram for fast memory and pseudo associative caches |
| US20060282620A1 (en) | 2005-06-14 | 2006-12-14 | Sujatha Kashyap | Weighted LRU for associative caches |
| US20070030277A1 (en) | 2005-08-08 | 2007-02-08 | Via Technologies, Inc. | Method for processing vertex, triangle, and pixel graphics data packets |
| US7659899B2 (en) | 2005-08-08 | 2010-02-09 | Via Technologies, Inc. | System and method to manage data processing stages of a logical graphics pipeline |
| US20070198815A1 (en) | 2005-08-11 | 2007-08-23 | Coresonic Ab | Programmable digital signal processor having a clustered SIMD microarchitecture including a complex short multiplier and an independent vector load unit |
| US20070074008A1 (en) | 2005-09-28 | 2007-03-29 | Donofrio David D | Mixed mode floating-point pipeline with extended functions |
| US7490224B2 (en) | 2005-10-07 | 2009-02-10 | International Business Machines Corporation | Time-of-life counter design for handling instruction flushes from a queue |
| TWI366151B (en) | 2005-10-14 | 2012-06-11 | Via Tech Inc | Multiple graphics processor system and methods |
| US8327115B2 (en) | 2006-04-12 | 2012-12-04 | Soft Machines, Inc. | Plural matrices of execution units for processing matrices of row dependent instructions in single clock cycle in super or separate mode |
| US8510827B1 (en) | 2006-05-18 | 2013-08-13 | Vmware, Inc. | Taint tracking mechanism for computer security |
| US8884972B2 (en) | 2006-05-25 | 2014-11-11 | Qualcomm Incorporated | Graphics processor with arithmetic and elementary function units |
| US7616206B1 (en) | 2006-06-16 | 2009-11-10 | Nvidia Corporation | Efficient multi-chip GPU |
| US8146066B2 (en) | 2006-06-20 | 2012-03-27 | Google Inc. | Systems and methods for caching compute kernels for an application running on a parallel-processing computer system |
| US7467280B2 (en) | 2006-07-05 | 2008-12-16 | International Business Machines Corporation | Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache |
| US8035650B2 (en) | 2006-07-25 | 2011-10-11 | Qualcomm Incorporated | Tiled cache for multiple software programs |
| US20080030510A1 (en) | 2006-08-02 | 2008-02-07 | Xgi Technology Inc. | Multi-GPU rendering system |
| US8606998B2 (en) | 2006-08-24 | 2013-12-10 | Advanced Micro Devices, Inc. | System and method for instruction-based cache allocation policies |
| US7620793B1 (en) | 2006-08-28 | 2009-11-17 | Nvidia Corporation | Mapping memory partitions to virtual memory pages |
| US7327289B1 (en) | 2006-09-20 | 2008-02-05 | Intel Corporation | Data-modifying run length encoder to avoid data expansion |
| US20080071851A1 (en) * | 2006-09-20 | 2008-03-20 | Ronen Zohar | Instruction and logic for performing a dot-product operation |
| US8122078B2 (en) | 2006-10-06 | 2012-02-21 | Calos Fund, LLC | Processor with enhanced combined-arithmetic capability |
| US20080086598A1 (en) | 2006-10-10 | 2008-04-10 | Maron William A | System and method for establishing cache priority for critical data structures of an application |
| JP4942095B2 (ja) * | 2007-01-25 | 2012-05-30 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Technique for performing computation using a multi-core processor |
| US20080189487A1 (en) | 2007-02-06 | 2008-08-07 | Arm Limited | Control of cache transactions |
| GB2447428A (en) * | 2007-03-15 | 2008-09-17 | Linear Algebra Technologies Lt | Processor having a trivial operand register |
| US7979674B2 (en) | 2007-05-16 | 2011-07-12 | International Business Machines Corporation | Re-executing launcher program upon termination of launched programs in MIMD mode booted SIMD partitions |
| US8781110B2 (en) | 2007-06-30 | 2014-07-15 | Intel Corporation | Unified system architecture for elliptic-curve cryptography |
| US7783859B2 (en) | 2007-07-12 | 2010-08-24 | Qnx Software Systems Gmbh & Co. Kg | Processing system implementing variable page size memory organization |
| US8990505B1 (en) | 2007-09-21 | 2015-03-24 | Marvell International Ltd. | Cache memory bank selection |
| DE112008003643A5 (de) | 2007-11-17 | 2010-10-28 | Krass, Maren | Reconfigurable floating-point and bit-level data processing unit |
| US8106914B2 (en) | 2007-12-07 | 2012-01-31 | Nvidia Corporation | Fused multiply-add functional unit |
| US7941633B2 (en) | 2007-12-18 | 2011-05-10 | International Business Machines Corporation | Hash optimization system and method |
| US7870339B2 (en) | 2008-01-11 | 2011-01-11 | International Business Machines Corporation | Extract cache attribute facility and instruction therefore |
| US20090190432A1 (en) | 2008-01-28 | 2009-07-30 | Christoph Bilger | DRAM with Page Access |
| GB2457303A (en) * | 2008-02-11 | 2009-08-12 | Linear Algebra Technologies | Randomly accessing elements of compressed matrix data by calculating offsets from non-zero values of a bitmap |
| US8429351B1 (en) | 2008-03-28 | 2013-04-23 | Emc Corporation | Techniques for determining an amount of data to prefetch |
| US8146064B2 (en) | 2008-04-04 | 2012-03-27 | International Business Machines Corporation | Dynamically controlling a prefetching range of a software controlled cache |
| US8078833B2 (en) | 2008-05-29 | 2011-12-13 | Axis Semiconductor, Inc. | Microprocessor with highly configurable pipeline and executional unit internal hierarchal structures, optimizable for different types of computational functions |
| US7945768B2 (en) | 2008-06-05 | 2011-05-17 | Motorola Mobility, Inc. | Method and apparatus for nested instruction looping using implicit predicates |
| US8340280B2 (en) | 2008-06-13 | 2012-12-25 | Intel Corporation | Using a single instruction multiple data (SIMD) instruction to speed up galois counter mode (GCM) computations |
| US8108361B2 (en) | 2008-07-31 | 2012-01-31 | Microsoft Corporation | Efficient column based data encoding for large-scale data storage |
| US8041856B2 (en) * | 2008-09-30 | 2011-10-18 | Lsi Corporation | Skip based control logic for first in first out buffer |
| US8219757B2 (en) | 2008-09-30 | 2012-07-10 | Intel Corporation | Apparatus and method for low touch cache management |
| US20100162247A1 (en) | 2008-12-19 | 2010-06-24 | Adam Welc | Methods and systems for transactional nested parallelism |
| US8645634B1 (en) | 2009-01-16 | 2014-02-04 | Nvidia Corporation | Zero-copy data sharing by cooperating asymmetric coprocessors |
| US20100185816A1 (en) | 2009-01-21 | 2010-07-22 | Sauber William F | Multiple Cache Line Size |
| US8266409B2 (en) | 2009-03-03 | 2012-09-11 | Qualcomm Incorporated | Configurable cache and method to configure same |
| US8108612B2 (en) | 2009-05-15 | 2012-01-31 | Microsoft Corporation | Location updates for a distributed data store |
| US8566801B2 (en) | 2009-05-22 | 2013-10-22 | International Business Machines Corporation | Concurrent static single assignment for general barrier synchronized parallel programs |
| US8819359B2 (en) | 2009-06-29 | 2014-08-26 | Oracle America, Inc. | Hybrid interleaving in memory modules by interleaving physical addresses for a page across ranks in a memory module |
| US8352945B2 (en) | 2009-08-11 | 2013-01-08 | International Business Machines Corporation | System, method, and apparatus for scan-sharing for business intelligence queries in an in-memory database |
| US8615637B2 (en) | 2009-09-10 | 2013-12-24 | Advanced Micro Devices, Inc. | Systems and methods for processing memory requests in a multi-processor system using a probe engine |
| US8364739B2 (en) | 2009-09-30 | 2013-01-29 | International Business Machines Corporation | Sparse matrix-vector multiplication on graphics processor units |
| US8103910B2 (en) | 2009-11-13 | 2012-01-24 | International Business Machines Corporation | Local rollback for fault-tolerance in parallel computing systems |
| US8984043B2 (en) | 2009-12-23 | 2015-03-17 | Intel Corporation | Multiplying and adding matrices |
| US8669990B2 (en) | 2009-12-31 | 2014-03-11 | Intel Corporation | Sharing resources between a CPU and GPU |
| GB2476800A (en) | 2010-01-07 | 2011-07-13 | Linear Algebra Technologies Ltd | Sparse matrix vector multiplier using a bit map of non-zero elements to control scheduling of arithmetic operations |
| US8572322B2 (en) | 2010-03-29 | 2013-10-29 | Freescale Semiconductor, Inc. | Asynchronously scheduling memory access requests |
| US8677613B2 (en) | 2010-05-20 | 2014-03-25 | International Business Machines Corporation | Enhanced modularity in heterogeneous 3D stacks |
| US8812575B2 (en) | 2010-07-06 | 2014-08-19 | Silminds, Llc, Egypt | Decimal floating-point square-root unit using Newton-Raphson iterations |
| US20120059983A1 (en) | 2010-09-03 | 2012-03-08 | David Wilkins Nellans | Predictor-based management of DRAM row-buffers |
| US8682639B2 (en) | 2010-09-21 | 2014-03-25 | Texas Instruments Incorporated | Dedicated memory window for emulation address |
| US8982140B2 (en) | 2010-09-24 | 2015-03-17 | Nvidia Corporation | Hierarchical memory addressing |
| US9965395B2 (en) | 2010-09-28 | 2018-05-08 | Texas Instruments Incorporated | Memory attribute sharing between differing cache levels of multilevel cache |
| US8488055B2 (en) | 2010-09-30 | 2013-07-16 | Apple Inc. | Flash synchronization using image sensor interface timing signal |
| US8745111B2 (en) | 2010-11-16 | 2014-06-03 | Apple Inc. | Methods and apparatuses for converting floating point representations |
| CN102033985A (zh) * | 2010-11-24 | 2011-04-27 | 南京理工大学 | Efficient time-domain electromagnetic simulation method based on the *-matrix algorithm |
| US8847965B2 (en) | 2010-12-03 | 2014-09-30 | The University Of North Carolina At Chapel Hill | Methods, systems, and computer readable media for fast geometric sound propagation using visibility computations |
| CN102141976B (zh) * | 2011-01-10 | 2013-08-14 | 中国科学院软件研究所 | Diagonal data storage method for sparse matrices and SpMV implementation method based on that storage method |
| GB2488985A (en) | 2011-03-08 | 2012-09-19 | Advanced Risc Mach Ltd | Mixed size data processing operation with integrated operand conversion instructions |
| US8862653B2 (en) * | 2011-04-26 | 2014-10-14 | University Of South Carolina | System and method for sparse matrix vector multiplication processing |
| FR2974645A1 (fr) | 2011-04-28 | 2012-11-02 | Kalray | Mixed-precision fused multiply-add operator |
| US9501392B1 (en) | 2011-05-12 | 2016-11-22 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Management of a non-volatile memory module |
| JP5813380B2 (ja) | 2011-06-03 | 2015-11-17 | 株式会社東芝 | Semiconductor memory device |
| US9032156B2 (en) | 2011-07-06 | 2015-05-12 | Advanced Micro Devices, Inc. | Memory access monitor |
| CN102214160B (zh) | 2011-07-08 | 2013-04-17 | 中国科学技术大学 | Single-precision matrix multiplication optimization method based on Loongson 3A |
| US9529712B2 (en) | 2011-07-26 | 2016-12-27 | Nvidia Corporation | Techniques for balancing accesses to memory having different memory types |
| US9727336B2 (en) | 2011-09-16 | 2017-08-08 | International Business Machines Corporation | Fine-grained instruction enablement at sub-function granularity based on an indicated subrange of registers |
| US20130099946A1 (en) | 2011-10-21 | 2013-04-25 | International Business Machines Corporation | Data Compression Utilizing Variable and Limited Length Codes |
| US8935478B2 (en) | 2011-11-01 | 2015-01-13 | International Business Machines Corporation | Variable cache line size management |
| US20130141442A1 (en) | 2011-12-06 | 2013-06-06 | John W. Brothers | Method and apparatus for multi-chip processing |
| US20130148947A1 (en) | 2011-12-13 | 2013-06-13 | Ati Technologies Ulc | Video player with multiple graphics processors |
| US9021237B2 (en) | 2011-12-20 | 2015-04-28 | International Business Machines Corporation | Low latency variable transfer network communicating variable written to source processing core variable register allocated to destination thread to destination processing core variable register allocated to source thread |
| WO2013095504A1 (en) | 2011-12-22 | 2013-06-27 | Intel Corporation | Matrix multiply accumulate instruction |
| WO2013095537A1 (en) | 2011-12-22 | 2013-06-27 | Intel Corporation | Controlling a processor cache using a real-time attribute |
| CN106775592B (zh) | 2011-12-23 | 2019-03-12 | 英特尔公司 | Processor, method for a computing system, machine-readable medium, and computer system |
| EP2798457B1 (en) * | 2011-12-29 | 2019-03-06 | Intel Corporation | Dot product processors, methods, systems, and instructions |
| US20130185515A1 (en) | 2012-01-16 | 2013-07-18 | Qualcomm Incorporated | Utilizing Negative Feedback from Unexpected Miss Addresses in a Hardware Prefetcher |
| US10073656B2 (en) | 2012-01-27 | 2018-09-11 | Sandisk Technologies Llc | Systems and methods for storage virtualization |
| CN104106053B (zh) | 2012-02-08 | 2018-12-11 | 英特尔公司 | Dynamic CPU GPU load balancing using power |
| US20130218938A1 (en) | 2012-02-17 | 2013-08-22 | Qualcomm Incorporated | Floating-point adder with operand shifting based on a predicted exponent difference |
| US20130219088A1 (en) | 2012-02-22 | 2013-08-22 | Lsi Corporation | Configurable prioritization of data transmission in a data storage topology |
| US9036710B2 (en) * | 2012-03-08 | 2015-05-19 | Blackberry Limited | Unified transform coefficient encoding and decoding |
| US9183664B2 (en) | 2012-05-03 | 2015-11-10 | Apple Inc. | Tiled forward shading with improved depth filtering |
| US8775762B2 (en) | 2012-05-07 | 2014-07-08 | Advanced Micro Devices, Inc. | Method and apparatus for batching memory requests |
| JP5826114B2 (ja) | 2012-05-25 | 2015-12-02 | クラリオン株式会社 | Data decompression device, data compression device, data decompression program, data compression program, and compressed data distribution system |
| CN108345547A (zh) | 2012-06-15 | 2018-07-31 | 英特尔公司 | Lock-based and synchronization-based method for out-of-order loads |
| US8892619B2 (en) | 2012-07-24 | 2014-11-18 | The Board Of Trustees Of The Leland Stanford Junior University | Floating-point multiply-add unit using cascade design |
| US9128845B2 (en) | 2012-07-30 | 2015-09-08 | Hewlett-Packard Development Company, L.P. | Dynamically partition a volatile memory for a cache and a memory partition |
| CN103581052B (zh) | 2012-08-02 | 2017-07-21 | 华为技术有限公司 | Data processing method, router, and NDN system |
| JP6007667B2 (ja) | 2012-08-17 | 2016-10-12 | 富士通株式会社 | Information processing apparatus, information processing method, and information processing program |
| US9298456B2 (en) | 2012-08-21 | 2016-03-29 | Apple Inc. | Mechanism for performing speculative predicated instructions |
| US10346095B2 (en) | 2012-08-31 | 2019-07-09 | Sandisk Technologies, Llc | Systems, methods, and interfaces for adaptive cache persistence |
| US20140075163A1 (en) | 2012-09-07 | 2014-03-13 | Paul N. Loewenstein | Load-monitor mwait |
| US9134954B2 (en) | 2012-09-10 | 2015-09-15 | Qualcomm Incorporated | GPU memory buffer pre-fetch and pre-back signaling to avoid page-fault |
| US9146846B2 (en) | 2012-09-14 | 2015-09-29 | Advanced Micro Devices, Inc. | Programmable physical address mapping for memory |
| US10742475B2 (en) | 2012-12-05 | 2020-08-11 | Origin Wireless, Inc. | Method, apparatus, and system for object tracking sensing using broadcasting |
| US9582287B2 (en) | 2012-09-27 | 2017-02-28 | Intel Corporation | Processor having multiple cores, shared core extension logic, and shared core extension utilization instructions |
| US9626294B2 (en) | 2012-10-03 | 2017-04-18 | International Business Machines Corporation | Performance-driven cache line memory access |
| US9317482B2 (en) * | 2012-10-14 | 2016-04-19 | Microsoft Technology Licensing, Llc | Universal FPGA/ASIC matrix-vector multiplication architecture |
| US9152382B2 (en) | 2012-10-31 | 2015-10-06 | Intel Corporation | Reducing power consumption in a fused multiply-add (FMA) unit responsive to input data values |
| US11150721B2 (en) | 2012-11-07 | 2021-10-19 | Nvidia Corporation | Providing hints to an execution unit to prepare for predicted subsequent arithmetic operations |
| US9170955B2 (en) | 2012-11-27 | 2015-10-27 | Intel Corporation | Providing extended cache replacement state information |
| US9183144B2 (en) | 2012-12-14 | 2015-11-10 | Intel Corporation | Power gating a portion of a cache memory |
| US20140173203A1 (en) | 2012-12-18 | 2014-06-19 | Andrew T. Forsyth | Block Memory Engine |
| US9558006B2 (en) | 2012-12-20 | 2017-01-31 | Intel Corporation | Continuous automatic tuning of code regions |
| US9317251B2 (en) | 2012-12-31 | 2016-04-19 | Nvidia Corporation | Efficient correction of normalizer shift amount errors in fused multiply add operations |
| US9298457B2 (en) | 2013-01-22 | 2016-03-29 | Altera Corporation | SIMD instructions for data compression and decompression |
| US9971710B2 (en) | 2013-02-07 | 2018-05-15 | Microsoft Technology Licensing, Llc | Optimizing data transfers between heterogeneous memory arenas |
| US9329870B2 (en) | 2013-02-13 | 2016-05-03 | International Business Machines Corporation | Extensible execution unit interface architecture with multiple decode logic and multiple execution units |
| US9122613B2 (en) | 2013-03-07 | 2015-09-01 | Arm Limited | Prefetching of data and instructions in a data processing apparatus |
| US10133677B2 (en) | 2013-03-14 | 2018-11-20 | Nvidia Corporation | Opportunistic migration of memory pages in a unified virtual memory system |
| US9940286B2 (en) | 2013-03-14 | 2018-04-10 | Nvidia Corporation | PCIE traffic tracking hardware in a unified virtual memory system |
| US9478066B2 (en) | 2013-03-14 | 2016-10-25 | Nvidia Corporation | Consistent vertex snapping for variable resolution rendering |
| US9525586B2 (en) | 2013-03-15 | 2016-12-20 | Intel Corporation | QoS based binary translation and application streaming |
| US9153539B2 (en) | 2013-03-15 | 2015-10-06 | Nvidia Corporation | Ground-referenced single-ended signaling connected graphics processing unit multi-chip module |
| US9176895B2 (en) | 2013-03-16 | 2015-11-03 | Intel Corporation | Increased error correction for cache memories through adaptive replacement policies |
| KR20140126189A (ko) | 2013-04-22 | 2014-10-30 | 삼성전자주식회사 | Apparatus and method for supporting multiple execution modes of a processor |
| US9594595B2 (en) | 2013-05-17 | 2017-03-14 | Advanced Micro Devices, Inc. | Efficient processor load balancing using predication flags |
| GB2514397B (en) | 2013-05-23 | 2017-10-11 | Linear Algebra Tech Ltd | Corner detection |
| US9436600B2 (en) | 2013-06-11 | 2016-09-06 | Svic No. 28 New Technology Business Investment L.L.P. | Non-volatile memory storage for multi-channel memory system |
| US9378127B2 (en) | 2013-06-21 | 2016-06-28 | Intel Corporation | Dynamic memory page policy |
| US10963255B2 (en) | 2013-07-15 | 2021-03-30 | Texas Instruments Incorporated | Implied fence on stream open |
| US9264066B2 (en) | 2013-07-30 | 2016-02-16 | Apple Inc. | Type conversion using floating-point unit |
| US9946666B2 (en) | 2013-08-06 | 2018-04-17 | Nvidia Corporation | Coalescing texture access and load/store operations |
| US9092345B2 (en) | 2013-08-08 | 2015-07-28 | Arm Limited | Data processing systems |
| US9710380B2 (en) | 2013-08-29 | 2017-07-18 | Intel Corporation | Managing shared cache by multi-core processor |
| TWI676898B (zh) | 2013-12-09 | 2019-11-11 | 安然國際科技有限公司 | Method for operating a distributed memory disk cluster storage system |
| US9461667B2 (en) | 2013-12-30 | 2016-10-04 | Samsung Electronics Co., Ltd. | Rounding injection scheme for floating-point to integer conversion |
| US20150193358A1 (en) | 2014-01-06 | 2015-07-09 | Nvidia Corporation | Prioritized Memory Reads |
| US10528357B2 (en) | 2014-01-17 | 2020-01-07 | L3 Technologies, Inc. | Web-based recorder configuration utility |
| US20150205724A1 (en) | 2014-01-20 | 2015-07-23 | Honeywell International Inc. | System and method of cache partitioning for processors with limited cached memory pools |
| KR102100161B1 (ko) | 2014-02-04 | 2020-04-14 | 삼성전자주식회사 | GPU data caching method and data processing system therefor |
| WO2015119610A1 (en) | 2014-02-06 | 2015-08-13 | Empire Technology Development, Llc | Server-client secret generation with cached data |
| US9275429B2 (en) | 2014-02-17 | 2016-03-01 | Qualcomm Incorporated | Device hang detection and recovery |
| KR20150106132A (ko) | 2014-03-11 | 2015-09-21 | 삼성전자주식회사 | Method and apparatus for controlling cache memory of an electronic device |
| US9720667B2 (en) | 2014-03-21 | 2017-08-01 | Intel Corporation | Automatic loop vectorization using hardware transactional memory |
| US20150268963A1 (en) | 2014-03-23 | 2015-09-24 | Technion Research & Development Foundation Ltd. | Execution of data-parallel programs on coarse-grained reconfigurable architecture hardware |
| US9436972B2 (en) | 2014-03-27 | 2016-09-06 | Intel Corporation | System coherency in a distributed graphics processor hierarchy |
| EP2937794B1 (en) | 2014-04-22 | 2016-08-17 | DataVard GmbH | Method and system for archiving digital data |
| US9690696B1 (en) | 2014-05-14 | 2017-06-27 | Western Digital Technologies, Inc. | Lifetime extension of memory for data storage system |
| US9673998B2 (en) | 2014-05-15 | 2017-06-06 | Futurewei Technologies, Inc. | Differential cache for representational state transfer (REST) API |
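| JP6248808B2 (ja) | 2014-05-22 | 2017-12-20 | 富士通株式会社 | Information processing apparatus, information processing system, control method for information processing apparatus, and control program for information processing apparatus |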
| KR102192956B1 (ko) | 2014-06-23 | 2020-12-18 | 삼성전자주식회사 | Display apparatus and control method thereof |
| US10061592B2 (en) | 2014-06-27 | 2018-08-28 | Samsung Electronics Co., Ltd. | Architecture and execution for efficient mixed precision computations in single instruction multiple data/thread (SIMD/T) devices |
| US20150378920A1 (en) | 2014-06-30 | 2015-12-31 | John G. Gierach | Graphics data pre-fetcher for last level caches |
| US9520192B2 (en) | 2014-06-30 | 2016-12-13 | Intel Corporation | Resistive memory write operation with merged reset |
| US10223333B2 (en) | 2014-08-29 | 2019-03-05 | Nvidia Corporation | Performing multi-convolution operations in a parallel processing system |
| JP2016057831A (ja) | 2014-09-09 | 2016-04-21 | 株式会社東芝 | Floating-point arithmetic device and information processing system |
| US10096086B2 (en) | 2014-09-10 | 2018-10-09 | Nvidia Corporation | Enhanced anti-aliasing by varying sample patterns spatially and/or temporally |
| KR102263326B1 (ko) | 2014-09-18 | 2021-06-09 | 삼성전자주식회사 | Graphics processing unit and graphics data processing method using the same |
| US20160092118A1 (en) | 2014-09-26 | 2016-03-31 | Intel Corporation | Memory write management in a computer system |
| US9983884B2 (en) | 2014-09-26 | 2018-05-29 | Intel Corporation | Method and apparatus for SIMD structured branching |
| US9928076B2 (en) | 2014-09-26 | 2018-03-27 | Intel Corporation | Method and apparatus for unstructured control flow for SIMD execution engine |
| JP2016091242A (ja) | 2014-10-31 | 2016-05-23 | 富士通株式会社 | Cache memory, method for accessing cache memory, and control program |
| US20160124709A1 (en) | 2014-11-04 | 2016-05-05 | International Business Machines Corporation | Fast, energy-efficient exponential computations in SIMD architectures |
| US10282227B2 (en) | 2014-11-18 | 2019-05-07 | Intel Corporation | Efficient preemption for graphics processors |
| US9491112B1 (en) | 2014-12-10 | 2016-11-08 | Amazon Technologies, Inc. | Allocating processor resources based on a task identifier |
| US10956617B2 (en) | 2014-12-12 | 2021-03-23 | Coresecure Technologies, Llc | Systems and methods for random fill caching and prefetching for secure cache memories |
| US9910785B2 (en) | 2014-12-14 | 2018-03-06 | Via Alliance Semiconductor Co., Ltd | Cache memory budgeted by ways based on memory access type |
| WO2016097812A1 (en) | 2014-12-14 | 2016-06-23 | Via Alliance Semiconductor Co., Ltd. | Cache memory budgeted by chunks based on memory access type |
| EP3129890B1 (en) | 2014-12-14 | 2019-08-14 | VIA Alliance Semiconductor Co., Ltd. | Set associative cache memory with heterogeneous replacement policy |
| FR3030846B1 (fr) * | 2014-12-23 | 2017-12-29 | Commissariat Energie Atomique | Semantic representation of the content of an image |
| US9766892B2 (en) | 2014-12-23 | 2017-09-19 | Intel Corporation | Method and apparatus for efficient execution of nested branches on a graphics processor unit |
| US9710228B2 (en) | 2014-12-29 | 2017-07-18 | Imagination Technologies Limited | Unified multiply unit |
| US9304835B1 (en) * | 2014-12-31 | 2016-04-05 | International Business Machines Corporation | Optimized system for analytics (graphs and sparse matrices) operations |
| US20170061279A1 (en) | 2015-01-14 | 2017-03-02 | Intel Corporation | Updating an artificial neural network using flexible fixed point representation |
| US9971686B2 (en) | 2015-02-23 | 2018-05-15 | Intel Corporation | Vector cache line write back processors, methods, systems, and instructions |
| US20160255169A1 (en) | 2015-02-27 | 2016-09-01 | Futurewei Technologies, Inc. | Method and system for smart object eviction for proxy cache |
| US10002455B2 (en) | 2015-04-20 | 2018-06-19 | Intel Corporation | Optimized depth buffer cache apparatus and method |
| US9626299B2 (en) | 2015-05-01 | 2017-04-18 | Intel Corporation | Changing a hash function based on a conflict ratio associated with cache sets |
| US10379865B1 (en) | 2015-05-20 | 2019-08-13 | Marvell International Ltd. | Selection of instructions to be issued |
| US10049322B2 (en) | 2015-05-21 | 2018-08-14 | Google Llc | Prefetching weights for use in a neural network processor |
| US9804666B2 (en) | 2015-05-26 | 2017-10-31 | Samsung Electronics Co., Ltd. | Warp clustering |
| US20160378465A1 (en) * | 2015-06-23 | 2016-12-29 | Intel Corporation | Efficient sparse array handling in a processor |
| GB2540761B (en) | 2015-07-23 | 2017-12-06 | Advanced Risc Mach Ltd | Cache usage estimation |
| KR20170014109A (ko) | 2015-07-29 | 2017-02-08 | 삼성전자주식회사 | Semiconductor memory device and memory system including the same |
| US20170039144A1 (en) | 2015-08-07 | 2017-02-09 | Intel Corporation | Loading data using sub-thread information in a processor |
| CN105068787A (zh) * | 2015-08-28 | 2015-11-18 | 华南理工大学 | Heterogeneous parallel computing method for sparse matrix-vector multiplication |
| US10423354B2 (en) | 2015-09-23 | 2019-09-24 | Advanced Micro Devices, Inc. | Selective data copying between memory modules |
| US10423411B2 (en) * | 2015-09-26 | 2019-09-24 | Intel Corporation | Data element comparison processors, methods, systems, and instructions |
| US10042749B2 (en) | 2015-11-10 | 2018-08-07 | International Business Machines Corporation | Prefetch insensitive transactional memory |
| US10387309B2 (en) | 2015-10-14 | 2019-08-20 | Elastifile Ltd. | High-performance distributed caching |
| KR101843243B1 (ko) * | 2015-10-30 | 2018-03-29 | 세종대학교산학협력단 | Computation method and computation apparatus that skip operations for operators having a zero value as an operand |
| US9558156B1 (en) * | 2015-11-24 | 2017-01-31 | International Business Machines Corporation | Sparse matrix multiplication using a single field programmable gate array module |
| US10061748B2 (en) * | 2015-12-11 | 2018-08-28 | Sap Se | Adaptive tile matrix representation and multiplication |
| CN106886429B (zh) | 2015-12-16 | 2020-11-06 | 华为技术有限公司 | Method and server for loading a driver |
| US20170177336A1 (en) | 2015-12-22 | 2017-06-22 | Intel Corporation | Hardware cancellation monitor for floating point operations |
| US9996320B2 (en) | 2015-12-23 | 2018-06-12 | Intel Corporation | Fused multiply-add (FMA) low functional unit |
| KR102604737B1 (ko) | 2016-01-11 | 2023-11-22 | 삼성전자주식회사 | Method and apparatus for generating an acceleration structure |
| US10762164B2 (en) * | 2016-01-20 | 2020-09-01 | Cambricon Technologies Corporation Limited | Vector and matrix computing device |
| US20170214930A1 (en) | 2016-01-26 | 2017-07-27 | Sandia Corporation | Gpu-assisted lossless data compression |
| WO2017130201A1 (en) | 2016-01-28 | 2017-08-03 | Subply Solutions Ltd. | Method and system for providing audio content |
| US9898441B2 (en) * | 2016-02-05 | 2018-02-20 | Google Llc | Matrix processing apparatus |
| US10846362B2 (en) * | 2016-03-09 | 2020-11-24 | Nec Corporation | Information processing apparatus, information processing method, data structure and program |
| US9778871B1 (en) | 2016-03-27 | 2017-10-03 | Qualcomm Incorporated | Power-reducing memory subsystem having a system cache and local resource management |
| CN107315718B (zh) * | 2016-04-26 | 2020-08-21 | 中科寒武纪科技股份有限公司 | Apparatus and method for performing vector inner product operations |
| US20170308800A1 (en) | 2016-04-26 | 2017-10-26 | Smokescreen Intelligence, LLC | Interchangeable Artificial Intelligence Perception Systems and Methods |
| US10509732B2 (en) | 2016-04-27 | 2019-12-17 | Advanced Micro Devices, Inc. | Selecting cache aging policy for prefetches based on cache test regions |
| CN107346148A (zh) | 2016-05-04 | 2017-11-14 | 杭州海存信息技术有限公司 | Simulation processor based on a backside look-up table |
| US9846579B1 (en) | 2016-06-13 | 2017-12-19 | Apple Inc. | Unified integer and floating-point compare circuitry |
| US10176099B2 (en) | 2016-07-11 | 2019-01-08 | Intel Corporation | Using data pattern to mark cache lines as invalid |
| JP6665720B2 (ja) | 2016-07-14 | 2020-03-13 | 富士通株式会社 | Information processing apparatus, compile program, compile method, and cache control method |
| US20180018266A1 (en) | 2016-07-18 | 2018-01-18 | Advanced Micro Devices, Inc. | Stride prefetcher for inconsistent strides |
| US10334334B2 (en) | 2016-07-22 | 2019-06-25 | Intel Corporation | Storage sled and techniques for a data center |
| US10229470B2 (en) * | 2016-08-05 | 2019-03-12 | Intel IP Corporation | Mechanism to accelerate graphics workloads in a multi-core computing architecture |
| US10528864B2 (en) | 2016-08-11 | 2020-01-07 | Nvidia Corporation | Sparse convolutional neural network accelerator |
| US10891538B2 (en) * | 2016-08-11 | 2021-01-12 | Nvidia Corporation | Sparse convolutional neural network accelerator |
| US10242311B2 (en) * | 2016-08-11 | 2019-03-26 | Vivante Corporation | Zero coefficient skipping convolution neural network engine |
| US10467195B2 (en) | 2016-09-06 | 2019-11-05 | Samsung Electronics Co., Ltd. | Adaptive caching replacement manager with dynamic updating granulates and partitions for shared flash-based storage system |
| US20180107602A1 (en) | 2016-10-13 | 2018-04-19 | Intel Corporation | Latency and Bandwidth Efficiency Improvement for Read Modify Write When a Read Operation is Requested to a Partially Modified Write Only Cacheline |
| US11315018B2 (en) | 2016-10-21 | 2022-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
| US10360163B2 (en) * | 2016-10-27 | 2019-07-23 | Google Llc | Exploiting input data sparsity in neural network compute units |
| US10216479B2 (en) | 2016-12-06 | 2019-02-26 | Arm Limited | Apparatus and method for performing arithmetic operations to accumulate floating-point numbers |
| CN106683036A (zh) | 2016-12-12 | 2017-05-17 | 中国航空工业集团公司西安航空计算技术研究所 | Frame buffer storage encoding method for efficient GPU rendering |
| US10452551B2 (en) | 2016-12-12 | 2019-10-22 | Intel Corporation | Programmable memory prefetcher for prefetching multiple cache lines based on data in a prefetch engine control register |
| KR102712155B1 (ko) | 2016-12-15 | 2024-09-30 | 삼성전자주식회사 | Method and apparatus for generating an acceleration structure |
| US20180173623A1 (en) | 2016-12-21 | 2018-06-21 | Qualcomm Incorporated | Reducing or avoiding buffering of evicted cache data from an uncompressed cache memory in a compressed memory system to avoid stalling write operations |
| US10521389B2 (en) | 2016-12-23 | 2019-12-31 | Ati Technologies Ulc | Method and apparatus for accessing non-volatile memory as byte addressable memory |
| US20180183577A1 (en) | 2016-12-28 | 2018-06-28 | Intel Corporation | Techniques for secure message authentication with unified hardware acceleration |
| US10558575B2 (en) | 2016-12-30 | 2020-02-11 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
| US10146738B2 (en) * | 2016-12-31 | 2018-12-04 | Intel Corporation | Hardware accelerator architecture for processing very-sparse and hyper-sparse matrix data |
| KR102637736B1 (ko) | 2017-01-04 | 2024-02-19 | 삼성전자주식회사 | Graphics processing method and system |
| US20180210836A1 (en) | 2017-01-24 | 2018-07-26 | Microsoft Technology Licensing, Llc | Thermal and reliability based cache slice migration |
| US10394719B2 (en) | 2017-01-25 | 2019-08-27 | Samsung Electronics Co., Ltd. | Refresh aware replacement policy for volatile memory cache |
| US11397687B2 (en) | 2017-01-25 | 2022-07-26 | Samsung Electronics Co., Ltd. | Flash-integrated high bandwidth memory appliance |
| US10430912B2 (en) | 2017-02-14 | 2019-10-01 | Qualcomm Incorporated | Dynamic shader instruction nullification for graphics processing |
| GB2560159B (en) | 2017-02-23 | 2019-12-25 | Advanced Risc Mach Ltd | Widening arithmetic in a data processing apparatus |
| US10409887B1 (en) * | 2017-02-28 | 2019-09-10 | Ambarella, Inc. | Generalized dot product for computer vision applications |
| KR102499396B1 (ko) * | 2017-03-03 | 2023-02-13 | 삼성전자 주식회사 | Neural network device and method of operating a neural network device |
| US10198369B2 (en) | 2017-03-24 | 2019-02-05 | Advanced Micro Devices, Inc. | Dynamic memory remapping to reduce row-buffer conflicts |
| US10209890B2 (en) | 2017-03-28 | 2019-02-19 | International Business Machines Corporation | Near memory accelerator |
| US10303602B2 (en) | 2017-03-31 | 2019-05-28 | Advanced Micro Devices, Inc. | Preemptive cache management policies for processing units |
| US10229059B2 (en) | 2017-03-31 | 2019-03-12 | Intel Corporation | Dynamic fill policy for a shared cache |
| US10503652B2 (en) | 2017-04-01 | 2019-12-10 | Intel Corporation | Sector cache for compression |
| US10423415B2 (en) | 2017-04-01 | 2019-09-24 | Intel Corporation | Hierarchical general register file (GRF) for execution block |
| US10304421B2 (en) | 2017-04-07 | 2019-05-28 | Intel Corporation | Apparatus and method for remote display and content protection in a virtualized graphics processing environment |
| US10861216B2 (en) | 2017-04-07 | 2020-12-08 | Intel Corporation | Ray tracing apparatus and method for memory access and register operations |
| US10346944B2 (en) * | 2017-04-09 | 2019-07-09 | Intel Corporation | Machine learning sparse computation mechanism |
| US20180300258A1 (en) | 2017-04-13 | 2018-10-18 | Futurewei Technologies, Inc. | Access rank aware cache replacement policy |
| US10824938B2 (en) * | 2017-04-24 | 2020-11-03 | Intel Corporation | Specialized fixed function hardware for efficient convolution |
| US10403003B2 (en) * | 2017-04-24 | 2019-09-03 | Intel Corporation | Compression mechanism |
| US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
| US10474458B2 (en) | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
| US10186011B2 (en) * | 2017-04-28 | 2019-01-22 | Intel Corporation | Programmable coarse grained and sparse matrix compute hardware with advanced scheduling |
| US10726514B2 (en) | 2017-04-28 | 2020-07-28 | Intel Corporation | Compute optimizations for low precision machine learning operations |
| US11488008B2 (en) | 2017-05-05 | 2022-11-01 | Intel Corporation | Hardware implemented point to point communication primitives for machine learning |
| US10776699B2 (en) * | 2017-05-05 | 2020-09-15 | Intel Corporation | Optimized compute hardware for machine learning operations |
| US10338919B2 (en) | 2017-05-08 | 2019-07-02 | Nvidia Corporation | Generalized acceleration of matrix multiply accumulate operations |
| US20180336136A1 (en) | 2017-05-17 | 2018-11-22 | Qualcomm Incorporated | Input/output-coherent Look-ahead Cache Access |
| TW202024961A (zh) * | 2017-05-17 | 2020-07-01 | 美商谷歌有限責任公司 | Low-latency matrix multiplication unit |
| US10102015B1 (en) | 2017-06-22 | 2018-10-16 | Microsoft Technology Licensing, Llc | Just in time GPU executed program cross compilation |
| US10282299B2 (en) | 2017-06-23 | 2019-05-07 | Cavium, Llc | Managing cache partitions based on cache usage information |
| US10969740B2 (en) | 2017-06-27 | 2021-04-06 | Nvidia Corporation | System and method for near-eye light field rendering for wide field of view interactive three-dimensional computer graphics |
| US10984049B2 (en) | 2017-06-27 | 2021-04-20 | Nvidia Corporation | Performing traversal stack compression |
| US10331558B2 (en) | 2017-07-28 | 2019-06-25 | Apple Inc. | Systems and methods for performing memory compression |
| US10614613B2 (en) | 2017-07-28 | 2020-04-07 | Nvidia Corporation | Reducing noise during rendering by performing parallel path space filtering utilizing hashing |
| US10990648B2 (en) | 2017-08-07 | 2021-04-27 | Intel Corporation | System and method for an optimized winograd convolution accelerator |
| US10545860B2 (en) | 2017-08-10 | 2020-01-28 | Samsung Electronics Co., Ltd. | Intelligent high bandwidth memory appliance |
| US10394456B2 (en) | 2017-08-23 | 2019-08-27 | Micron Technology, Inc. | On demand memory page size |
| US11232531B2 (en) | 2017-08-29 | 2022-01-25 | Intel Corporation | Method and apparatus for efficient loop processing in a graphics hardware front end |
| US10691572B2 (en) | 2017-08-30 | 2020-06-23 | Nvidia Corporation | Liveness as a factor to evaluate memory vulnerability to soft errors |
| US10725740B2 (en) * | 2017-08-31 | 2020-07-28 | Qualcomm Incorporated | Providing efficient multiplication of sparse matrices in matrix-processor-based devices |
| US10503507B2 (en) * | 2017-08-31 | 2019-12-10 | Nvidia Corporation | Inline data inspection for workload simplification |
| US10943171B2 (en) | 2017-09-01 | 2021-03-09 | Facebook, Inc. | Sparse neural network training optimization |
| US10503520B2 (en) | 2017-09-26 | 2019-12-10 | Intel Corporation | Automatic waking of power domains for graphics configuration requests |
| US10782904B2 (en) | 2017-09-28 | 2020-09-22 | Intel Corporation | Host computing arrangement, remote server arrangement, storage system and methods thereof |
| US10692244B2 (en) | 2017-10-06 | 2020-06-23 | Nvidia Corporation | Learning based camera pose estimation from images of an environment |
| US11222256B2 (en) * | 2017-10-17 | 2022-01-11 | Xilinx, Inc. | Neural network processing system having multiple processors and a neural network accelerator |
| US11568218B2 (en) | 2017-10-17 | 2023-01-31 | Xilinx, Inc. | Neural network processing system having host controlled kernel accelerators |
| GB2569271B (en) | 2017-10-20 | 2020-05-13 | Graphcore Ltd | Synchronization with a host processor |
| GB2569274B (en) | 2017-10-20 | 2020-07-15 | Graphcore Ltd | Synchronization amongst processor tiles |
| GB2569844B (en) | 2017-10-20 | 2021-01-06 | Graphcore Ltd | Sending data off-chip |
| GB2569098B (en) | 2017-10-20 | 2020-01-08 | Graphcore Ltd | Combining states of multiple threads in a multi-threaded processor |
| US11651223B2 (en) * | 2017-10-27 | 2023-05-16 | Baidu Usa Llc | Systems and methods for block-sparse recurrent neural networks |
| KR102414047B1 (ko) | 2017-10-30 | 2022-06-29 | 에스케이하이닉스 주식회사 | Integrated memory device and operating method thereof |
| US10762137B1 (en) | 2017-11-15 | 2020-09-01 | Amazon Technologies, Inc. | Page table search engine |
| US10762620B2 (en) | 2017-11-27 | 2020-09-01 | Nvidia Corporation | Deep-learning method for separating reflection and transmission images visible at a semi-reflective surface in a computer image of a real-world scene |
| US11977974B2 (en) | 2017-11-30 | 2024-05-07 | International Business Machines Corporation | Compression of fully connected / recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression |
| US11294810B2 (en) | 2017-12-12 | 2022-04-05 | Advanced Micro Devices, Inc. | Memory request throttling to constrain memory bandwidth utilization |
| US10579535B2 (en) | 2017-12-15 | 2020-03-03 | Intel Corporation | Defragmented and efficient micro-operation cache |
| EP3789871B1 (en) * | 2017-12-27 | 2023-06-07 | Cambricon Technologies Corporation Limited | Integrated circuit chip device |
| US10482156B2 (en) * | 2017-12-29 | 2019-11-19 | Facebook, Inc. | Sparsity-aware hardware accelerators |
| KR102533241B1 (ko) | 2018-01-25 | 2023-05-16 | 삼성전자주식회사 | Heterogeneous computing system configured to adaptively control cache coherency |
| US10970080B2 (en) * | 2018-02-08 | 2021-04-06 | Marvell Asia Pte, Ltd. | Systems and methods for programmable hardware architecture for machine learning |
| US11693627B2 (en) * | 2018-02-09 | 2023-07-04 | Deepmind Technologies Limited | Contiguous sparsity pattern neural networks |
| US10755201B2 (en) | 2018-02-14 | 2020-08-25 | Lucid Circuit, Inc. | Systems and methods for data collection and analysis at the edge |
| JP2019148969A (ja) * | 2018-02-27 | 2019-09-05 | 富士通株式会社 | Matrix operation device, matrix operation method, and matrix operation program |
| US20190278600A1 (en) | 2018-03-09 | 2019-09-12 | Nvidia Corporation | Tiled compressed sparse matrix format |
| US12481500B2 (en) | 2018-03-09 | 2025-11-25 | Nvidia Corporation | Accelerating linear algebra kernels for any processor architecture |
| US10678508B2 (en) | 2018-03-23 | 2020-06-09 | Amazon Technologies, Inc. | Accelerated quantized multiply-and-add operations |
| US10572568B2 (en) | 2018-03-28 | 2020-02-25 | Intel Corporation | Accelerator for sparse-dense matrix multiplication |
| WO2019197661A1 (en) | 2018-04-13 | 2019-10-17 | Koninklijke Kpn N.V. | Frame-level super-resolution-based video coding |
| US11010092B2 (en) | 2018-05-09 | 2021-05-18 | Micron Technology, Inc. | Prefetch signaling in memory system or sub-system |
| US10572409B1 (en) * | 2018-05-10 | 2020-02-25 | Xilinx, Inc. | Sparse matrix processing circuitry |
| US11269805B2 (en) | 2018-05-15 | 2022-03-08 | Intel Corporation | Signal pathways in multi-tile processors |
| GB2574060B (en) * | 2018-05-25 | 2022-11-23 | Myrtle Software Ltd | Processing matrix vector multiplication |
| US10838864B2 (en) | 2018-05-30 | 2020-11-17 | Advanced Micro Devices, Inc. | Prioritizing local and remote memory access in a non-uniform memory access architecture |
| US10699468B2 (en) | 2018-06-09 | 2020-06-30 | Adshir Ltd. | Method for non-planar specular reflections in hybrid ray tracing |
| US10620951B2 (en) | 2018-06-22 | 2020-04-14 | Intel Corporation | Matrix multiplication acceleration of sparse matrices using column folding and squeezing |
| US12099912B2 (en) * | 2018-06-22 | 2024-09-24 | Samsung Electronics Co., Ltd. | Neural processor |
| CN110795228B (zh) | 2018-08-03 | 2023-08-25 | 伊姆西Ip控股有限责任公司 | Method and article of manufacture for training a deep learning model, and computing system |
| EP3690679A4 (en) | 2018-08-06 | 2021-02-17 | Huawei Technologies Co., Ltd. | MATRIX PROCESSING PROCESS AND APPARATUS, AND LOGIC CIRCUIT |
| EP3608828A1 (de) * | 2018-08-09 | 2020-02-12 | Olympus Soft Imaging Solutions GmbH | Method for providing an evaluation means for at least one optical application system of a microscopic application technology |
| KR20200022118A (ko) | 2018-08-22 | 2020-03-03 | 에스케이하이닉스 주식회사 | Data storage device and operating method thereof |
| US20190042457A1 (en) | 2018-08-22 | 2019-02-07 | Intel Corporation | Cache (partition) size determination method and apparatus |
| US11833681B2 (en) * | 2018-08-24 | 2023-12-05 | Nvidia Corporation | Robotic control system |
| US10846241B2 (en) | 2018-08-29 | 2020-11-24 | Vmware, Inc. | Score-based cache admission and eviction |
| US11093248B2 (en) | 2018-09-10 | 2021-08-17 | International Business Machines Corporation | Prefetch queue allocation protection bubble in a processor |
| US10817426B2 (en) | 2018-09-24 | 2020-10-27 | Arm Limited | Prefetching techniques |
| US10769070B2 (en) | 2018-09-25 | 2020-09-08 | Arm Limited | Multiple stride prefetching |
| US20200098725A1 (en) | 2018-09-26 | 2020-03-26 | Intel Corporation | Semiconductor package or semiconductor package structure with dual-sided interposer and memory |
| US10853067B2 (en) | 2018-09-27 | 2020-12-01 | Intel Corporation | Computer processor for higher precision computations using a mixed-precision decomposition of operations |
| US11294626B2 (en) | 2018-09-27 | 2022-04-05 | Intel Corporation | Floating-point dynamic range expansion |
| EP3857387A4 (en) | 2018-09-28 | 2022-05-18 | INTEL Corporation | Translation lookaside buffer to implement adaptive page size |
| US11307863B1 (en) | 2018-10-08 | 2022-04-19 | Nvidia Corporation | Graphics processing unit systems for performing data analytics operations in data science |
| CN116541647A (zh) | 2018-10-09 | 2023-08-04 | 华为技术有限公司 | Operation accelerator, processing method, and related device |
| US11263529B2 (en) | 2018-10-10 | 2022-03-01 | Google Llc | Modifying machine learning models to improve locality |
| GB2578097B (en) | 2018-10-15 | 2021-02-17 | Advanced Risc Mach Ltd | Cache control circuitry and methods |
| US10768895B2 (en) * | 2018-11-08 | 2020-09-08 | Movidius Limited | Dot product calculators and methods of operating the same |
| US11366663B2 (en) | 2018-11-09 | 2022-06-21 | Intel Corporation | Systems and methods for performing 16-bit floating-point vector dot product instructions |
| US10963246B2 (en) | 2018-11-09 | 2021-03-30 | Intel Corporation | Systems and methods for performing 16-bit floating-point matrix dot product instructions |
| US20200175074A1 (en) | 2018-12-04 | 2020-06-04 | Vmware, Inc. | Tree structure aware cache eviction policy |
| GB2579590B (en) | 2018-12-04 | 2021-10-13 | Imagination Tech Ltd | Workload repetition redundancy |
| US11893470B2 (en) | 2018-12-06 | 2024-02-06 | MIPS Tech, LLC | Neural network processing using specialized data representation |
| US20200202195A1 (en) | 2018-12-06 | 2020-06-25 | MIPS Tech, LLC | Neural network processing using mixed-precision data representation |
| US11615307B2 (en) | 2018-12-06 | 2023-03-28 | MIPS Tech, LLC | Neural network data computation using mixed-precision |
| GB2580151B (en) | 2018-12-21 | 2021-02-24 | Graphcore Ltd | Identifying processing units in a processor |
| GB2580316B (en) | 2018-12-27 | 2021-02-24 | Graphcore Ltd | Instruction cache in a multi-threaded processor |
| US10832371B2 (en) | 2018-12-28 | 2020-11-10 | Intel Corporation | Unified architecture for BVH construction based on hardware pre-sorting and a parallel, reconfigurable clustering array |
| US10909741B2 (en) | 2018-12-28 | 2021-02-02 | Intel Corporation | Speculative execution of hit and intersection shaders on programmable ray tracing architectures |
| US10937225B2 (en) | 2018-12-28 | 2021-03-02 | Intel Corporation | Cell primitive for unstructured volume rendering |
| US11210100B2 (en) | 2019-01-08 | 2021-12-28 | Apple Inc. | Coprocessor operation bundling |
| US11550971B1 (en) | 2019-01-18 | 2023-01-10 | X Development Llc | Physics simulation on machine-learning accelerated hardware platforms |
| KR20200091623A (ko) * | 2019-01-23 | 2020-07-31 | 삼성전자주식회사 | Method and apparatus for performing convolution operations of a neural network based on the Winograd transform |
| US11106600B2 (en) | 2019-01-24 | 2021-08-31 | Advanced Micro Devices, Inc. | Cache replacement based on translation lookaside buffer evictions |
| US10725923B1 (en) | 2019-02-05 | 2020-07-28 | Arm Limited | Cache access detection and prediction |
| US11805109B1 (en) | 2019-02-25 | 2023-10-31 | Amazon Technologies, Inc. | Data transfer encryption offloading using session pairs |
| US10915461B2 (en) | 2019-03-05 | 2021-02-09 | International Business Machines Corporation | Multilevel cache eviction management |
| EP4270201A3 (en) | 2019-03-15 | 2024-01-31 | INTEL Corporation | Memory controller management techniques |
| WO2020190796A1 (en) | 2019-03-15 | 2020-09-24 | Intel Corporation | Systems and methods for cache optimization |
| EP3938888A1 (en) | 2019-03-15 | 2022-01-19 | INTEL Corporation | Systolic disaggregation within a matrix accelerator architecture |
| US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access |
| KR102151444B1 (ko) | 2019-04-11 | 2020-09-03 | 주식회사 실리콘아츠 | Ray tracing apparatus using MIMD-based T&I scheduling |
| US11036642B2 (en) | 2019-04-26 | 2021-06-15 | Intel Corporation | Architectural enhancements for computing systems having artificial intelligence logic disposed locally to memory |
| US11126404B2 (en) | 2019-05-20 | 2021-09-21 | Nxp B.V. | Random number generator using multiple entropy sources and a method for generating random numbers |
| US11675998B2 (en) | 2019-07-15 | 2023-06-13 | Meta Platforms Technologies, Llc | System and method for performing small channel count convolutions in energy-efficient input operand stationary accelerator |
| US11201838B2 (en) | 2019-09-25 | 2021-12-14 | Intel Corporation | System, apparatus and method for increasing efficiency of link communications |
| US11663746B2 (en) | 2019-11-15 | 2023-05-30 | Intel Corporation | Systolic arithmetic on sparse data |
| US11861761B2 (en) | 2019-11-15 | 2024-01-02 | Intel Corporation | Graphics processing unit processing and caching improvements |
| US11275561B2 (en) | 2019-12-12 | 2022-03-15 | International Business Machines Corporation | Mixed precision floating-point multiply-add operation |
| US11645145B2 (en) | 2019-12-16 | 2023-05-09 | Qualcomm Incorporated | Methods and apparatus to facilitate speculative page fault handling in a graphics processing unit |
| US11658922B2 (en) | 2020-08-31 | 2023-05-23 | Micron Technology, Inc. | Optional path ordering in packet-based network |
| US11392527B2 (en) | 2020-08-31 | 2022-07-19 | Micron Technology, Inc. | Ordered delivery of data packets based on type of path information in each packet |
| US12164924B2 (en) | 2020-09-25 | 2024-12-10 | Advanced Micro Devices, Inc. | Compression metadata assisted computation |
| CN112506567B (zh) | 2020-11-27 | 2022-11-04 | 海光信息技术股份有限公司 | Data reading method and data reading circuit |
| US12216734B2 (en) | 2020-12-23 | 2025-02-04 | Intel Corporation | Apparatus and method for conjugate transpose and multiply |
| US12190405B2 (en) | 2021-07-06 | 2025-01-07 | Intel Corporation | Direct memory writes by network interface of a graphics processing unit |
| US11775307B2 (en) | 2021-09-24 | 2023-10-03 | Apple Inc. | Systems and methods for synchronizing data processing in a cellular modem |
| EP4402554A1 (en) | 2021-10-19 | 2024-07-24 | Google Llc | Large-scale accelerator system energy performance optimization |
| CN114050884B (zh) | 2021-11-08 | 2023-05-12 | 重庆邮电大学 | Cross-network time synchronization method for the convergence of industrial wireless and TSN |
| US11941742B2 (en) | 2022-06-23 | 2024-03-26 | Apple Inc. | Tiled processor communication fabric |
| US20240111609A1 (en) | 2022-09-30 | 2024-04-04 | Intel Corporation | Synchronization utilizing local team barriers for thread team processing |
| CN115756384B (zh) | 2022-11-22 | 2024-05-17 | 海光信息技术股份有限公司 | Tensor computing unit and method of use, data processing apparatus and method of operation |
-
2020
- 2020-03-14 EP EP20718907.7A patent/EP3938888A1/en active Pending
- 2020-03-14 JP JP2021547450A patent/JP7408671B2/ja active Active
- 2020-03-14 KR KR1020217025888A patent/KR20210135999A/ko active Pending
- 2020-03-14 SG SG11202107290QA patent/SG11202107290QA/en unknown
- 2020-03-14 DK DK20718909.3T patent/DK3938890T3/da active
- 2020-03-14 CN CN202080014231.1A patent/CN113383310A/zh active Pending
- 2020-03-14 DE DE112020000846.0T patent/DE112020000846T5/de active Pending
- 2020-03-14 WO PCT/US2020/022846 patent/WO2020190808A1/en not_active Ceased
- 2020-03-14 BR BR112021016106A patent/BR112021016106A2/pt unknown
- 2020-03-14 PL PL20718909.3T patent/PL3938890T3/pl unknown
- 2020-03-14 EP EP25174786.1A patent/EP4571498A3/en active Pending
- 2020-03-14 EP EP20718909.3A patent/EP3938890B1/en active Active
- 2020-03-14 FI FIEP20718909.3T patent/FI3938890T3/fi active
- 2020-03-14 BR BR112021016138A patent/BR112021016138A2/pt unknown
- 2020-03-14 CN CN202110224132.2A patent/CN112905241B/zh active Active
- 2020-03-14 DE DE112020001249.2T patent/DE112020001249T5/de active Pending
- 2020-03-14 KR KR1020217025864A patent/KR102838677B1/ko active Active
- 2020-03-14 CN CN202110214543.3A patent/CN112905240B/zh active Active
- 2020-03-14 AU AU2020241262A patent/AU2020241262B2/en active Active
- 2020-03-14 CN CN202080004209.9A patent/CN112534404A/zh active Pending
- 2020-03-14 EP EP20718908.5A patent/EP3938889A1/en active Pending
- 2020-03-14 WO PCT/US2020/022847 patent/WO2020190809A1/en not_active Ceased
- 2020-03-14 CN CN202080004288.3A patent/CN112534405A/zh active Pending
- 2020-03-14 JP JP2021547288A patent/JP7494197B2/ja active Active
- 2020-03-14 WO PCT/US2020/022845 patent/WO2020190807A1/en not_active Ceased
- 2020-03-14 KR KR1020217025943A patent/KR20210136994A/ko active Pending
- 2020-03-14 ES ES20718909T patent/ES3041900T3/es active Active
- 2020-03-14 JP JP2021547452A patent/JP7423644B2/ja active Active
- 2020-10-06 US US17/064,427 patent/US11113784B2/en active Active
- 2020-12-15 US US17/122,905 patent/US11842423B2/en active Active
-
2021
- 2021-06-03 US US17/303,654 patent/US11676239B2/en active Active
-
2023
- 2023-05-02 US US18/310,688 patent/US12293431B2/en active Active
- 2023-12-07 US US18/532,245 patent/US12198222B2/en active Active
-
2024
- 2024-01-18 JP JP2024006026A patent/JP7717863B2/ja active Active
- 2024-12-03 US US18/967,123 patent/US20250166114A1/en active Pending
- 2024-12-03 US US18/967,172 patent/US20250104180A1/en active Pending
-
2025
- 2025-02-20 US US19/058,072 patent/US20250209564A1/en active Pending
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018125250A1 (en) | 2016-12-31 | 2018-07-05 | Intel Corporation | Systems, methods, and apparatuses for heterogeneous computing |
| WO2018213636A1 (en) | 2017-05-17 | 2018-11-22 | Google Llc | Performing matrix multiplication in hardware |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7408671B2 (ja) | | Architecture for block sparse operations on a systolic array |
| JP7631649B2 (ja) | | Graphics processor and graphics processing unit having dot product accumulate instruction for hybrid floating point format |
| EP4427188A1 (en) | | Combined denoising and upscaling network with importance sampling in a graphics environment |
| EP4109303A1 (en) | | Using sparsity metadata to reduce systolic array power consumption |
| US20230066626A1 (en) | | Temporally amortized supersampling using a mixed precision convolutional neural network |
| WO2023079323A1 (en) | | Temporally amortized supersampling using a kernel splatting network |
| US20230147063A1 (en) | | Motion vector refinement for temporally amortized supersampling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20220705 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 Effective date: 20220705 |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 Effective date: 20230531 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20230606 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20230626 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20231003 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20231107 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 Effective date: 20231121 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 Effective date: 20231220 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7408671 Country of ref document: JP Free format text: JAPANESE INTERMEDIATE CODE: R150 |