PL3637246T3 - Instructions and logic to perform floating-point and integer operations for machine learning - Google Patents
Instructions and logic to perform floating-point and integer operations for machine learning
- Publication number
- PL3637246T3
- Authority
- PL
- Poland
- Prior art keywords
- logic
- instructions
- point
- machine learning
- integer operations
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/57—Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
- G09G5/393—Arrangements for updating the contents of the bit-mapped memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
- G06F2207/3808—Details concerning the type of numbers or the way they are handled
- G06F2207/3812—Devices capable of handling different types of numbers
- G06F2207/3824—Accepting both fixed-point and floating-point numbers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30025—Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/3013—Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Nonlinear Science (AREA)
- Neurology (AREA)
- Computer Hardware Design (AREA)
- Multimedia (AREA)
- Image Processing (AREA)
- Image Generation (AREA)
- Advance Control (AREA)
- Complex Calculations (AREA)
- Image Analysis (AREA)
- Computer Graphics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Numerical Control (AREA)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762491699P | 2017-04-28 | 2017-04-28 | |
US15/787,129 US10474458B2 (en) | 2017-04-28 | 2017-10-18 | Instructions and logic to perform floating-point and integer operations for machine learning |
EP18164093.9A EP3396524A1 (en) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
EP19214143.0A EP3637246B1 (en) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
PL3637246T3 (pl) | 2022-07-04 |
Family
ID=61827531
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PL19214143T PL3637246T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
PL19214829.4T PL3637247T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
PL21195277.5T PL3937004T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic for performing floating-point and integer operations for machine learning |
PL21165109.6T PL3859519T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
Family Applications After (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PL19214829.4T PL3637247T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
PL21195277.5T PL3937004T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic for performing floating-point and integer operations for machine learning |
PL21165109.6T PL3859519T3 (pl) | 2017-04-28 | 2018-03-26 | Instructions and logic to perform floating-point and integer operations for machine learning |
Country Status (6)
Country | Link |
---|---|
US (8) | US10474458B2 (pl) |
EP (9) | EP3796154A1 (pl) |
CN (9) | CN115826916A (pl) |
ES (4) | ES2925598T3 (pl) |
PL (4) | PL3637246T3 (pl) |
TW (5) | TWI784372B (pl) |
Families Citing this family (96)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11037330B2 (en) * | 2017-04-08 | 2021-06-15 | Intel Corporation | Low rank matrix compression |
US10474458B2 (en) | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
DE102018110607A1 (de) | 2017-05-08 | 2018-11-08 | Nvidia Corporation | Generalized acceleration of matrix multiply accumulate operations |
US10338919B2 (en) | 2017-05-08 | 2019-07-02 | Nvidia Corporation | Generalized acceleration of matrix multiply accumulate operations |
CN108228696B (zh) * | 2017-08-31 | 2021-03-23 | 深圳市商汤科技有限公司 | Face image retrieval method and system, photographing device, and computer storage medium |
US11216250B2 (en) * | 2017-12-06 | 2022-01-04 | Advanced Micro Devices, Inc. | Dynamic, variable bit-width numerical precision on field-programmable gate arrays for machine learning tasks |
US11048644B1 (en) * | 2017-12-11 | 2021-06-29 | Amazon Technologies, Inc. | Memory mapping in an access device for non-volatile memory |
US10671147B2 (en) * | 2017-12-18 | 2020-06-02 | Facebook, Inc. | Dynamic power management for artificial intelligence hardware accelerators |
US10474430B2 (en) * | 2017-12-29 | 2019-11-12 | Facebook, Inc. | Mixed-precision processing elements, systems, and methods for computational models |
KR102637735B1 (ko) * | 2018-01-09 | 2024-02-19 | 삼성전자주식회사 | Neural network processing unit including an approximate multiplier and system on chip including the same |
US10311861B1 (en) * | 2018-01-15 | 2019-06-04 | Gyrfalcon Technology Inc. | System and method for encoding data in a voice recognition integrated circuit solution |
CN108388446A (zh) * | 2018-02-05 | 2018-08-10 | 上海寒武纪信息科技有限公司 | Operation module and method |
US11514306B1 (en) * | 2018-03-14 | 2022-11-29 | Meta Platforms, Inc. | Static memory allocation in neural networks |
US11216732B2 (en) * | 2018-05-31 | 2022-01-04 | Neuralmagic Inc. | Systems and methods for generation of sparse code for convolutional neural networks |
US10684824B2 (en) * | 2018-06-06 | 2020-06-16 | Nvidia Corporation | Stochastic rounding of numerical values |
US10803141B2 (en) * | 2018-07-05 | 2020-10-13 | Gsi Technology Inc. | In-memory stochastic rounder |
US10769310B2 (en) * | 2018-07-20 | 2020-09-08 | Nxp B.V. | Method for making a machine learning model more difficult to copy |
US10636484B2 (en) * | 2018-09-12 | 2020-04-28 | Winbond Electronics Corporation | Circuit and method for memory operation |
US11455766B2 (en) * | 2018-09-18 | 2022-09-27 | Advanced Micro Devices, Inc. | Variable precision computing system |
US10922203B1 (en) * | 2018-09-21 | 2021-02-16 | Nvidia Corporation | Fault injection architecture for resilient GPU computing |
US10853067B2 (en) | 2018-09-27 | 2020-12-01 | Intel Corporation | Computer processor for higher precision computations using a mixed-precision decomposition of operations |
US11468291B2 (en) | 2018-09-28 | 2022-10-11 | Nxp B.V. | Method for protecting a machine learning ensemble from copying |
US11243743B2 (en) * | 2018-10-18 | 2022-02-08 | Facebook, Inc. | Optimization of neural networks using hardware calculation efficiency and adjustment factors |
US11366663B2 (en) * | 2018-11-09 | 2022-06-21 | Intel Corporation | Systems and methods for performing 16-bit floating-point vector dot product instructions |
CN109710211B (zh) * | 2018-11-15 | 2021-03-19 | 珠海市杰理科技股份有限公司 | Floating-point data type conversion method and apparatus, storage medium, and computer device |
US11568235B2 (en) * | 2018-11-19 | 2023-01-31 | International Business Machines Corporation | Data driven mixed precision learning for neural networks |
US11449268B2 (en) * | 2018-11-20 | 2022-09-20 | Samsung Electronics Co., Ltd. | Deep solid state device (deep-SSD): a neural network based persistent data storage |
US11537853B1 (en) * | 2018-11-28 | 2022-12-27 | Amazon Technologies, Inc. | Decompression and compression of neural network data using different compression schemes |
CN111258641B (zh) * | 2018-11-30 | 2022-12-09 | 上海寒武纪信息科技有限公司 | Operation method, device, and related products |
CN109754084B (zh) * | 2018-12-29 | 2020-06-12 | 中科寒武纪科技股份有限公司 | Network structure processing method and device, and related products |
US10963219B2 (en) * | 2019-02-06 | 2021-03-30 | International Business Machines Corporation | Hybrid floating point representation for deep learning acceleration |
US11651192B2 (en) * | 2019-02-12 | 2023-05-16 | Apple Inc. | Compressed convolutional neural network models |
US11074100B2 (en) * | 2019-02-27 | 2021-07-27 | Micron Technology, Inc. | Arithmetic and logical operations in a multi-user network |
EP3938894B1 (en) | 2019-03-15 | 2023-08-30 | INTEL Corporation | Multi-tile memory management for detecting cross tile access, providing multi-tile inference scaling, and providing optimal page migration |
KR20210136994A (ko) | 2019-03-15 | 2021-11-17 | 인텔 코포레이션 | Systolic disaggregation within a matrix accelerator architecture |
US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access |
US10884736B1 (en) * | 2019-03-15 | 2021-01-05 | Cadence Design Systems, Inc. | Method and apparatus for a low energy programmable vector processing unit for neural networks backend processing |
US11768664B2 (en) | 2019-03-15 | 2023-09-26 | Advanced Micro Devices, Inc. | Processing unit with mixed precision operations |
US10853129B1 (en) * | 2019-03-19 | 2020-12-01 | Amazon Technologies, Inc. | Accelerator based inference service |
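CN111767980B (zh) * | 2019-04-02 | 2024-03-05 | 杭州海康威视数字技术股份有限公司 | Model optimization method, apparatus, and device |
CN110334801A (zh) * | 2019-05-09 | 2019-10-15 | 苏州浪潮智能科技有限公司 | Hardware acceleration method, apparatus, device, and system for a convolutional neural network |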
US11288040B2 (en) | 2019-06-07 | 2022-03-29 | Intel Corporation | Floating-point dot-product hardware with wide multiply-adder tree for machine learning accelerators |
FR3097993B1 (fr) * | 2019-06-25 | 2021-10-22 | Kalray | Dot product operator for floating-point numbers that performs correct rounding |
FR3097992B1 (fr) | 2019-06-25 | 2021-06-25 | Kalray | Fused multiply-add operator for mixed-precision floating-point numbers that performs correct rounding |
EP3991357A1 (en) * | 2019-06-25 | 2022-05-04 | Marvell Asia Pte, Ltd. | Automotive network switch with anomaly detection |
EP3764286A1 (fr) * | 2019-07-10 | 2021-01-13 | STMicroelectronics (Rousset) SAS | Method and computer tool for determining transfer functions between pairs of successive layers of a neural network |
CN110399972B (zh) * | 2019-07-22 | 2021-05-25 | 上海商汤智能科技有限公司 | Data processing method and apparatus, and electronic device |
US11704231B2 (en) * | 2019-07-26 | 2023-07-18 | Microsoft Technology Licensing, Llc | Techniques for conformance testing computational operations |
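CN112394997A (zh) * | 2019-08-13 | 2021-02-23 | 上海寒武纪信息科技有限公司 | Device and method for processing eight-bit integer to half-precision floating-point conversion instructions, and related products |
CN110503195A (zh) * | 2019-08-14 | 2019-11-26 | 北京中科寒武纪科技有限公司 | Method for executing tasks using an artificial intelligence processor and related products |
CN110598172B (zh) * | 2019-08-22 | 2022-10-25 | 瑞芯微电子股份有限公司 | Convolution operation method and circuit based on a CSA adder |
WO2021036905A1 (zh) * | 2019-08-27 | 2021-03-04 | 安徽寒武纪信息科技有限公司 | Data processing method and apparatus, computer device, and storage medium |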
US11842169B1 (en) * | 2019-09-25 | 2023-12-12 | Amazon Technologies, Inc. | Systolic multiply delayed accumulate processor architecture |
US20210089316A1 (en) * | 2019-09-25 | 2021-03-25 | Intel Corporation | Deep learning implementations using systolic arrays and fused operations |
US11663444B2 (en) | 2019-09-27 | 2023-05-30 | Microsoft Technology Licensing, Llc | Pipelined neural network processing with continuous and asynchronous updates |
US11676010B2 (en) * | 2019-10-14 | 2023-06-13 | Micron Technology, Inc. | Memory sub-system with a bus to transmit data for a machine learning operation and another bus to transmit host data |
CN110764733B (zh) * | 2019-10-15 | 2023-06-30 | 天津津航计算技术研究所 | FPGA-based device for generating random numbers of multiple distributions |
US11288220B2 (en) | 2019-10-18 | 2022-03-29 | Achronix Semiconductor Corporation | Cascade communications between FPGA tiles |
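CN112783520A (zh) * | 2019-11-04 | 2021-05-11 | 阿里巴巴集团控股有限公司 | Execution method and apparatus, electronic device, and storage medium |
CN110888623B (zh) * | 2019-11-25 | 2021-11-23 | 集美大学 | Data conversion method, multiplier, adder, terminal device, and storage medium |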
US11334317B2 (en) * | 2019-11-27 | 2022-05-17 | Core Concept Technologies Inc. | Information processing apparatus, program, and information processing method configured to handle a high-precision computer number |
US11816446B2 (en) * | 2019-11-27 | 2023-11-14 | Amazon Technologies, Inc. | Systolic array component combining multiple integer and floating-point data types |
US11467806B2 (en) | 2019-11-27 | 2022-10-11 | Amazon Technologies, Inc. | Systolic array including fused multiply accumulate with efficient prenormalization and extended dynamic range |
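TWI774110B (zh) * | 2019-11-29 | 2022-08-11 | 凌華科技股份有限公司 | System for a data distribution service with shared memory suitable for industrial automation equipment and operation method thereof |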
US11282192B2 (en) * | 2019-12-19 | 2022-03-22 | Varian Medical Systems International Ag | Training deep learning engines for radiotherapy treatment planning |
CN111186139B (zh) * | 2019-12-25 | 2022-03-15 | 西北工业大学 | Multi-level parallel slicing method for 3D printing models |
US11861492B1 (en) * | 2019-12-26 | 2024-01-02 | Cadence Design Systems, Inc. | Quantizing trained neural networks with removal of normalization |
CN111242293B (zh) * | 2020-01-13 | 2023-07-18 | 腾讯科技(深圳)有限公司 | Processing component, data processing method, and electronic device |
US11922292B2 (en) | 2020-01-27 | 2024-03-05 | Google Llc | Shared scratchpad memory with parallel load-store |
US20210241080A1 (en) * | 2020-02-05 | 2021-08-05 | Macronix International Co., Ltd. | Artificial intelligence accelerator and operation thereof |
US11360772B2 (en) | 2020-03-31 | 2022-06-14 | International Business Machines Corporation | Instruction sequence merging and splitting for optimized accelerator implementation |
JP6896306B1 (ja) * | 2020-04-13 | 2021-06-30 | LeapMind株式会社 | Neural network circuit, edge device, and neural network operation method |
CN111582465B (zh) * | 2020-05-08 | 2023-04-07 | 中国科学院上海高等研究院 | FPGA-based convolutional neural network acceleration processing system, method, and terminal |
US11232062B1 (en) | 2020-06-29 | 2022-01-25 | Amazon Technologies, Inc. | Parallelism within a systolic array using multiple accumulate busses |
US11113233B1 (en) | 2020-06-29 | 2021-09-07 | Amazon Technologies, Inc. | Multiple busses in a grouped systolic array |
US11308026B1 (en) | 2020-06-29 | 2022-04-19 | Amazon Technologies, Inc. | Multiple busses interleaved in a systolic array |
US11308027B1 (en) | 2020-06-29 | 2022-04-19 | Amazon Technologies, Inc. | Multiple accumulate busses in a systolic array |
US11422773B1 (en) | 2020-06-29 | 2022-08-23 | Amazon Technologies, Inc. | Multiple busses within a systolic array processing element |
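JP2022016795A (ja) * | 2020-07-13 | 2022-01-25 | 富士通株式会社 | Information processing device, information processing program, and information processing method |
CN111930342B (zh) * | 2020-09-15 | 2021-01-19 | 浙江大学 | Error-unbiased approximate multiplier for normalized floating-point numbers and implementation method thereof |
CN112784969B (zh) * | 2021-02-01 | 2024-05-14 | 东北大学 | Convolutional neural network accelerated learning method for image feature extraction |
CN112579519B (zh) * | 2021-03-01 | 2021-05-25 | 湖北芯擎科技有限公司 | Data operation circuit and processing chip |
TWI778537B (zh) * | 2021-03-05 | 2022-09-21 | 國立臺灣科技大學 | Dynamic design method for a neural network acceleration unit |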
US11880682B2 (en) | 2021-06-30 | 2024-01-23 | Amazon Technologies, Inc. | Systolic array with efficient input reduction and extended array performance |
CN113535638B (zh) * | 2021-07-20 | 2022-11-15 | 珠海市一微星科技有限公司 | Parallel operation acceleration system and operation method thereof |
CN113535637B (zh) * | 2021-07-20 | 2022-11-15 | 珠海市一微星科技有限公司 | Operation acceleration unit and operation method thereof |
US20230065528A1 (en) * | 2021-08-31 | 2023-03-02 | Samsung Electronics Co., Ltd. | Apparatus and method with multi-format data support |
KR20230063791A (ko) * | 2021-11-02 | 2023-05-09 | 리벨리온 주식회사 | Artificial intelligence core, artificial intelligence core system, and load/store method of the artificial intelligence core system |
US20230205488A1 (en) * | 2021-12-23 | 2023-06-29 | Samsung Electronics Co., Ltd. | Efficient circuit for neural network processing |
WO2023212390A1 (en) * | 2022-04-29 | 2023-11-02 | University Of Southern California | Neural network methods |
GB2621196A (en) * | 2022-08-01 | 2024-02-07 | Advanced Risc Mach Ltd | Broadcasting machine learning data |
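CN117910523A (zh) * | 2022-10-19 | 2024-04-19 | 联发科技股份有限公司 | Method and system for allocating scratchpad memory to heterogeneous devices |
CN117132450B (zh) * | 2023-10-24 | 2024-02-20 | 芯动微电子科技(武汉)有限公司 | Computing device and graphics processor capable of data sharing |
CN117492693B (zh) * | 2024-01-03 | 2024-03-22 | 沐曦集成电路(上海)有限公司 | Floating-point data processing system for a filter |
CN117850882B (zh) * | 2024-03-07 | 2024-05-24 | 北京壁仞科技开发有限公司 | Single-instruction multiple-thread processing device and method |
CN117931123B (zh) * | 2024-03-25 | 2024-06-14 | 中科亿海微电子科技(苏州)有限公司 | Low-power variable-precision embedded DSP hard core structure for FPGAs |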
Family Cites Families (190)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
BE568342A (pl) | 1957-06-07 | |||
US3872442A (en) * | 1972-12-14 | 1975-03-18 | Sperry Rand Corp | System for conversion between coded byte and floating point format |
US4476523A (en) | 1981-06-11 | 1984-10-09 | Data General Corporation | Fixed point and floating point computation units using commonly shared control fields |
US4852048A (en) * | 1985-12-12 | 1989-07-25 | Itt Corporation | Single instruction multiple data (SIMD) cellular array processing apparatus employing a common bus where a first number of bits manifest a first bus portion and a second number of bits manifest a second bus portion |
US4823260A (en) | 1987-11-12 | 1989-04-18 | Intel Corporation | Mixed-precision floating point operations from a single instruction opcode |
US5268856A (en) * | 1988-06-06 | 1993-12-07 | Applied Intelligent Systems, Inc. | Bit serial floating point parallel processing system and method |
JP2581236B2 (ja) | 1989-11-16 | 1997-02-12 | 三菱電機株式会社 | Data processing device |
JP2682232B2 (ja) * | 1990-11-21 | 1997-11-26 | 松下電器産業株式会社 | Floating-point arithmetic processing device |
US5450607A (en) * | 1993-05-17 | 1995-09-12 | Mips Technologies Inc. | Unified floating point and integer datapath for a RISC processor |
US5574928A (en) | 1993-10-29 | 1996-11-12 | Advanced Micro Devices, Inc. | Mixed integer/floating point processor core for a superscalar microprocessor with a plurality of operand buses for transferring operand segments |
US5627985A (en) * | 1994-01-04 | 1997-05-06 | Intel Corporation | Speculative and committed resource files in an out-of-order processor |
US5673407A (en) | 1994-03-08 | 1997-09-30 | Texas Instruments Incorporated | Data processor having capability to perform both floating point operations and memory access in response to a single instruction |
US5983257A (en) * | 1995-12-26 | 1999-11-09 | Intel Corporation | System for signal processing using multiply-add operations |
US5940311A (en) * | 1996-04-30 | 1999-08-17 | Texas Instruments Incorporated | Immediate floating-point operand reformatting in a microprocessor |
US5917741A (en) * | 1996-08-29 | 1999-06-29 | Intel Corporation | Method and apparatus for performing floating-point rounding operations for multiple precisions using incrementers |
JP3790307B2 (ja) * | 1996-10-16 | 2006-06-28 | 株式会社ルネサステクノロジ | Data processor and data processing system |
US5887160A (en) | 1996-12-10 | 1999-03-23 | Fujitsu Limited | Method and apparatus for communicating integer and floating point data over a shared data path in a single instruction pipeline processor |
US5880984A (en) * | 1997-01-13 | 1999-03-09 | International Business Machines Corporation | Method and apparatus for performing high-precision multiply-add calculations using independent multiply and add instruments |
US6078940A (en) | 1997-01-24 | 2000-06-20 | Texas Instruments Incorporated | Microprocessor with an instruction for multiply and left shift with saturate |
US5926406A (en) * | 1997-04-30 | 1999-07-20 | Hewlett-Packard, Co. | System and method for calculating floating point exponential values in a geometry accelerator |
US6253311B1 (en) * | 1997-11-29 | 2001-06-26 | Jp First Llc | Instruction set for bi-directional conversion and transfer of integer and floating point data |
US6049865A (en) * | 1997-12-18 | 2000-04-11 | Motorola, Inc. | Method and apparatus for implementing floating point projection instructions |
US6260008B1 (en) * | 1998-01-08 | 2001-07-10 | Sharp Kabushiki Kaisha | Method of and system for disambiguating syntactic word multiples |
US6591084B1 (en) * | 1998-04-27 | 2003-07-08 | General Dynamics Decision Systems, Inc. | Satellite based data transfer and delivery system |
US6728839B1 (en) | 1998-10-28 | 2004-04-27 | Cisco Technology, Inc. | Attribute based memory pre-fetching technique |
US6480872B1 (en) * | 1999-01-21 | 2002-11-12 | Sandcraft, Inc. | Floating-point and integer multiply-add and multiply-accumulate |
US7941647B2 (en) * | 1999-01-28 | 2011-05-10 | Ati Technologies Ulc | Computer for executing two instruction sets and adds a macroinstruction end marker for performing iterations after loop termination |
JP2002536763A (ja) * | 1999-02-12 | 2002-10-29 | エムアイピーエス テクノロジーズ, インコーポレイテッド | Processor having a compare extension of an instruction set architecture |
US6529928B1 (en) * | 1999-03-23 | 2003-03-04 | Silicon Graphics, Inc. | Floating-point adder performing floating-point and integer operations |
US6788738B1 (en) * | 1999-05-07 | 2004-09-07 | Xilinx, Inc. | Filter accelerator for a digital signal processor |
US7499053B2 (en) | 2000-06-19 | 2009-03-03 | Mental Images Gmbh | Real-time precision ray tracing |
US6678806B1 (en) | 2000-08-23 | 2004-01-13 | Chipwrights Design, Inc. | Apparatus and method for using tagged pointers for extract, insert and format operations |
US7127482B2 (en) * | 2001-11-19 | 2006-10-24 | Intel Corporation | Performance optimized approach for efficient downsampling operations |
US7225216B1 (en) * | 2002-07-09 | 2007-05-29 | Nvidia Corporation | Method and system for a floating point multiply-accumulator |
US7373369B2 (en) * | 2003-06-05 | 2008-05-13 | International Business Machines Corporation | Advanced execution of extended floating-point add operations in a narrow dataflow |
CN1584821A (zh) * | 2003-08-19 | 2005-02-23 | 中国科学院微电子中心 | Partitionable multiply-accumulate unit for parallel processing |
US7272624B2 (en) * | 2003-09-30 | 2007-09-18 | International Business Machines Corporation | Fused booth encoder multiplexer |
GB2409068A (en) | 2003-12-09 | 2005-06-15 | Advanced Risc Mach Ltd | Data element size control within parallel lanes of processing |
KR100800468B1 (ko) * | 2004-01-29 | 2008-02-01 | 삼성전자주식회사 | Hardware encryption/decryption apparatus and method for low-power, high-speed operation |
US8253750B1 (en) | 2004-02-14 | 2012-08-28 | Nvidia Corporation | Digital media processor |
US7873812B1 (en) | 2004-04-05 | 2011-01-18 | Tibet MIMAR | Method and system for efficient matrix multiplication in a SIMD processor architecture |
US7428566B2 (en) * | 2004-11-10 | 2008-09-23 | Nvidia Corporation | Multipurpose functional unit with multiply-add and format conversion pipeline |
US20060101244A1 (en) * | 2004-11-10 | 2006-05-11 | Nvidia Corporation | Multipurpose functional unit with combined integer and floating-point multiply-add pipeline |
US20060179092A1 (en) * | 2005-02-10 | 2006-08-10 | Schmookler Martin S | System and method for executing fixed point divide operations using a floating point multiply-add pipeline |
EP1889178A2 (en) * | 2005-05-13 | 2008-02-20 | Provost, Fellows and Scholars of the College of the Holy and Undivided Trinity of Queen Elizabeth near Dublin | A data processing system and method |
US8250348B2 (en) * | 2005-05-19 | 2012-08-21 | International Business Machines Corporation | Methods and apparatus for dynamically switching processor mode |
US20070030277A1 (en) | 2005-08-08 | 2007-02-08 | Via Technologies, Inc. | Method for processing vertex, triangle, and pixel graphics data packets |
US7659899B2 (en) | 2005-08-08 | 2010-02-09 | Via Technologies, Inc. | System and method to manage data processing stages of a logical graphics pipeline |
US20070074008A1 (en) * | 2005-09-28 | 2007-03-29 | Donofrio David D | Mixed mode floating-point pipeline with extended functions |
US8327115B2 (en) * | 2006-04-12 | 2012-12-04 | Soft Machines, Inc. | Plural matrices of execution units for processing matrices of row dependent instructions in single clock cycle in super or separate mode |
US8146066B2 (en) * | 2006-06-20 | 2012-03-27 | Google Inc. | Systems and methods for caching compute kernels for an application running on a parallel-processing computer system |
US7467280B2 (en) | 2006-07-05 | 2008-12-16 | International Business Machines Corporation | Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache |
US20080071851A1 (en) | 2006-09-20 | 2008-03-20 | Ronen Zohar | Instruction and logic for performing a dot-product operation |
US8122078B2 (en) | 2006-10-06 | 2012-02-21 | Calos Fund, LLC | Processor with enhanced combined-arithmetic capability |
US8781110B2 (en) * | 2007-06-30 | 2014-07-15 | Intel Corporation | Unified system architecture for elliptic-curve cryptography |
US7783859B2 (en) | 2007-07-12 | 2010-08-24 | Qnx Software Systems Gmbh & Co. Kg | Processing system implementing variable page size memory organization |
US20100281235A1 (en) * | 2007-11-17 | 2010-11-04 | Martin Vorbach | Reconfigurable floating-point and bit-level data processing unit |
US8106914B2 (en) * | 2007-12-07 | 2012-01-31 | Nvidia Corporation | Fused multiply-add functional unit |
KR20090071823A (ko) * | 2007-12-28 | 2009-07-02 | 한국과학기술원 | Multi-function arithmetic apparatus and method |
US9678775B1 (en) * | 2008-04-09 | 2017-06-13 | Nvidia Corporation | Allocating memory for local variables of a multi-threaded program for execution in a single-threaded environment |
US8633936B2 (en) | 2008-04-21 | 2014-01-21 | Qualcomm Incorporated | Programmable streaming processor with mixed precision instruction execution |
US8078833B2 (en) * | 2008-05-29 | 2011-12-13 | Axis Semiconductor, Inc. | Microprocessor with highly configurable pipeline and executional unit internal hierarchal structures, optimizable for different types of computational functions |
US7945768B2 (en) * | 2008-06-05 | 2011-05-17 | Motorola Mobility, Inc. | Method and apparatus for nested instruction looping using implicit predicates |
US8340280B2 (en) * | 2008-06-13 | 2012-12-25 | Intel Corporation | Using a single instruction multiple data (SIMD) instruction to speed up galois counter mode (GCM) computations |
US8219757B2 (en) | 2008-09-30 | 2012-07-10 | Intel Corporation | Apparatus and method for low touch cache management |
US8290882B2 (en) * | 2008-10-09 | 2012-10-16 | Microsoft Corporation | Evaluating decision trees on a GPU |
US8645634B1 (en) * | 2009-01-16 | 2014-02-04 | Nvidia Corporation | Zero-copy data sharing by cooperating asymmetric coprocessors |
US20100185816A1 (en) | 2009-01-21 | 2010-07-22 | Sauber William F | Multiple Cache Line Size |
US8266409B2 (en) | 2009-03-03 | 2012-09-11 | Qualcomm Incorporated | Configurable cache and method to configure same |
US8655937B1 (en) * | 2009-04-29 | 2014-02-18 | Nvidia Corporation | High precision integer division using low precision hardware operations and rounding techniques |
US8577950B2 (en) * | 2009-08-17 | 2013-11-05 | International Business Machines Corporation | Matrix multiplication operations with data pre-conditioning in a high performance computing architecture |
US8364739B2 (en) | 2009-09-30 | 2013-01-29 | International Business Machines Corporation | Sparse matrix-vector multiplication on graphics processor units |
US8984043B2 (en) * | 2009-12-23 | 2015-03-17 | Intel Corporation | Multiplying and adding matrices |
US8669990B2 (en) | 2009-12-31 | 2014-03-11 | Intel Corporation | Sharing resources between a CPU and GPU |
US20110208505A1 (en) | 2010-02-24 | 2011-08-25 | Advanced Micro Devices, Inc. | Assigning floating-point operations to a floating-point unit and an arithmetic logic unit |
US20110249744A1 (en) | 2010-04-12 | 2011-10-13 | Neil Bailey | Method and System for Video Processing Utilizing N Scalar Cores and a Single Vector Core |
US8812575B2 (en) * | 2010-07-06 | 2014-08-19 | Silminds, Llc, Egypt | Decimal floating-point square-root unit using Newton-Raphson iterations |
CN201927837U (zh) | 2010-08-10 | 2011-08-10 | 富士康(昆山)电脑接插件有限公司 | Connector module |
US20120059866A1 (en) * | 2010-09-03 | 2012-03-08 | Advanced Micro Devices, Inc. | Method and apparatus for performing floating-point division |
US8667042B2 (en) | 2010-09-24 | 2014-03-04 | Intel Corporation | Functional unit for vector integer multiply add instruction |
US8488055B2 (en) * | 2010-09-30 | 2013-07-16 | Apple Inc. | Flash synchronization using image sensor interface timing signal |
TWI428833B (zh) * | 2010-11-10 | 2014-03-01 | Via Tech Inc | Multi-threaded processor, instruction execution and synchronization method thereof, and computer program product thereof |
US8745111B2 (en) * | 2010-11-16 | 2014-06-03 | Apple Inc. | Methods and apparatuses for converting floating point representations |
CN101986264B (zh) * | 2010-11-25 | 2013-07-31 | 中国人民解放军国防科学技术大学 | Multifunctional floating-point multiply-add unit for SIMD vector microprocessors |
GB2488985A (en) | 2011-03-08 | 2012-09-19 | Advanced Risc Mach Ltd | Mixed size data processing operation with integrated operand conversion instructions |
US8667222B2 (en) | 2011-04-01 | 2014-03-04 | Intel Corporation | Bypass and insertion algorithms for exclusive last-level caches |
US9501392B1 (en) | 2011-05-12 | 2016-11-22 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Management of a non-volatile memory module |
CN102214160B (zh) * | 2011-07-08 | 2013-04-17 | 中国科学技术大学 | Single-precision matrix multiplication optimization method based on Loongson 3A |
US9727336B2 (en) * | 2011-09-16 | 2017-08-08 | International Business Machines Corporation | Fine-grained instruction enablement at sub-function granularity based on an indicated subrange of registers |
US20130099946A1 (en) | 2011-10-21 | 2013-04-25 | International Business Machines Corporation | Data Compression Utilizing Variable and Limited Length Codes |
US8935478B2 (en) | 2011-11-01 | 2015-01-13 | International Business Machines Corporation | Variable cache line size management |
US9021237B2 (en) * | 2011-12-20 | 2015-04-28 | International Business Machines Corporation | Low latency variable transfer network communicating variable written to source processing core variable register allocated to destination thread to destination processing core variable register allocated to source thread |
US9960917B2 (en) | 2011-12-22 | 2018-05-01 | Intel Corporation | Matrix multiply accumulate instruction |
CN114721721A (zh) * | 2011-12-23 | 2022-07-08 | 英特尔公司 | Apparatus and method for shuffling floating-point or integer values |
CN104011664B (zh) * | 2011-12-23 | 2016-12-28 | 英特尔公司 | Super multiply-add (super MADD) instruction with three scalar terms |
WO2013101120A1 (en) | 2011-12-29 | 2013-07-04 | Intel Corporation | Online learning based algorithms to increase retention and reuse of gpu-generated dynamic surfaces in outer-level caches |
EP2812802A4 (en) | 2012-02-08 | 2016-04-27 | Intel Corp | Dynamic CPU GPU load balancing using power |
US20130218938A1 (en) * | 2012-02-17 | 2013-08-22 | Qualcomm Incorporated | Floating-point adder with operand shifting based on a predicted exponent difference |
US9213523B2 (en) | 2012-06-29 | 2015-12-15 | Intel Corporation | Double rounded combined floating-point multiply and add |
US8892619B2 (en) * | 2012-07-24 | 2014-11-18 | The Board Of Trustees Of The Leland Stanford Junior University | Floating-point multiply-add unit using cascade design |
US9298456B2 (en) * | 2012-08-21 | 2016-03-29 | Apple Inc. | Mechanism for performing speculative predicated instructions |
US20140075163A1 (en) * | 2012-09-07 | 2014-03-13 | Paul N. Loewenstein | Load-monitor mwait |
US9582287B2 (en) * | 2012-09-27 | 2017-02-28 | Intel Corporation | Processor having multiple cores, shared core extension logic, and shared core extension utilization instructions |
US9152382B2 (en) * | 2012-10-31 | 2015-10-06 | Intel Corporation | Reducing power consumption in a fused multiply-add (FMA) unit responsive to input data values |
US11150721B2 (en) * | 2012-11-07 | 2021-10-19 | Nvidia Corporation | Providing hints to an execution unit to prepare for predicted subsequent arithmetic operations |
US9829956B2 (en) * | 2012-11-21 | 2017-11-28 | Nvidia Corporation | Approach to power reduction in floating-point operations |
US9183144B2 (en) | 2012-12-14 | 2015-11-10 | Intel Corporation | Power gating a portion of a cache memory |
US10289418B2 (en) | 2012-12-27 | 2019-05-14 | Nvidia Corporation | Cooperative thread array granularity context switch during trap handling |
US9317251B2 (en) * | 2012-12-31 | 2016-04-19 | Nvidia Corporation | Efficient correction of normalizer shift amount errors in fused multiply add operations |
US9298457B2 (en) * | 2013-01-22 | 2016-03-29 | Altera Corporation | SIMD instructions for data compression and decompression |
US9525586B2 (en) * | 2013-03-15 | 2016-12-20 | Intel Corporation | QoS based binary translation and application streaming |
GB2514397B (en) * | 2013-05-23 | 2017-10-11 | Linear Algebra Tech Ltd | Corner detection |
EP3005078A2 (en) | 2013-05-24 | 2016-04-13 | Coherent Logix Incorporated | Memory-network processor with programmable optimizations |
US9264066B2 (en) * | 2013-07-30 | 2016-02-16 | Apple Inc. | Type conversion using floating-point unit |
US9092345B2 (en) * | 2013-08-08 | 2015-07-28 | Arm Limited | Data processing systems |
US9710380B2 (en) | 2013-08-29 | 2017-07-18 | Intel Corporation | Managing shared cache by multi-core processor |
TWI676898B (zh) | 2013-12-09 | 2019-11-11 | 安然國際科技有限公司 | Operating method for a distributed memory disk cluster storage system |
US9465578B2 (en) * | 2013-12-13 | 2016-10-11 | Nvidia Corporation | Logic circuitry configurable to perform 32-bit or dual 16-bit floating-point operations |
US9461667B2 (en) * | 2013-12-30 | 2016-10-04 | Samsung Electronics Co., Ltd. | Rounding injection scheme for floating-point to integer conversion |
US20150193358A1 (en) | 2014-01-06 | 2015-07-09 | Nvidia Corporation | Prioritized Memory Reads |
US10528357B2 (en) | 2014-01-17 | 2020-01-07 | L3 Technologies, Inc. | Web-based recorder configuration utility |
US20150268963A1 (en) * | 2014-03-23 | 2015-09-24 | Technion Research & Development Foundation Ltd. | Execution of data-parallel programs on coarse-grained reconfigurable architecture hardware |
WO2015147895A1 (en) * | 2014-03-26 | 2015-10-01 | Intel Corporation | Three source operand floating point addition processors, methods, systems, and instructions |
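JP6248808B2 (ja) | 2014-05-22 | 2017-12-20 | 富士通株式会社 | Information processing apparatus, information processing system, control method for information processing apparatus, and control program for information processing apparatus |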
US10061592B2 (en) | 2014-06-27 | 2018-08-28 | Samsung Electronics Co., Ltd. | Architecture and execution for efficient mixed precision computations in single instruction multiple data/thread (SIMD/T) devices |
US9520192B2 (en) | 2014-06-30 | 2016-12-13 | Intel Corporation | Resistive memory write operation with merged reset |
US10032244B2 (en) * | 2014-08-21 | 2018-07-24 | Intel Corporation | Method and apparatus for implementing a nearest neighbor search on a graphics processing unit (GPU) |
US10223333B2 (en) * | 2014-08-29 | 2019-03-05 | Nvidia Corporation | Performing multi-convolution operations in a parallel processing system |
CN104407836B (zh) * | 2014-10-14 | 2017-05-31 | 中国航天科技集团公司第九研究院第七七一研究所 | Apparatus and method for cascaded multiply-accumulate operations using fixed-point multipliers |
JP2016091242A (ja) | 2014-10-31 | 2016-05-23 | 富士通株式会社 | Cache memory, method of accessing cache memory, and control program |
US20160124709A1 (en) * | 2014-11-04 | 2016-05-05 | International Business Machines Corporation | Fast, energy-efficient exponential computations in simd architectures |
CN104461449B (zh) * | 2014-11-14 | 2018-02-27 | 中国科学院数据与通信保护研究教育中心 | Method and apparatus for implementing large-integer multiplication based on vector instructions |
US10282227B2 (en) | 2014-11-18 | 2019-05-07 | Intel Corporation | Efficient preemption for graphics processors |
US9491112B1 (en) * | 2014-12-10 | 2016-11-08 | Amazon Technologies, Inc. | Allocating processor resources based on a task identifier |
WO2016097812A1 (en) | 2014-12-14 | 2016-06-23 | Via Alliance Semiconductor Co., Ltd. | Cache memory budgeted by chunks based on memory access type |
JP6207766B2 (ja) | 2014-12-14 | 2017-10-04 | ヴィア アライアンス セミコンダクター カンパニー リミテッド | Set-associative cache memory with heterogeneous replacement policy |
US9910785B2 (en) | 2014-12-14 | 2018-03-06 | Via Alliance Semiconductor Co., Ltd | Cache memory budgeted by ways based on memory access type |
US9928034B2 (en) * | 2014-12-17 | 2018-03-27 | Nvidia Corporation | Work-efficient, load-balanced, merge-based parallelized consumption of sequences of sequences |
US10297001B2 (en) * | 2014-12-26 | 2019-05-21 | Intel Corporation | Reduced power implementation of computer instructions |
US9710228B2 (en) * | 2014-12-29 | 2017-07-18 | Imagination Technologies Limited | Unified multiply unit |
US20170061279A1 (en) * | 2015-01-14 | 2017-03-02 | Intel Corporation | Updating an artificial neural network using flexible fixed point representation |
US10002455B2 (en) * | 2015-04-20 | 2018-06-19 | Intel Corporation | Optimized depth buffer cache apparatus and method |
US10262259B2 (en) * | 2015-05-08 | 2019-04-16 | Qualcomm Incorporated | Bit width selection for fixed point neural networks |
US9804666B2 (en) | 2015-05-26 | 2017-10-31 | Samsung Electronics Co., Ltd. | Warp clustering |
US20190073582A1 (en) * | 2015-09-23 | 2019-03-07 | Yi Yang | Apparatus and method for local quantization for convolutional neural networks (cnns) |
WO2017049592A1 (en) | 2015-09-25 | 2017-03-30 | Intel Corporation | Method and apparatus to improve shared memory efficiency |
US20170177336A1 (en) * | 2015-12-22 | 2017-06-22 | Intel Corporation | Hardware cancellation monitor for floating point operations |
US9996320B2 (en) * | 2015-12-23 | 2018-06-12 | Intel Corporation | Fused multiply-add (FMA) low functional unit |
US20170323042A1 (en) * | 2016-05-04 | 2017-11-09 | Chengdu Haicun Ip Technology Llc | Simulation Processor with Backside Look-Up Table |
US20170308800A1 (en) * | 2016-04-26 | 2017-10-26 | Smokescreen Intelligence, LLC | Interchangeable Artificial Intelligence Perception Systems and Methods |
US10509732B2 (en) | 2016-04-27 | 2019-12-17 | Advanced Micro Devices, Inc. | Selecting cache aging policy for prefetches based on cache test regions |
GB201607713D0 (en) * | 2016-05-03 | 2016-06-15 | Imagination Tech Ltd | Convolutional neural network |
US9846579B1 (en) * | 2016-06-13 | 2017-12-19 | Apple Inc. | Unified integer and floating-point compare circuitry |
JP6665720B2 (ja) | 2016-07-14 | 2020-03-13 | 富士通株式会社 | Information processing apparatus, compile program, compile method, and cache control method |
US10997496B2 (en) | 2016-08-11 | 2021-05-04 | Nvidia Corporation | Sparse convolutional neural network accelerator |
US10891538B2 (en) | 2016-08-11 | 2021-01-12 | Nvidia Corporation | Sparse convolutional neural network accelerator |
US10242311B2 (en) | 2016-08-11 | 2019-03-26 | Vivante Corporation | Zero coefficient skipping convolution neural network engine |
US11315018B2 (en) | 2016-10-21 | 2022-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
US10216479B2 (en) * | 2016-12-06 | 2019-02-26 | Arm Limited | Apparatus and method for performing arithmetic operations to accumulate floating-point numbers |
US20180183577A1 (en) * | 2016-12-28 | 2018-06-28 | Intel Corporation | Techniques for secure message authentication with unified hardware acceleration |
US10558575B2 (en) * | 2016-12-30 | 2020-02-11 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
GB2560159B (en) * | 2017-02-23 | 2019-12-25 | Advanced Risc Mach Ltd | Widening arithmetic in a data processing apparatus |
KR102499396B1 (ko) | 2017-03-03 | 2023-02-13 | 삼성전자 주식회사 | Neural network device and method of operating the neural network device |
US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
US10474458B2 (en) * | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
US10338919B2 (en) * | 2017-05-08 | 2019-07-02 | Nvidia Corporation | Generalized acceleration of matrix multiply accumulate operations |
US10969740B2 (en) | 2017-06-27 | 2021-04-06 | Nvidia Corporation | System and method for near-eye light field rendering for wide field of view interactive three-dimensional computer graphics |
US10394456B2 (en) | 2017-08-23 | 2019-08-27 | Micron Technology, Inc. | On demand memory page size |
US11232531B2 (en) | 2017-08-29 | 2022-01-25 | Intel Corporation | Method and apparatus for efficient loop processing in a graphics hardware front end |
US10503507B2 (en) | 2017-08-31 | 2019-12-10 | Nvidia Corporation | Inline data inspection for workload simplification |
US10725740B2 (en) | 2017-08-31 | 2020-07-28 | Qualcomm Incorporated | Providing efficient multiplication of sparse matrices in matrix-processor-based devices |
US11222256B2 (en) | 2017-10-17 | 2022-01-11 | Xilinx, Inc. | Neural network processing system having multiple processors and a neural network accelerator |
GB2569098B (en) | 2017-10-20 | 2020-01-08 | Graphcore Ltd | Combining states of multiple threads in a multi-threaded processor |
GB2569274B (en) | 2017-10-20 | 2020-07-15 | Graphcore Ltd | Synchronization amongst processor tiles |
GB2569271B (en) | 2017-10-20 | 2020-05-13 | Graphcore Ltd | Synchronization with a host processor |
GB2569844B (en) | 2017-10-20 | 2021-01-06 | Graphcore Ltd | Sending data off-chip |
US11977974B2 (en) | 2017-11-30 | 2024-05-07 | International Business Machines Corporation | Compression of fully connected / recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression |
US10678508B2 (en) | 2018-03-23 | 2020-06-09 | Amazon Technologies, Inc. | Accelerated quantized multiply-and-add operations |
CN111937401B (zh) | 2018-04-13 | 2022-08-16 | 皇家Kpn公司 | Method and apparatus for video coding based on block-level super-resolution |
US10769526B2 (en) * | 2018-04-24 | 2020-09-08 | Intel Corporation | Machine learning accelerator architecture |
US11269805B2 (en) | 2018-05-15 | 2022-03-08 | Intel Corporation | Signal pathways in multi-tile processors |
CN113190791A (zh) | 2018-08-06 | 2021-07-30 | 华为技术有限公司 | Matrix processing method, apparatus, and logic circuit |
KR20200022118A (ko) | 2018-08-22 | 2020-03-03 | 에스케이하이닉스 주식회사 | Data storage device and operating method thereof |
US11093248B2 (en) | 2018-09-10 | 2021-08-17 | International Business Machines Corporation | Prefetch queue allocation protection bubble in a processor |
US11294626B2 (en) | 2018-09-27 | 2022-04-05 | Intel Corporation | Floating-point dynamic range expansion |
GB2580151B (en) | 2018-12-21 | 2021-02-24 | Graphcore Ltd | Identifying processing units in a processor |
US10915461B2 (en) | 2019-03-05 | 2021-02-09 | International Business Machines Corporation | Multilevel cache eviction management |
KR20210136994A (ko) | 2019-03-15 | 2021-11-17 | 인텔 코포레이션 | Systolic disaggregation within a matrix accelerator architecture |
EP3938894B1 (en) | 2019-03-15 | 2023-08-30 | INTEL Corporation | Multi-tile memory management for detecting cross tile access, providing multi-tile inference scaling, and providing optimal page migration |
US11574239B2 (en) * | 2019-03-18 | 2023-02-07 | Microsoft Technology Licensing, Llc | Outlier quantization for training and inference |
-
2017
- 2017-10-18 US US15/787,129 patent/US10474458B2/en active Active
- 2017-11-21 US US15/819,152 patent/US10353706B2/en active Active
-
2018
- 2018-02-22 TW TW109144479A patent/TWI784372B/zh active
- 2018-02-22 TW TW112124508A patent/TWI834576B/zh active
- 2018-02-22 TW TW111130374A patent/TWI819748B/zh active
- 2018-02-22 TW TW107105949A patent/TWI760443B/zh active
- 2018-02-22 TW TW110127153A patent/TWI793685B/zh active
- 2018-03-26 EP EP20207059.5A patent/EP3796154A1/en active Pending
- 2018-03-26 EP EP23182458.2A patent/EP4242838A3/en active Pending
- 2018-03-26 PL PL19214143T patent/PL3637246T3/pl unknown
- 2018-03-26 EP EP21165109.6A patent/EP3859519B1/en active Active
- 2018-03-26 ES ES21165109T patent/ES2925598T3/es active Active
- 2018-03-26 EP EP19214829.4A patent/EP3637247B1/en active Active
- 2018-03-26 EP EP22210195.8A patent/EP4160387A1/en active Pending
- 2018-03-26 ES ES21195277T patent/ES2934080T3/es active Active
- 2018-03-26 ES ES19214143T patent/ES2915607T3/es active Active
- 2018-03-26 EP EP18164093.9A patent/EP3396524A1/en active Pending
- 2018-03-26 PL PL19214829.4T patent/PL3637247T3/pl unknown
- 2018-03-26 EP EP19214143.0A patent/EP3637246B1/en active Active
- 2018-03-26 ES ES19214829T patent/ES2929797T3/es active Active
- 2018-03-26 EP EP22198967.6A patent/EP4130976A1/en active Pending
- 2018-03-26 PL PL21195277.5T patent/PL3937004T3/pl unknown
- 2018-03-26 EP EP21195277.5A patent/EP3937004B1/en active Active
- 2018-03-26 PL PL21165109.6T patent/PL3859519T3/pl unknown
- 2018-04-27 CN CN202211446828.0A patent/CN115826916A/zh active Pending
- 2018-04-27 CN CN202110250102.9A patent/CN112947893A/zh active Pending
- 2018-04-27 CN CN202011533036.8A patent/CN112527243A/zh active Pending
- 2018-04-27 CN CN202110906984.XA patent/CN113672197B/zh active Active
- 2018-04-27 CN CN202010498935.2A patent/CN111666066B/zh active Active
- 2018-04-27 CN CN201810394160.7A patent/CN108804077A/zh active Pending
- 2018-04-27 CN CN202210949334.8A patent/CN115185484A/zh active Pending
- 2018-04-27 CN CN202110256528.5A patent/CN112947894A/zh active Pending
- 2018-04-27 CN CN202310795238.7A patent/CN116755656A/zh active Pending
-
2019
- 2019-06-05 US US16/432,402 patent/US11169799B2/en active Active
-
2020
- 2020-12-09 US US17/115,989 patent/US20210124579A1/en active Pending
-
2021
- 2021-02-05 US US17/169,232 patent/US11080046B2/en active Active
- 2021-07-06 US US17/305,355 patent/US11360767B2/en active Active
-
2022
- 2022-06-07 US US17/834,482 patent/US11720355B2/en active Active
-
2023
- 2023-12-04 US US18/528,340 patent/US20240184572A1/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
PL3637247T3 (pl) | | Instructions and logic to perform floating-point and integer operations for machine learning |
GB2543429B (en) | | Machine learning for visual processing |
EP3394722A4 (en) | | Instructions and logic for index loading and group preparation operations |
EP3238046A4 (en) | | Instruction and logic to perform a fused single cycle increment-compare-jump |
EP3274817A4 (en) | | Instructions and logic to provide atomic range operations |
GB2527822B (en) | | Graphics processing |
EP3394728A4 (en) | | Instructions and logic for load indices and collective operations |
EP3391203A4 (en) | | Commands and logic for load indices and prefetch scatter operations |
GB201419805D0 (en) | | Work piece processing arrangements |
GB2546073B (en) | | Graphics processing |
EP3394742A4 (en) | | Instructions and logic for broadcasting and load index operations |
EP3391236A4 (en) | | Instructions and logic for obtaining operations of multiple vectorial elements |
IL261429B (en) | | A compound multiplication instruction |
GB201710873D0 (en) | | Graphics processing |
GB2547242B (en) | | Graphics processing |
GB2533224B (en) | | A hydraulic arrangement for a work machine and a process for a hydraulic arrangement |
EP3391234A4 (en) | | Instructions and logic for operations defining multiple vectorial elements |
SG11201704466QA (en) | | Methods, apparatus, instructions and logic to provide vector packed tuple cross-comparison functionality |
GB2525495B (en) | | Labelling machine and labelling methods |
EP3391235A4 (en) | | Instructions and logic for even and odd vector get operations |
EP3391201A4 (en) | | Command and logic for partial reduction operations |
EP3198401A4 (en) | | Instruction and logic for a vector format for processing computations |
GB201714752D0 (en) | | Graphics processing |
GB201721202D0 (en) | | Graphics processing |
IL272483A (en) | | An improved technique for computer visual learning |