PL3637246T3 - Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego - Google Patents

Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego

Info

Publication number
PL3637246T3
PL3637246T3 PL19214143T PL19214143T PL3637246T3 PL 3637246 T3 PL3637246 T3 PL 3637246T3 PL 19214143 T PL19214143 T PL 19214143T PL 19214143 T PL19214143 T PL 19214143T PL 3637246 T3 PL3637246 T3 PL 3637246T3
Authority
PL
Poland
Prior art keywords
logic
instructions
point
machine learning
integer operations
Prior art date
Application number
PL19214143T
Other languages
English (en)
Inventor
Himanshu Kaul
Mark A. Anders
Sanu K. Mathew
Anbang YAO
Joydeep RAY
Ping T. Tang
Michael S. Strickland
Xiaoming Chen
Tatiana Shpeisman
Abhishek R. APPU
Altug Koker
Kamal SINHA
Balaji Vembu
Eriko Nurvitadhi
Rajkishore Barik
Tsung-Han Lin
Vasanth Ranganathan
Sanjeev Jahagirdar
Nicolas C. Galoppo Von Borries
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation filed Critical Intel Corporation
Publication of PL3637246T3 publication Critical patent/PL3637246T3/pl

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001Arithmetic instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443Sum of products
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001Arithmetic instructions
    • G06F9/30014Arithmetic instructions with variable precision
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/39Control of the bit-mapped memory
    • G09G5/393Arrangements for updating the contents of the bit-mapped memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38Indexing scheme relating to groups G06F7/38 - G06F7/575
    • G06F2207/3804Details
    • G06F2207/3808Details concerning the type of numbers or the way they are handled
    • G06F2207/3812Devices capable of handling different types of numbers
    • G06F2207/3824Accepting both fixed-point and floating-point numbers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30025Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/3013Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Nonlinear Science (AREA)
  • Neurology (AREA)
  • Computer Hardware Design (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Generation (AREA)
  • Advance Control (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)
  • Computer Graphics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Numerical Control (AREA)
PL19214143T 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego PL3637246T3 (pl)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762491699P 2017-04-28 2017-04-28
US15/787,129 US10474458B2 (en) 2017-04-28 2017-10-18 Instructions and logic to perform floating-point and integer operations for machine learning
EP18164093.9A EP3396524A1 (en) 2017-04-28 2018-03-26 Instructions and logic to perform floating-point and integer operations for machine learning
EP19214143.0A EP3637246B1 (en) 2017-04-28 2018-03-26 Instructions and logic to perform floating-point and integer operations for machine learning

Publications (1)

Publication Number Publication Date
PL3637246T3 true PL3637246T3 (pl) 2022-07-04

Family

ID=61827531

Family Applications (4)

Application Number Title Priority Date Filing Date
PL19214143T PL3637246T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
PL19214829.4T PL3637247T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
PL21195277.5T PL3937004T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika przeprowadzania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
PL21165109.6T PL3859519T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego

Family Applications After (3)

Application Number Title Priority Date Filing Date
PL19214829.4T PL3637247T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
PL21195277.5T PL3937004T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika przeprowadzania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
PL21165109.6T PL3859519T3 (pl) 2017-04-28 2018-03-26 Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego

Country Status (6)

Country Link
US (8) US10474458B2 (pl)
EP (9) EP3796154A1 (pl)
CN (9) CN115826916A (pl)
ES (4) ES2925598T3 (pl)
PL (4) PL3637246T3 (pl)
TW (5) TWI784372B (pl)

Families Citing this family (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11037330B2 (en) * 2017-04-08 2021-06-15 Intel Corporation Low rank matrix compression
US10474458B2 (en) 2017-04-28 2019-11-12 Intel Corporation Instructions and logic to perform floating-point and integer operations for machine learning
DE102018110607A1 (de) 2017-05-08 2018-11-08 Nvidia Corporation Verallgemeinerte Beschleunigung von Matrix-Multiplikations-und-Akkumulations-Operationen
US10338919B2 (en) 2017-05-08 2019-07-02 Nvidia Corporation Generalized acceleration of matrix multiply accumulate operations
CN108228696B (zh) * 2017-08-31 2021-03-23 深圳市商汤科技有限公司 人脸图像检索方法和系统、拍摄装置、计算机存储介质
US11216250B2 (en) * 2017-12-06 2022-01-04 Advanced Micro Devices, Inc. Dynamic, variable bit-width numerical precision on field-programmable gate arrays for machine learning tasks
US11048644B1 (en) * 2017-12-11 2021-06-29 Amazon Technologies, Inc. Memory mapping in an access device for non-volatile memory
US10671147B2 (en) * 2017-12-18 2020-06-02 Facebook, Inc. Dynamic power management for artificial intelligence hardware accelerators
US10474430B2 (en) * 2017-12-29 2019-11-12 Facebook, Inc. Mixed-precision processing elements, systems, and methods for computational models
KR102637735B1 (ko) * 2018-01-09 2024-02-19 삼성전자주식회사 근사 곱셈기를 구비하는 뉴럴 네트워크 처리 장치 및 이를 포함하는 시스템온 칩
US10311861B1 (en) * 2018-01-15 2019-06-04 Gyrfalcon Technology Inc. System and method for encoding data in a voice recognition integrated circuit solution
CN108388446A (zh) * 2018-02-05 2018-08-10 上海寒武纪信息科技有限公司 运算模块以及方法
US11514306B1 (en) * 2018-03-14 2022-11-29 Meta Platforms, Inc. Static memory allocation in neural networks
US11216732B2 (en) * 2018-05-31 2022-01-04 Neuralmagic Inc. Systems and methods for generation of sparse code for convolutional neural networks
US10684824B2 (en) * 2018-06-06 2020-06-16 Nvidia Corporation Stochastic rounding of numerical values
US10803141B2 (en) * 2018-07-05 2020-10-13 Gsi Technology Inc. In-memory stochastic rounder
US10769310B2 (en) * 2018-07-20 2020-09-08 Nxp B.V. Method for making a machine learning model more difficult to copy
US10636484B2 (en) * 2018-09-12 2020-04-28 Winbond Electronics Corporation Circuit and method for memory operation
US11455766B2 (en) * 2018-09-18 2022-09-27 Advanced Micro Devices, Inc. Variable precision computing system
US10922203B1 (en) * 2018-09-21 2021-02-16 Nvidia Corporation Fault injection architecture for resilient GPU computing
US10853067B2 (en) 2018-09-27 2020-12-01 Intel Corporation Computer processor for higher precision computations using a mixed-precision decomposition of operations
US11468291B2 (en) 2018-09-28 2022-10-11 Nxp B.V. Method for protecting a machine learning ensemble from copying
US11243743B2 (en) * 2018-10-18 2022-02-08 Facebook, Inc. Optimization of neural networks using hardware calculation efficiency and adjustment factors
US11366663B2 (en) * 2018-11-09 2022-06-21 Intel Corporation Systems and methods for performing 16-bit floating-point vector dot product instructions
CN109710211B (zh) * 2018-11-15 2021-03-19 珠海市杰理科技股份有限公司 浮点数据类型转换方法、装置、存储介质及计算机设备
US11568235B2 (en) * 2018-11-19 2023-01-31 International Business Machines Corporation Data driven mixed precision learning for neural networks
US11449268B2 (en) * 2018-11-20 2022-09-20 Samsung Electronics Co., Ltd. Deep solid state device (deep-SSD): a neural network based persistent data storage
US11537853B1 (en) * 2018-11-28 2022-12-27 Amazon Technologies, Inc. Decompression and compression of neural network data using different compression schemes
CN111258641B (zh) * 2018-11-30 2022-12-09 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN109754084B (zh) * 2018-12-29 2020-06-12 中科寒武纪科技股份有限公司 网络结构的处理方法、装置及相关产品
US10963219B2 (en) * 2019-02-06 2021-03-30 International Business Machines Corporation Hybrid floating point representation for deep learning acceleration
US11651192B2 (en) * 2019-02-12 2023-05-16 Apple Inc. Compressed convolutional neural network models
US11074100B2 (en) * 2019-02-27 2021-07-27 Micron Technology, Inc. Arithmetic and logical operations in a multi-user network
EP3938894B1 (en) 2019-03-15 2023-08-30 INTEL Corporation Multi-tile memory management for detecting cross tile access, providing multi-tile inference scaling, and providing optimal page migration
KR20210136994A (ko) 2019-03-15 2021-11-17 인텔 코포레이션 매트릭스 가속기 아키텍처 내에서의 시스톨릭 분리
US11934342B2 (en) 2019-03-15 2024-03-19 Intel Corporation Assistance for hardware prefetch in cache access
US10884736B1 (en) * 2019-03-15 2021-01-05 Cadence Design Systems, Inc. Method and apparatus for a low energy programmable vector processing unit for neural networks backend processing
US11768664B2 (en) 2019-03-15 2023-09-26 Advanced Micro Devices, Inc. Processing unit with mixed precision operations
US10853129B1 (en) * 2019-03-19 2020-12-01 Amazon Technologies, Inc. Accelerator based inference service
CN111767980B (zh) * 2019-04-02 2024-03-05 杭州海康威视数字技术股份有限公司 模型优化方法、装置及设备
CN110334801A (zh) * 2019-05-09 2019-10-15 苏州浪潮智能科技有限公司 一种卷积神经网络的硬件加速方法、装置、设备及系统
US11288040B2 (en) 2019-06-07 2022-03-29 Intel Corporation Floating-point dot-product hardware with wide multiply-adder tree for machine learning accelerators
FR3097993B1 (fr) * 2019-06-25 2021-10-22 Kalray Opérateur de produit scalaire de nombres à virgule flottante réalisant un arrondi correct
FR3097992B1 (fr) 2019-06-25 2021-06-25 Kalray Opérateur d’addition et multiplication fusionnées pour nombres à virgule flottante de précision mixte réalisant un arrondi correct
EP3991357A1 (en) * 2019-06-25 2022-05-04 Marvell Asia Pte, Ltd. Automotive network switch with anomaly detection
EP3764286A1 (fr) * 2019-07-10 2021-01-13 STMicroelectronics (Rousset) SAS Procédé et outil informatique de détermination de fonctions de transferts entre des paires de couches successives d'un réseau de neurones
CN110399972B (zh) * 2019-07-22 2021-05-25 上海商汤智能科技有限公司 数据处理方法、装置及电子设备
US11704231B2 (en) * 2019-07-26 2023-07-18 Microsoft Technology Licensing, Llc Techniques for conformance testing computational operations
CN112394997A (zh) * 2019-08-13 2021-02-23 上海寒武纪信息科技有限公司 八位整形转半精度浮点指令处理装置、方法及相关产品
CN110503195A (zh) * 2019-08-14 2019-11-26 北京中科寒武纪科技有限公司 利用人工智能处理器执行任务的方法及其相关产品
CN110598172B (zh) * 2019-08-22 2022-10-25 瑞芯微电子股份有限公司 一种基于csa加法器的卷积运算方法和电路
WO2021036905A1 (zh) * 2019-08-27 2021-03-04 安徽寒武纪信息科技有限公司 数据处理方法、装置、计算机设备和存储介质
US11842169B1 (en) * 2019-09-25 2023-12-12 Amazon Technologies, Inc. Systolic multiply delayed accumulate processor architecture
US20210089316A1 (en) * 2019-09-25 2021-03-25 Intel Corporation Deep learning implementations using systolic arrays and fused operations
US11663444B2 (en) 2019-09-27 2023-05-30 Microsoft Technology Licensing, Llc Pipelined neural network processing with continuous and asynchronous updates
US11676010B2 (en) * 2019-10-14 2023-06-13 Micron Technology, Inc. Memory sub-system with a bus to transmit data for a machine learning operation and another bus to transmit host data
CN110764733B (zh) * 2019-10-15 2023-06-30 天津津航计算技术研究所 一种基于fpga的多种分布随机数生成装置
US11288220B2 (en) 2019-10-18 2022-03-29 Achronix Semiconductor Corporation Cascade communications between FPGA tiles
CN112783520A (zh) * 2019-11-04 2021-05-11 阿里巴巴集团控股有限公司 执行方法、装置、电子设备及存储介质
CN110888623B (zh) * 2019-11-25 2021-11-23 集美大学 数据转换方法、乘法器、加法器、终端设备及存储介质
US11334317B2 (en) * 2019-11-27 2022-05-17 Core Concept Technologies Inc. Information processing apparatus, program, and information processing method configured to handle a high-precision computer number
US11816446B2 (en) * 2019-11-27 2023-11-14 Amazon Technologies, Inc. Systolic array component combining multiple integer and floating-point data types
US11467806B2 (en) 2019-11-27 2022-10-11 Amazon Technologies, Inc. Systolic array including fused multiply accumulate with efficient prenormalization and extended dynamic range
TWI774110B (zh) * 2019-11-29 2022-08-11 凌華科技股份有限公司 適於工業自動化設備之共享記憶體的資料分發服務之系統及其運作方法
US11282192B2 (en) * 2019-12-19 2022-03-22 Varian Medical Systems International Ag Training deep learning engines for radiotherapy treatment planning
CN111186139B (zh) * 2019-12-25 2022-03-15 西北工业大学 一种3d打印模型的多层次并行切片方法
US11861492B1 (en) * 2019-12-26 2024-01-02 Cadence Design Systems, Inc. Quantizing trained neural networks with removal of normalization
CN111242293B (zh) * 2020-01-13 2023-07-18 腾讯科技(深圳)有限公司 一种处理部件、数据处理的方法以及电子设备
US11922292B2 (en) 2020-01-27 2024-03-05 Google Llc Shared scratchpad memory with parallel load-store
US20210241080A1 (en) * 2020-02-05 2021-08-05 Macronix International Co., Ltd. Artificial intelligence accelerator and operation thereof
US11360772B2 (en) 2020-03-31 2022-06-14 International Business Machines Corporation Instruction sequence merging and splitting for optimized accelerator implementation
JP6896306B1 (ja) * 2020-04-13 2021-06-30 LeapMind株式会社 ニューラルネットワーク回路、エッジデバイスおよびニューラルネットワーク演算方法
CN111582465B (zh) * 2020-05-08 2023-04-07 中国科学院上海高等研究院 基于fpga的卷积神经网络加速处理系统、方法以及终端
US11232062B1 (en) 2020-06-29 2022-01-25 Amazon Technologies, Inc. Parallelism within a systolic array using multiple accumulate busses
US11113233B1 (en) 2020-06-29 2021-09-07 Amazon Technologies, Inc. Multiple busses in a grouped systolic array
US11308026B1 (en) 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple busses interleaved in a systolic array
US11308027B1 (en) 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple accumulate busses in a systolic array
US11422773B1 (en) 2020-06-29 2022-08-23 Amazon Technologies, Inc. Multiple busses within a systolic array processing element
JP2022016795A (ja) * 2020-07-13 2022-01-25 富士通株式会社 情報処理装置、情報処理プログラムおよび情報処理方法
CN111930342B (zh) * 2020-09-15 2021-01-19 浙江大学 一种针对规格化浮点数的误差无偏近似乘法器及其实现方法
CN112784969B (zh) * 2021-02-01 2024-05-14 东北大学 用于图像特征提取的卷积神经网络加速学习方法
CN112579519B (zh) * 2021-03-01 2021-05-25 湖北芯擎科技有限公司 数据运算电路和处理芯片
TWI778537B (zh) * 2021-03-05 2022-09-21 國立臺灣科技大學 神經網路加速單元的動態設計方法
US11880682B2 (en) 2021-06-30 2024-01-23 Amazon Technologies, Inc. Systolic array with efficient input reduction and extended array performance
CN113535638B (zh) * 2021-07-20 2022-11-15 珠海市一微星科技有限公司 一种并行运算加速系统及其运行方法
CN113535637B (zh) * 2021-07-20 2022-11-15 珠海市一微星科技有限公司 一种运算加速单元及其运行方法
US20230065528A1 (en) * 2021-08-31 2023-03-02 Samsung Electronics Co., Ltd. Apparatus and method with multi-format data support
KR20230063791A (ko) * 2021-11-02 2023-05-09 리벨리온 주식회사 인공지능 코어, 인공지능 코어 시스템 및 인공지능 코어 시스템의 로드/스토어 방법
US20230205488A1 (en) * 2021-12-23 2023-06-29 Samsung Electronics Co., Ltd. Efficient circuit for neural network processing
WO2023212390A1 (en) * 2022-04-29 2023-11-02 University Of Southern California Neural network methods
GB2621196A (en) * 2022-08-01 2024-02-07 Advanced Risc Mach Ltd Broadcasting machine learning data
CN117910523A (zh) * 2022-10-19 2024-04-19 联发科技股份有限公司 将暂存存储器分配给异构设备的方法和系统
CN117132450B (zh) * 2023-10-24 2024-02-20 芯动微电子科技(武汉)有限公司 一种可实现数据共享的计算装置和图形处理器
CN117492693B (zh) * 2024-01-03 2024-03-22 沐曦集成电路(上海)有限公司 一种用于滤波器的浮点数据处理系统
CN117850882B (zh) * 2024-03-07 2024-05-24 北京壁仞科技开发有限公司 单指令多线程的处理装置及方法
CN117931123B (zh) * 2024-03-25 2024-06-14 中科亿海微电子科技(苏州)有限公司 一种应用于fpga的低功耗可变精度嵌入式dsp硬核结构

Family Cites Families (190)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BE568342A (pl) 1957-06-07
US3872442A (en) * 1972-12-14 1975-03-18 Sperry Rand Corp System for conversion between coded byte and floating point format
US4476523A (en) 1981-06-11 1984-10-09 Data General Corporation Fixed point and floating point computation units using commonly shared control fields
US4852048A (en) * 1985-12-12 1989-07-25 Itt Corporation Single instruction multiple data (SIMD) cellular array processing apparatus employing a common bus where a first number of bits manifest a first bus portion and a second number of bits manifest a second bus portion
US4823260A (en) 1987-11-12 1989-04-18 Intel Corporation Mixed-precision floating point operations from a single instruction opcode
US5268856A (en) * 1988-06-06 1993-12-07 Applied Intelligent Systems, Inc. Bit serial floating point parallel processing system and method
JP2581236B2 (ja) 1989-11-16 1997-02-12 三菱電機株式会社 データ処理装置
JP2682232B2 (ja) * 1990-11-21 1997-11-26 松下電器産業株式会社 浮動小数点演算処理装置
US5450607A (en) * 1993-05-17 1995-09-12 Mips Technologies Inc. Unified floating point and integer datapath for a RISC processor
US5574928A (en) 1993-10-29 1996-11-12 Advanced Micro Devices, Inc. Mixed integer/floating point processor core for a superscalar microprocessor with a plurality of operand buses for transferring operand segments
US5627985A (en) * 1994-01-04 1997-05-06 Intel Corporation Speculative and committed resource files in an out-of-order processor
US5673407A (en) 1994-03-08 1997-09-30 Texas Instruments Incorporated Data processor having capability to perform both floating point operations and memory access in response to a single instruction
US5983257A (en) * 1995-12-26 1999-11-09 Intel Corporation System for signal processing using multiply-add operations
US5940311A (en) * 1996-04-30 1999-08-17 Texas Instruments Incorporated Immediate floating-point operand reformatting in a microprocessor
US5917741A (en) * 1996-08-29 1999-06-29 Intel Corporation Method and apparatus for performing floating-point rounding operations for multiple precisions using incrementers
JP3790307B2 (ja) * 1996-10-16 2006-06-28 株式会社ルネサステクノロジ データプロセッサ及びデータ処理システム
US5887160A (en) 1996-12-10 1999-03-23 Fujitsu Limited Method and apparatus for communicating integer and floating point data over a shared data path in a single instruction pipeline processor
US5880984A (en) * 1997-01-13 1999-03-09 International Business Machines Corporation Method and apparatus for performing high-precision multiply-add calculations using independent multiply and add instruments
US6078940A (en) 1997-01-24 2000-06-20 Texas Instruments Incorporated Microprocessor with an instruction for multiply and left shift with saturate
US5926406A (en) * 1997-04-30 1999-07-20 Hewlett-Packard, Co. System and method for calculating floating point exponential values in a geometry accelerator
US6253311B1 (en) * 1997-11-29 2001-06-26 Jp First Llc Instruction set for bi-directional conversion and transfer of integer and floating point data
US6049865A (en) * 1997-12-18 2000-04-11 Motorola, Inc. Method and apparatus for implementing floating point projection instructions
US6260008B1 (en) * 1998-01-08 2001-07-10 Sharp Kabushiki Kaisha Method of and system for disambiguating syntactic word multiples
US6591084B1 (en) * 1998-04-27 2003-07-08 General Dynamics Decision Systems, Inc. Satellite based data transfer and delivery system
US6728839B1 (en) 1998-10-28 2004-04-27 Cisco Technology, Inc. Attribute based memory pre-fetching technique
US6480872B1 (en) * 1999-01-21 2002-11-12 Sandcraft, Inc. Floating-point and integer multiply-add and multiply-accumulate
US7941647B2 (en) * 1999-01-28 2011-05-10 Ati Technologies Ulc Computer for executing two instruction sets and adds a macroinstruction end marker for performing iterations after loop termination
JP2002536763A (ja) * 1999-02-12 2002-10-29 エムアイピーエス テクノロジーズ, インコーポレイテッド 命令セット構造の比較拡張を有するプロセッサ
US6529928B1 (en) * 1999-03-23 2003-03-04 Silicon Graphics, Inc. Floating-point adder performing floating-point and integer operations
US6788738B1 (en) * 1999-05-07 2004-09-07 Xilinx, Inc. Filter accelerator for a digital signal processor
US7499053B2 (en) 2000-06-19 2009-03-03 Mental Images Gmbh Real-time precision ray tracing
US6678806B1 (en) 2000-08-23 2004-01-13 Chipwrights Design, Inc. Apparatus and method for using tagged pointers for extract, insert and format operations
US7127482B2 (en) * 2001-11-19 2006-10-24 Intel Corporation Performance optimized approach for efficient downsampling operations
US7225216B1 (en) * 2002-07-09 2007-05-29 Nvidia Corporation Method and system for a floating point multiply-accumulator
US7373369B2 (en) * 2003-06-05 2008-05-13 International Business Machines Corporation Advanced execution of extended floating-point add operations in a narrow dataflow
CN1584821A (zh) * 2003-08-19 2005-02-23 中国科学院微电子中心 并行处理的可分割的乘法累加单元
US7272624B2 (en) * 2003-09-30 2007-09-18 International Business Machines Corporation Fused booth encoder multiplexer
GB2409068A (en) 2003-12-09 2005-06-15 Advanced Risc Mach Ltd Data element size control within parallel lanes of processing
KR100800468B1 (ko) * 2004-01-29 2008-02-01 삼성전자주식회사 저전력 고속 동작을 위한 하드웨어 암호화/복호화 장치 및그 방법
US8253750B1 (en) 2004-02-14 2012-08-28 Nvidia Corporation Digital media processor
US7873812B1 (en) 2004-04-05 2011-01-18 Tibet MIMAR Method and system for efficient matrix multiplication in a SIMD processor architecture
US7428566B2 (en) * 2004-11-10 2008-09-23 Nvidia Corporation Multipurpose functional unit with multiply-add and format conversion pipeline
US20060101244A1 (en) * 2004-11-10 2006-05-11 Nvidia Corporation Multipurpose functional unit with combined integer and floating-point multiply-add pipeline
US20060179092A1 (en) * 2005-02-10 2006-08-10 Schmookler Martin S System and method for executing fixed point divide operations using a floating point multiply-add pipeline
EP1889178A2 (en) * 2005-05-13 2008-02-20 Provost, Fellows and Scholars of the College of the Holy and Undivided Trinity of Queen Elizabeth near Dublin A data processing system and method
US8250348B2 (en) * 2005-05-19 2012-08-21 International Business Machines Corporation Methods and apparatus for dynamically switching processor mode
US20070030277A1 (en) 2005-08-08 2007-02-08 Via Technologies, Inc. Method for processing vertex, triangle, and pixel graphics data packets
US7659899B2 (en) 2005-08-08 2010-02-09 Via Technologies, Inc. System and method to manage data processing stages of a logical graphics pipeline
US20070074008A1 (en) * 2005-09-28 2007-03-29 Donofrio David D Mixed mode floating-point pipeline with extended functions
US8327115B2 (en) * 2006-04-12 2012-12-04 Soft Machines, Inc. Plural matrices of execution units for processing matrices of row dependent instructions in single clock cycle in super or separate mode
US8146066B2 (en) * 2006-06-20 2012-03-27 Google Inc. Systems and methods for caching compute kernels for an application running on a parallel-processing computer system
US7467280B2 (en) 2006-07-05 2008-12-16 International Business Machines Corporation Method for reconfiguring cache memory based on at least analysis of heat generated during runtime, at least by associating an access bit with a cache line and associating a granularity bit with a cache line in level-2 cache
US20080071851A1 (en) 2006-09-20 2008-03-20 Ronen Zohar Instruction and logic for performing a dot-product operation
US8122078B2 (en) 2006-10-06 2012-02-21 Calos Fund, LLC Processor with enhanced combined-arithmetic capability
US8781110B2 (en) * 2007-06-30 2014-07-15 Intel Corporation Unified system architecture for elliptic-curve cryptography
US7783859B2 (en) 2007-07-12 2010-08-24 Qnx Software Systems Gmbh & Co. Kg Processing system implementing variable page size memory organization
US20100281235A1 (en) * 2007-11-17 2010-11-04 Martin Vorbach Reconfigurable floating-point and bit-level data processing unit
US8106914B2 (en) * 2007-12-07 2012-01-31 Nvidia Corporation Fused multiply-add functional unit
KR20090071823A (ko) * 2007-12-28 2009-07-02 한국과학기술원 다기능 연산장치 및 방법
US9678775B1 (en) * 2008-04-09 2017-06-13 Nvidia Corporation Allocating memory for local variables of a multi-threaded program for execution in a single-threaded environment
US8633936B2 (en) 2008-04-21 2014-01-21 Qualcomm Incorporated Programmable streaming processor with mixed precision instruction execution
US8078833B2 (en) * 2008-05-29 2011-12-13 Axis Semiconductor, Inc. Microprocessor with highly configurable pipeline and executional unit internal hierarchal structures, optimizable for different types of computational functions
US7945768B2 (en) * 2008-06-05 2011-05-17 Motorola Mobility, Inc. Method and apparatus for nested instruction looping using implicit predicates
US8340280B2 (en) * 2008-06-13 2012-12-25 Intel Corporation Using a single instruction multiple data (SIMD) instruction to speed up galois counter mode (GCM) computations
US8219757B2 (en) 2008-09-30 2012-07-10 Intel Corporation Apparatus and method for low touch cache management
US8290882B2 (en) * 2008-10-09 2012-10-16 Microsoft Corporation Evaluating decision trees on a GPU
US8645634B1 (en) * 2009-01-16 2014-02-04 Nvidia Corporation Zero-copy data sharing by cooperating asymmetric coprocessors
US20100185816A1 (en) 2009-01-21 2010-07-22 Sauber William F Multiple Cache Line Size
US8266409B2 (en) 2009-03-03 2012-09-11 Qualcomm Incorporated Configurable cache and method to configure same
US8655937B1 (en) * 2009-04-29 2014-02-18 Nvidia Corporation High precision integer division using low precision hardware operations and rounding techniques
US8577950B2 (en) * 2009-08-17 2013-11-05 International Business Machines Corporation Matrix multiplication operations with data pre-conditioning in a high performance computing architecture
US8364739B2 (en) 2009-09-30 2013-01-29 International Business Machines Corporation Sparse matrix-vector multiplication on graphics processor units
US8984043B2 (en) * 2009-12-23 2015-03-17 Intel Corporation Multiplying and adding matrices
US8669990B2 (en) 2009-12-31 2014-03-11 Intel Corporation Sharing resources between a CPU and GPU
US20110208505A1 (en) 2010-02-24 2011-08-25 Advanced Micro Devices, Inc. Assigning floating-point operations to a floating-point unit and an arithmetic logic unit
US20110249744A1 (en) 2010-04-12 2011-10-13 Neil Bailey Method and System for Video Processing Utilizing N Scalar Cores and a Single Vector Core
US8812575B2 (en) * 2010-07-06 2014-08-19 Silminds, Llc, Egypt Decimal floating-point square-root unit using Newton-Raphson iterations
CN201927837U (zh) 2010-08-10 2011-08-10 富士康(昆山)电脑接插件有限公司 连接器模组
US20120059866A1 (en) * 2010-09-03 2012-03-08 Advanced Micro Devices, Inc. Method and apparatus for performing floating-point division
US8667042B2 (en) 2010-09-24 2014-03-04 Intel Corporation Functional unit for vector integer multiply add instruction
US8488055B2 (en) * 2010-09-30 2013-07-16 Apple Inc. Flash synchronization using image sensor interface timing signal
TWI428833B (zh) * 2010-11-10 2014-03-01 Via Tech Inc 多執行緒處理器及其指令執行及同步方法及其電腦程式產品
US8745111B2 (en) * 2010-11-16 2014-06-03 Apple Inc. Methods and apparatuses for converting floating point representations
CN101986264B (zh) * 2010-11-25 2013-07-31 中国人民解放军国防科学技术大学 用于simd向量微处理器的多功能浮点乘加运算装置
GB2488985A (en) 2011-03-08 2012-09-19 Advanced Risc Mach Ltd Mixed size data processing operation with integrated operand conversion instructions
US8667222B2 (en) 2011-04-01 2014-03-04 Intel Corporation Bypass and insertion algorithms for exclusive last-level caches
US9501392B1 (en) 2011-05-12 2016-11-22 Avago Technologies General Ip (Singapore) Pte. Ltd. Management of a non-volatile memory module
CN102214160B (zh) * 2011-07-08 2013-04-17 中国科学技术大学 一种基于龙芯3a的单精度矩阵乘法优化方法
US9727336B2 (en) * 2011-09-16 2017-08-08 International Business Machines Corporation Fine-grained instruction enablement at sub-function granularity based on an indicated subrange of registers
US20130099946A1 (en) 2011-10-21 2013-04-25 International Business Machines Corporation Data Compression Utilizing Variable and Limited Length Codes
US8935478B2 (en) 2011-11-01 2015-01-13 International Business Machines Corporation Variable cache line size management
US9021237B2 (en) * 2011-12-20 2015-04-28 International Business Machines Corporation Low latency variable transfer network communicating variable written to source processing core variable register allocated to destination thread to destination processing core variable register allocated to source thread
US9960917B2 (en) 2011-12-22 2018-05-01 Intel Corporation Matrix multiply accumulate instruction
CN114721721A (zh) * 2011-12-23 2022-07-08 英特尔公司 用于混洗浮点或整数值的装置和方法
CN104011664B (zh) * 2011-12-23 2016-12-28 英特尔公司 使用三个标量项的超级乘加(超级madd)指令
WO2013101120A1 (en) 2011-12-29 2013-07-04 Intel Corporation Online learning based algorithms to increase retention and reuse of gpu-generated dynamic surfaces in outer-level caches
EP2812802A4 (en) 2012-02-08 2016-04-27 Intel Corp DYNAMIC CPU GPU LOAD BALANCING USING POWER
US20130218938A1 (en) * 2012-02-17 2013-08-22 Qualcomm Incorporated Floating-point adder with operand shifting based on a predicted exponent difference
US9213523B2 (en) 2012-06-29 2015-12-15 Intel Corporation Double rounded combined floating-point multiply and add
US8892619B2 (en) * 2012-07-24 2014-11-18 The Board Of Trustees Of The Leland Stanford Junior University Floating-point multiply-add unit using cascade design
US9298456B2 (en) * 2012-08-21 2016-03-29 Apple Inc. Mechanism for performing speculative predicated instructions
US20140075163A1 (en) * 2012-09-07 2014-03-13 Paul N. Loewenstein Load-monitor mwait
US9582287B2 (en) * 2012-09-27 2017-02-28 Intel Corporation Processor having multiple cores, shared core extension logic, and shared core extension utilization instructions
US9152382B2 (en) * 2012-10-31 2015-10-06 Intel Corporation Reducing power consumption in a fused multiply-add (FMA) unit responsive to input data values
US11150721B2 (en) * 2012-11-07 2021-10-19 Nvidia Corporation Providing hints to an execution unit to prepare for predicted subsequent arithmetic operations
US9829956B2 (en) * 2012-11-21 2017-11-28 Nvidia Corporation Approach to power reduction in floating-point operations
US9183144B2 (en) 2012-12-14 2015-11-10 Intel Corporation Power gating a portion of a cache memory
US10289418B2 (en) 2012-12-27 2019-05-14 Nvidia Corporation Cooperative thread array granularity context switch during trap handling
US9317251B2 (en) * 2012-12-31 2016-04-19 Nvidia Corporation Efficient correction of normalizer shift amount errors in fused multiply add operations
US9298457B2 (en) * 2013-01-22 2016-03-29 Altera Corporation SIMD instructions for data compression and decompression
US9525586B2 (en) * 2013-03-15 2016-12-20 Intel Corporation QoS based binary translation and application streaming
GB2514397B (en) * 2013-05-23 2017-10-11 Linear Algebra Tech Ltd Corner detection
EP3005078A2 (en) 2013-05-24 2016-04-13 Coherent Logix Incorporated Memory-network processor with programmable optimizations
US9264066B2 (en) * 2013-07-30 2016-02-16 Apple Inc. Type conversion using floating-point unit
US9092345B2 (en) * 2013-08-08 2015-07-28 Arm Limited Data processing systems
US9710380B2 (en) 2013-08-29 2017-07-18 Intel Corporation Managing shared cache by multi-core processor
TWI676898B (zh) 2013-12-09 2019-11-11 安然國際科技有限公司 分散式記憶體磁碟群集儲存系統運作方法
US9465578B2 (en) * 2013-12-13 2016-10-11 Nvidia Corporation Logic circuitry configurable to perform 32-bit or dual 16-bit floating-point operations
US9461667B2 (en) * 2013-12-30 2016-10-04 Samsung Electronics Co., Ltd. Rounding injection scheme for floating-point to integer conversion
US20150193358A1 (en) 2014-01-06 2015-07-09 Nvidia Corporation Prioritized Memory Reads
US10528357B2 (en) 2014-01-17 2020-01-07 L3 Technologies, Inc. Web-based recorder configuration utility
US20150268963A1 (en) * 2014-03-23 2015-09-24 Technion Research & Development Foundation Ltd. Execution of data-parallel programs on coarse-grained reconfigurable architecture hardware
WO2015147895A1 (en) * 2014-03-26 2015-10-01 Intel Corporation Three source operand floating point addition processors, methods, systems, and instructions
JP6248808B2 (ja) 2014-05-22 2017-12-20 富士通株式会社 情報処理装置、情報処理システム、情報処理装置の制御方法、及び、情報処理装置の制御プログラム
US10061592B2 (en) 2014-06-27 2018-08-28 Samsung Electronics Co., Ltd. Architecture and execution for efficient mixed precision computations in single instruction multiple data/thread (SIMD/T) devices
US9520192B2 (en) 2014-06-30 2016-12-13 Intel Corporation Resistive memory write operation with merged reset
US10032244B2 (en) * 2014-08-21 2018-07-24 Intel Corporation Method and apparatus for implementing a nearest neighbor search on a graphics processing unit (GPU)
US10223333B2 (en) * 2014-08-29 2019-03-05 Nvidia Corporation Performing multi-convolution operations in a parallel processing system
CN104407836B (zh) * 2014-10-14 2017-05-31 中国航天科技集团公司第九研究院第七七一研究所 利用定点乘法器进行级联乘累加运算的装置和方法
JP2016091242A (ja) 2014-10-31 2016-05-23 富士通株式会社 キャッシュメモリ、キャッシュメモリへのアクセス方法及び制御プログラム
US20160124709A1 (en) * 2014-11-04 2016-05-05 International Business Machines Corporation Fast, energy-efficient exponential computations in simd architectures
CN104461449B (zh) * 2014-11-14 2018-02-27 中国科学院数据与通信保护研究教育中心 基于向量指令的大整数乘法实现方法及装置
US10282227B2 (en) 2014-11-18 2019-05-07 Intel Corporation Efficient preemption for graphics processors
US9491112B1 (en) * 2014-12-10 2016-11-08 Amazon Technologies, Inc. Allocating processor resources based on a task identifier
WO2016097812A1 (en) 2014-12-14 2016-06-23 Via Alliance Semiconductor Co., Ltd. Cache memory budgeted by chunks based on memory access type
JP6207766B2 (ja) 2014-12-14 2017-10-04 ヴィア アライアンス セミコンダクター カンパニー リミテッド ヘテロジニアス置換ポリシーを用いるセット・アソシエイティブ・キャッシュ・メモリ
US9910785B2 (en) 2014-12-14 2018-03-06 Via Alliance Semiconductor Co., Ltd Cache memory budgeted by ways based on memory access type
US9928034B2 (en) * 2014-12-17 2018-03-27 Nvidia Corporation Work-efficient, load-balanced, merge-based parallelized consumption of sequences of sequences
US10297001B2 (en) * 2014-12-26 2019-05-21 Intel Corporation Reduced power implementation of computer instructions
US9710228B2 (en) * 2014-12-29 2017-07-18 Imagination Technologies Limited Unified multiply unit
US20170061279A1 (en) * 2015-01-14 2017-03-02 Intel Corporation Updating an artificial neural network using flexible fixed point representation
US10002455B2 (en) * 2015-04-20 2018-06-19 Intel Corporation Optimized depth buffer cache apparatus and method
US10262259B2 (en) * 2015-05-08 2019-04-16 Qualcomm Incorporated Bit width selection for fixed point neural networks
US9804666B2 (en) 2015-05-26 2017-10-31 Samsung Electronics Co., Ltd. Warp clustering
US20190073582A1 (en) * 2015-09-23 2019-03-07 Yi Yang Apparatus and method for local quantization for convolutional neural networks (cnns)
WO2017049592A1 (en) 2015-09-25 2017-03-30 Intel Corporation Method and apparatus to improve shared memory efficiency
US20170177336A1 (en) * 2015-12-22 2017-06-22 Intel Corporation Hardware cancellation monitor for floating point operations
US9996320B2 (en) * 2015-12-23 2018-06-12 Intel Corporation Fused multiply-add (FMA) low functional unit
US20170323042A1 (en) * 2016-05-04 2017-11-09 Chengdu Haicun Ip Technology Llc Simulation Processor with Backside Look-Up Table
US20170308800A1 (en) * 2016-04-26 2017-10-26 Smokescreen Intelligence, LLC Interchangeable Artificial Intelligence Perception Systems and Methods
US10509732B2 (en) 2016-04-27 2019-12-17 Advanced Micro Devices, Inc. Selecting cache aging policy for prefetches based on cache test regions
GB201607713D0 (en) * 2016-05-03 2016-06-15 Imagination Tech Ltd Convolutional neural network
US9846579B1 (en) * 2016-06-13 2017-12-19 Apple Inc. Unified integer and floating-point compare circuitry
JP6665720B2 (ja) 2016-07-14 2020-03-13 富士通株式会社 情報処理装置、コンパイルプログラム、コンパイル方法、およびキャッシュ制御方法
US10997496B2 (en) 2016-08-11 2021-05-04 Nvidia Corporation Sparse convolutional neural network accelerator
US10891538B2 (en) 2016-08-11 2021-01-12 Nvidia Corporation Sparse convolutional neural network accelerator
US10242311B2 (en) 2016-08-11 2019-03-26 Vivante Corporation Zero coefficient skipping convolution neural network engine
US11315018B2 (en) 2016-10-21 2022-04-26 Nvidia Corporation Systems and methods for pruning neural networks for resource efficient inference
US10216479B2 (en) * 2016-12-06 2019-02-26 Arm Limited Apparatus and method for performing arithmetic operations to accumulate floating-point numbers
US20180183577A1 (en) * 2016-12-28 2018-06-28 Intel Corporation Techniques for secure message authentication with unified hardware acceleration
US10558575B2 (en) * 2016-12-30 2020-02-11 Intel Corporation Processors, methods, and systems with a configurable spatial accelerator
GB2560159B (en) * 2017-02-23 2019-12-25 Advanced Risc Mach Ltd Widening arithmetic in a data processing apparatus
KR102499396B1 (ko) 2017-03-03 2023-02-13 삼성전자 주식회사 뉴럴 네트워크 장치 및 뉴럴 네트워크 장치의 동작 방법
US10409614B2 (en) 2017-04-24 2019-09-10 Intel Corporation Instructions having support for floating point and integer data types in the same register
US10474458B2 (en) * 2017-04-28 2019-11-12 Intel Corporation Instructions and logic to perform floating-point and integer operations for machine learning
US10338919B2 (en) * 2017-05-08 2019-07-02 Nvidia Corporation Generalized acceleration of matrix multiply accumulate operations
US10969740B2 (en) 2017-06-27 2021-04-06 Nvidia Corporation System and method for near-eye light field rendering for wide field of view interactive three-dimensional computer graphics
US10394456B2 (en) 2017-08-23 2019-08-27 Micron Technology, Inc. On demand memory page size
US11232531B2 (en) 2017-08-29 2022-01-25 Intel Corporation Method and apparatus for efficient loop processing in a graphics hardware front end
US10503507B2 (en) 2017-08-31 2019-12-10 Nvidia Corporation Inline data inspection for workload simplification
US10725740B2 (en) 2017-08-31 2020-07-28 Qualcomm Incorporated Providing efficient multiplication of sparse matrices in matrix-processor-based devices
US11222256B2 (en) 2017-10-17 2022-01-11 Xilinx, Inc. Neural network processing system having multiple processors and a neural network accelerator
GB2569098B (en) 2017-10-20 2020-01-08 Graphcore Ltd Combining states of multiple threads in a multi-threaded processor
GB2569274B (en) 2017-10-20 2020-07-15 Graphcore Ltd Synchronization amongst processor tiles
GB2569271B (en) 2017-10-20 2020-05-13 Graphcore Ltd Synchronization with a host processor
GB2569844B (en) 2017-10-20 2021-01-06 Graphcore Ltd Sending data off-chip
US11977974B2 (en) 2017-11-30 2024-05-07 International Business Machines Corporation Compression of fully connected / recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression
US10678508B2 (en) 2018-03-23 2020-06-09 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations
CN111937401B (zh) 2018-04-13 2022-08-16 皇家Kpn公司 基于块级超分辨率的视频编码的方法和装置
US10769526B2 (en) * 2018-04-24 2020-09-08 Intel Corporation Machine learning accelerator architecture
US11269805B2 (en) 2018-05-15 2022-03-08 Intel Corporation Signal pathways in multi-tile processors
CN113190791A (zh) 2018-08-06 2021-07-30 华为技术有限公司 矩阵的处理方法、装置及逻辑电路
KR20200022118A (ko) 2018-08-22 2020-03-03 에스케이하이닉스 주식회사 데이터 저장 장치 및 그 동작 방법
US11093248B2 (en) 2018-09-10 2021-08-17 International Business Machines Corporation Prefetch queue allocation protection bubble in a processor
US11294626B2 (en) 2018-09-27 2022-04-05 Intel Corporation Floating-point dynamic range expansion
GB2580151B (en) 2018-12-21 2021-02-24 Graphcore Ltd Identifying processing units in a processor
US10915461B2 (en) 2019-03-05 2021-02-09 International Business Machines Corporation Multilevel cache eviction management
KR20210136994A (ko) 2019-03-15 2021-11-17 인텔 코포레이션 매트릭스 가속기 아키텍처 내에서의 시스톨릭 분리
EP3938894B1 (en) 2019-03-15 2023-08-30 INTEL Corporation Multi-tile memory management for detecting cross tile access, providing multi-tile inference scaling, and providing optimal page migration
US11574239B2 (en) * 2019-03-18 2023-02-07 Microsoft Technology Licensing, Llc Outlier quantization for training and inference

Also Published As

Publication number Publication date
TWI784372B (zh) 2022-11-21
EP3859519B1 (en) 2022-05-25
EP3937004A1 (en) 2022-01-12
EP3396524A1 (en) 2018-10-31
CN115185484A (zh) 2022-10-14
TW202247188A (zh) 2022-12-01
EP3796154A1 (en) 2021-03-24
EP3637247A1 (en) 2020-04-15
ES2929797T3 (es) 2022-12-01
EP3637247B1 (en) 2022-08-17
CN112947894A (zh) 2021-06-11
TWI819748B (zh) 2023-10-21
EP3637246A1 (en) 2020-04-15
US20190369988A1 (en) 2019-12-05
EP4242838A2 (en) 2023-09-13
PL3637247T3 (pl) 2022-11-21
EP4160387A1 (en) 2023-04-05
TW202141513A (zh) 2021-11-01
CN111666066B (zh) 2021-11-09
US20180315398A1 (en) 2018-11-01
PL3937004T3 (pl) 2023-01-23
US20210124579A1 (en) 2021-04-29
TWI834576B (zh) 2024-03-01
ES2925598T3 (es) 2022-10-18
EP3859519A1 (en) 2021-08-04
US20220019431A1 (en) 2022-01-20
US10474458B2 (en) 2019-11-12
CN115826916A (zh) 2023-03-21
CN112527243A (zh) 2021-03-19
EP4130976A1 (en) 2023-02-08
EP4242838A3 (en) 2023-09-27
US20240184572A1 (en) 2024-06-06
PL3859519T3 (pl) 2022-09-05
CN113672197B (zh) 2024-07-02
CN113672197A (zh) 2021-11-19
US11360767B2 (en) 2022-06-14
TWI760443B (zh) 2022-04-11
US11169799B2 (en) 2021-11-09
TWI793685B (zh) 2023-02-21
EP3637246B1 (en) 2022-04-06
CN111666066A (zh) 2020-09-15
US11720355B2 (en) 2023-08-08
TW201839642A (zh) 2018-11-01
US20210182058A1 (en) 2021-06-17
ES2934080T3 (es) 2023-02-16
CN112947893A (zh) 2021-06-11
US10353706B2 (en) 2019-07-16
TW202123253A (zh) 2021-06-16
CN116755656A (zh) 2023-09-15
EP3937004B1 (en) 2022-10-05
US20230046506A1 (en) 2023-02-16
CN108804077A (zh) 2018-11-13
US20220357945A1 (en) 2022-11-10
ES2915607T3 (es) 2022-06-23
TW202343467A (zh) 2023-11-01
US11080046B2 (en) 2021-08-03
US20180315399A1 (en) 2018-11-01

Similar Documents

Publication Publication Date Title
PL3637247T3 (pl) Instrukcje i logika do wykonywania operacji zmiennoprzecinkowych i całkowitoliczbowych dla uczenia maszynowego
GB2543429B (en) Machine learning for visual processing
EP3394722A4 (en) INSTRUCTIONS AND LOGIC FOR INDEX LOADING AND GROUP PREPARATION OPERATIONS
EP3238046A4 (en) Instruction and logic to perform a fused single cycle increment-compare-jump
EP3274817A4 (en) Instructions and logic to provide atomic range operations
GB2527822B (en) Graphics processing
EP3394728A4 (en) INSTRUCTIONS AND LOGIC FOR LOAD INDICES AND COLLECTIVE OPERATIONS
EP3391203A4 (en) COMMANDS AND LOGIC FOR LOAD INDICES AND PREFETCH SCATTER OPERATIONS
GB201419805D0 (en) Work piece processing arrangements
GB2546073B (en) Graphics processing
EP3394742A4 (en) INSTRUCTIONS AND LOGIC FOR BROADCASTING AND LOAD INDEX OPERATIONS
EP3391236A4 (en) INSTRUCTIONS AND LOGIC FOR OBTAINING OPERATIONS OF MULTIPLE VECTORIAL ELEMENTS
IL261429B (en) A compound multiplication instruction
GB201710873D0 (en) Graphics processing
GB2547242B (en) Graphics processing
GB2533224B (en) A hydraulic arrangement for a work machine and a process for a hydraulic arrangement
EP3391234A4 (en) INSTRUCTIONS AND LOGIC FOR OPERATIONS DEFINING MULTIPLE VECTORIAL ELEMENTS
SG11201704466QA (en) Methods, apparatus, instructions and logic to provide vector packed tuple cross-comparison functionality
GB2525495B (en) Labelling machine and labelling methods
EP3391235A4 (en) INSTRUCTIONS AND LOGIC FOR JUST AND ODD VECTOR GET OPERATIONS
EP3391201A4 (en) COMMAND AND LOGIC FOR PARTIAL REDUCTION OPERATIONS
EP3198401A4 (en) Instruction and logic for a vector format for processing computations
GB201714752D0 (en) Graphics processing
GB201721202D0 (en) Graphics processing
IL272483A (en) An improved technique for computer visual learning