MX2022001501A - Tensor-based hardware accelerator including a scalar-processing unit. - Google Patents

Tensor-based hardware accelerator including a scalar-processing unit.

Info

Publication number
MX2022001501A
Authority
MX
Mexico
Prior art keywords
tensor
scalar
vectors
hardware accelerator
operations
Prior art date
Application number
MX2022001501A
Other languages
English (en)
Inventor
Steven Karl Reinhardt
Joseph Anthony Mayer II
Dan Zhang
Original Assignee
Microsoft Technology Licensing Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing Llc
Publication of MX2022001501A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8053 Vector processors
    • G06F15/8061 Details on data memory access
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file
    • G06F9/3013 Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
    • G06F9/3881 Arrangements for communication of instructions and data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30101 Special purpose registers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Advance Control (AREA)
  • Auxiliary Drives, Propulsion Controls, And Safety Devices (AREA)

Abstract

A computing system is described herein that expedites deep neural network (DNN) operations or other processing operations using a hardware accelerator. The hardware accelerator, in turn, includes a tensor-processing engine that works in conjunction with a scalar-processing unit (SPU). The tensor-processing engine handles various kinds of tensor-based operations required by the DNN, such as multiplying vectors by matrices, combining vectors with other vectors, transforming individual vectors, etc. The SPU performs scalar-based operations, such as forming the reciprocal of a scalar, generating the square root of a scalar, etc. According to one illustrative implementation, the computing system uses the same vector-based programmatic interface to interact with both the tensor-processing engine and the SPU.
MX2022001501A 2019-08-06 2020-06-10 Tensor-based hardware accelerator including a scalar-processing unit. MX2022001501A (es)
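
The shared vector-based interface described in the abstract can be illustrated with a minimal sketch. This is a toy software model, not the patented hardware or its instruction set: the class `ToyAccelerator`, its method names, and the helper `l2_normalize` are hypothetical. In particular, the convention that the SPU reads its scalar operand from element 0 of an input vector and broadcasts the result across an output vector is an assumption made here so that scalar operations fit the same vector-in, vector-out calling pattern as the tensor operations.

```python
import math
from typing import Callable, List

Vector = List[float]


class ToyAccelerator:
    """Hypothetical model: every operation consumes and produces vectors."""

    # --- tensor-processing engine (tensor-based operations) ---
    def mat_vec_mul(self, matrix: List[Vector], v: Vector) -> Vector:
        # Multiply a matrix (given as a list of rows) by a vector.
        return [sum(a * b for a, b in zip(row, v)) for row in matrix]

    def vec_mul(self, a: Vector, b: Vector) -> Vector:
        # Combine two vectors element by element.
        return [x * y for x, y in zip(a, b)]

    def vec_map(self, fn: Callable[[float], float], v: Vector) -> Vector:
        # Transform an individual vector element by element.
        return [fn(x) for x in v]

    # --- scalar-processing unit (scalar-based operations) ---
    def _scalar_op(self, fn: Callable[[float], float], v: Vector) -> Vector:
        # Assumed convention: the scalar operand is element 0 of the input
        # vector, and the scalar result is broadcast to every output element.
        result = fn(v[0])
        return [result] * len(v)

    def s_reciprocal(self, v: Vector) -> Vector:
        return self._scalar_op(lambda s: 1.0 / s, v)

    def s_sqrt(self, v: Vector) -> Vector:
        return self._scalar_op(math.sqrt, v)


def l2_normalize(acc: ToyAccelerator, v: Vector) -> Vector:
    """Scale v to unit length using only vector-typed calls."""
    n = len(v)
    squares = acc.vec_mul(v, v)
    # Multiplying by an all-ones matrix places the sum of squares
    # in every element of the result.
    ones = [[1.0] * n for _ in range(n)]
    total = acc.mat_vec_mul(ones, squares)
    # Scalar work (square root, then reciprocal) goes through the SPU,
    # but the caller still passes and receives vectors.
    inv_norm = acc.s_reciprocal(acc.s_sqrt(total))
    return acc.vec_mul(v, inv_norm)


if __name__ == "__main__":
    acc = ToyAccelerator()
    print(l2_normalize(acc, [3.0, 4.0]))  # approximately [0.6, 0.8]
```

Because the scalar results come back as vectors, the caller can chain tensor and scalar steps without changing data types, which is the practical benefit a common interface is meant to provide in this sketch.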

Applications Claiming Priority (2)

Application Number Publication Number Priority Date Filing Date Title
US16/533,237 US10997116B2 (en) 2019-08-06 2019-08-06 Tensor-based hardware accelerator including a scalar-processing unit
PCT/US2020/036873 WO2021025767A1 (en) 2019-08-06 2020-06-10 Tensor-based hardware accelerator including a scalar-processing unit

Publications (1)

Publication Number Publication Date
MX2022001501A (es) 2022-03-11

Family

ID=71899961

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
MX2022001501A MX2022001501A (es) 2019-08-06 2020-06-10 Tensor-based hardware accelerator including a scalar-processing unit.

Country Status (6)

Country Link
US (1) US10997116B2 (es)
EP (1) EP4010794A1 (es)
CN (1) CN114207579A (es)
CA (1) CA3146416A1 (es)
MX (1) MX2022001501A (es)
WO (1) WO2021025767A1 (es)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11481604B2 (en) * 2019-10-24 2022-10-25 Alibaba Group Holding Limited Apparatus and method for neural network processing
US11455144B1 (en) * 2019-11-21 2022-09-27 Xilinx, Inc. Softmax calculation and architecture using a modified coordinate rotation digital computer (CORDIC) approach
US11734214B2 (en) * 2021-02-01 2023-08-22 Microsoft Technology Licensing, Llc Semi-programmable and reconfigurable co-accelerator for a deep neural network with normalization or non-linearity
EP4285286A1 (en) * 2021-02-01 2023-12-06 Microsoft Technology Licensing, LLC Semi-programmable and reconfigurable co-accelerator for a deep neural network with normalization or non-linearity
GB2608986B (en) * 2021-06-28 2024-01-24 Imagination Tech Ltd Implementation of Argmax or Argmin in hardware
EP4120142A1 (en) * 2021-06-28 2023-01-18 Imagination Technologies Limited Implementation of argmax or argmin in hardware
GB2608591B (en) * 2021-06-28 2024-01-24 Imagination Tech Ltd Implementation of pooling and unpooling or reverse pooling in hardware
US20240004647A1 (en) * 2022-07-01 2024-01-04 Andes Technology Corporation Vector processor with vector and element reduction method
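KR20240007495A (ko) * 2022-07-08 2024-01-16 리벨리온 주식회사 Neural core, neural processing device including the same, and data loading method of the neural processing device
CN116029332B (zh) * 2023-02-22 2023-08-22 南京大学 On-chip fine-tuning method and device based on an LSTM network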

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04336378A (ja) * 1991-05-14 1992-11-24 Nec Corp Information processing device
GB2382887B (en) * 2001-10-31 2005-09-28 Alphamosaic Ltd Instruction execution in a processor
GB2390700B (en) 2002-04-15 2006-03-15 Alphamosaic Ltd Narrow/wide cache
US7146486B1 (en) * 2003-01-29 2006-12-05 S3 Graphics Co., Ltd. SIMD processor with scalar arithmetic logic units
US9367462B2 (en) * 2009-12-29 2016-06-14 Empire Technology Development Llc Shared memories for energy efficient multi-core processors
US20150067273A1 (en) 2013-08-30 2015-03-05 Microsoft Corporation Computation hardware with high-bandwidth memory interface
US9367519B2 (en) 2013-08-30 2016-06-14 Microsoft Technology Licensing, Llc Sparse matrix data structure
US20160125263A1 (en) 2014-11-03 2016-05-05 Texas Instruments Incorporated Method to compute sliding window block sum using instruction based selective horizontal addition in vector processor
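CN111651203B (zh) * 2016-04-26 2024-05-07 中科寒武纪科技股份有限公司 Device and method for performing the four arithmetic operations on vectors
CN107315715B (zh) * 2016-04-26 2020-11-03 中科寒武纪科技股份有限公司 Device and method for performing matrix addition/subtraction operations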
US11055063B2 (en) * 2016-05-02 2021-07-06 Marvell Asia Pte, Ltd. Systems and methods for deep learning processor
EP3563307B1 (en) * 2017-02-23 2023-04-12 Cerebras Systems Inc. Accelerated deep learning
US10108581B1 (en) * 2017-04-03 2018-10-23 Google Llc Vector reduction processor
EP4361832A3 (en) * 2017-05-17 2024-08-07 Google LLC Special purpose neural network training chip
US10331445B2 (en) 2017-05-24 2019-06-25 Microsoft Technology Licensing, Llc Multifunction vector processor circuits
US10467324B2 (en) 2017-05-24 2019-11-05 Microsoft Technology Licensing, Llc Data packing techniques for hard-wired multiplier circuits
US10372456B2 (en) 2017-05-24 2019-08-06 Microsoft Technology Licensing, Llc Tensor processor instruction set architecture
US10338925B2 (en) 2017-05-24 2019-07-02 Microsoft Technology Licensing, Llc Tensor register files
JP2019057249A (ja) 2017-09-22 2019-04-11 富士通株式会社 Arithmetic processing device and arithmetic processing method
US11372804B2 (en) * 2018-05-16 2022-06-28 Qualcomm Incorporated System and method of loading and replication of sub-vector values

Also Published As

Publication number Publication date
US20210042260A1 (en) 2021-02-11
US10997116B2 (en) 2021-05-04
WO2021025767A1 (en) 2021-02-11
EP4010794A1 (en) 2022-06-15
CN114207579A (zh) 2022-03-18
CA3146416A1 (en) 2021-02-11

Similar Documents

Publication Publication Date Title
MX2022001501A (es) Tensor-based hardware accelerator including a scalar-processing unit.
EP4357979A3 (en) Superpixel methods for convolutional neural networks
SG10201804284XA (en) Performing Kernel Striding In Hardware
WO2017139683A8 (en) Techniques for control of quantum systems and related systems and methods
WO2019046317A8 (en) Key data processing method and apparatus, and server
AU2017330333B2 (en) Transforming attributes for training automated modeling systems
WO2021067665A3 (en) Enhancing artificial intelligence routines using 3d data
GB2581728A (en) Facilitating neural network efficiency
WO2015060915A3 (en) Quantum processor problem compilation
WO2011097225A3 (en) Generating advertising account entries using variables
WO2015121619A3 (en) Client-server communication system
RU2015123670A (ru) Computing device configurable using a table network
Matsuya et al. Spatial pattern of discrete and ultradiscrete Gray-Scott model
WO2021046356A9 (en) Autonomous operations in oil and gas fields
Ng On arc index and maximal Thurston–Bennequin number
WO2019071041A3 (en) System and method for compact tree representation for machine learning
Johnpillai et al. Exact solutions of the mKdV equation with time-dependent coefficients
ATE377307T1 (de) Countermeasure method in an electronic component for executing a cryptographic algorithm with a secret key
Ma et al. A counterpart of the Wadati–Konno–Ichikawa soliton hierarchy associated with so (3, R)
Sebbar et al. Eisenstein series and modular differential equations
Kleinert et al. Green function of the double-fractional Fokker-Planck equation: Path integral and stochastic differential equations
Magalakwe et al. Generalized double sinh-Gordon equation: Symmetry reductions, exact solutions and conservation laws
WO2017106606A3 (en) Methods for computation of discrete periodic radon transform
WO2021037284A3 (zh) Design method for a propeller airfoil, and terminal device
Eslami et al. Exact solutions for fifth-order KdV-type equations with time-dependent coefficients using the Kudryashov method