CN106537330B - Method and processor for parallelizing scalar operations by using data-indexed accumulators - Google Patents

Method and processor for parallelizing scalar operations by using data-indexed accumulators

Info

Publication number
CN106537330B
CN106537330B CN201580039295.6A CN201580039295A CN106537330B CN 106537330 B CN106537330 B CN 106537330B CN 201580039295 A CN201580039295 A CN 201580039295A CN 106537330 B CN106537330 B CN 106537330B
Authority
CN
China
Prior art keywords
vector
write
input data
index
accumulator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201580039295.6A
Other languages
English (en)
Chinese (zh)
Other versions
CN106537330A (zh)
Inventor
Lucian Codrescu (卢西恩·科德雷斯库)
Eric Wayne Mahurin (艾瑞克·韦恩·马胡林)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-07-25
Filing date
2015-06-26
Publication date
2019-02-05
Application filed by Qualcomm Inc
Publication of CN106537330A
Application granted
Publication of CN106537330B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/82 Architectures of general purpose stored program computers data or demand driven
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30105 Register structure
    • G06F9/30109 Register structure having multiple operands in a single register
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Complex Calculations (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)
CN201580039295.6A 2014-07-25 2015-06-26 Method and processor for parallelizing scalar operations by using data-indexed accumulators Expired - Fee Related CN106537330B (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201462029039P 2014-07-25 2014-07-25
US62/029,039 2014-07-25
US14/486,326 US20160026607A1 (en) 2014-07-25 2014-09-15 Parallelization of scalar operations by vector processors using data-indexed accumulators in vector register files, and related circuits, methods, and computer-readable media
US14/486,326 2014-09-15
PCT/US2015/038013 WO2016014213A1 (en) 2014-07-25 2015-06-26 Parallelization of scalar operations by vector processors using data-indexed accumulators in vector register files, and related circuits, methods, and computer-readable media

Publications (2)

Publication Number Publication Date
CN106537330A (zh) 2017-03-22
CN106537330B (zh) 2019-02-05

Family

ID=53674312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580039295.6A Expired - Fee Related CN106537330B (zh) 2014-07-25 2015-06-26 Method and processor for parallelizing scalar operations by using data-indexed accumulators

Country Status (5)

Country Link
US (1) US20160026607A1 (en)
EP (1) EP3172659B1 (en)
JP (1) JP6571752B2 (en)
CN (1) CN106537330B (en)
WO (1) WO2016014213A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11544214B2 (en) * 2015-02-02 2023-01-03 Optimum Semiconductor Technologies, Inc. Monolithic vector processor configured to operate on variable length vectors using a vector length register
US9875213B2 (en) * 2015-06-26 2018-01-23 Intel Corporation Methods, apparatus, instructions and logic to provide vector packed histogram functionality
US10417730B2 (en) * 2016-12-21 2019-09-17 Intel Corporation Single input multiple data processing mechanism
US10877925B2 (en) * 2019-03-18 2020-12-29 Micron Technology, Inc. Vector processor with vector first and multiple lane configuration
US11327862B2 (en) 2019-05-20 2022-05-10 Micron Technology, Inc. Multi-lane solutions for addressing vector elements using vector index registers
US11507374B2 (en) 2019-05-20 2022-11-22 Micron Technology, Inc. True/false vector index registers and methods of populating thereof
US11340904B2 (en) 2019-05-20 2022-05-24 Micron Technology, Inc. Vector index registers
US11403256B2 (en) 2019-05-20 2022-08-02 Micron Technology, Inc. Conditional operations in a vector processor having true and false vector index registers
US11157278B2 (en) * 2019-05-27 2021-10-26 Texas Instruments Incorporated Histogram operation
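CN110245756B (zh) 2019-06-14 2021-10-26 Fourth Paradigm (Beijing) Technology Co., Ltd. Programmable device for processing data sets and method for processing data sets
CN112463215B (zh) * 2019-09-09 2024-06-07 Tencent Technology (Shenzhen) Co., Ltd. Data processing method and apparatus, computer-readable storage medium, and computer device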
US11119772B2 (en) 2019-12-06 2021-09-14 International Business Machines Corporation Check pointing of accumulator register results in a microprocessor
US11900116B1 (en) 2021-09-29 2024-02-13 International Business Machines Corporation Loosely-coupled slice target file data
US12438556B2 (en) * 2022-09-30 2025-10-07 Qualcomm Incorporated Single instruction multiple data (SIMD) sparse decompression with variable density

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2216307A (en) * 1988-03-01 1989-10-04 Ardent Computer Corp Vector register file
US6931511B1 (en) * 2001-12-31 2005-08-16 Apple Computer, Inc. Parallel vector table look-up with replicated index element vector
CN1820246A (zh) * 2003-05-09 2006-08-16 Sandbridge Technologies Inc. Processor reduction unit for accumulation of multiple operands with or without saturation
US7506135B1 (en) * 2002-06-03 2009-03-17 Mimar Tibet Histogram generation with vector operations in SIMD and VLIW processor by consolidating LUTs storing parallel update incremented count values for vector data elements
US20090249026A1 (en) * 2008-03-28 2009-10-01 Mikhail Smelyanskiy Vector instructions to enable efficient synchronization and parallel reduction operations
CN102197369A (zh) * 2008-10-08 2011-09-21 ARM Ltd. Apparatus and method for performing SIMD multiply-accumulate operations
US20120159130A1 (en) * 2010-12-21 2012-06-21 Mikhail Smelyanskiy Mechanism for conflict detection using simd
US20120327260A1 (en) * 2011-06-27 2012-12-27 Renesas Electronics Corporation Parallel operation histogramming device and microcomputer
US20130185538A1 (en) * 2011-07-14 2013-07-18 Texas Instruments Incorporated Processor with inter-processing path communication
US20130212353A1 (en) * 2002-02-04 2013-08-15 Tibet MIMAR System for implementing vector look-up table operations in a SIMD processor
US20140108480A1 (en) * 2011-12-22 2014-04-17 Elmoustapha Ould-Ahmed-Vall Apparatus and method for vector compute and accumulate
US20140189320A1 (en) * 2012-12-28 2014-07-03 Shih Shigjong KUO Instruction for Determining Histograms
US20140189309A1 (en) * 2012-12-29 2014-07-03 Christopher J. Hughes Methods, apparatus, instructions, and logic to provide permute controls with leading zero count functionality

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06105459B2 (ja) * 1988-08-11 1994-12-21 NEC Corporation Vector processing device
WO2001009717A1 (en) * 1999-08-02 2001-02-08 Morton Steven G Video digital signal processor chip
US6625685B1 (en) * 2000-09-20 2003-09-23 Broadcom Corporation Memory controller with programmable configuration
US8959292B1 (en) * 2005-12-22 2015-02-17 The Board Of Trustees Of The Leland Stanford Junior University Atomic memory access hardware implementations

Also Published As

Publication number Publication date
WO2016014213A1 (en) 2016-01-28
JP6571752B2 (ja) 2019-09-04
JP2017527886A (ja) 2017-09-21
CN106537330A (zh) 2017-03-22
EP3172659B1 (en) 2020-02-26
EP3172659A1 (en) 2017-05-31
US20160026607A1 (en) 2016-01-28

Similar Documents

Publication Publication Date Title
CN106537330B (zh) Method and processor for parallelizing scalar operations by using data-indexed accumulators
US10140123B2 (en) SIMD processing lanes storing input pixel operand data in local register file for thread execution of image processing operations
US9886459B2 (en) Methods and systems for fast set-membership tests using one or more processors that support single instruction multiple data instructions
CN111580864B (zh) Vector operation device and operation method
US11048509B2 (en) Providing multi-element multi-vector (MEMV) register file access in vector-processor-based devices
Rocki et al. Accelerating 2-opt and 3-opt local search using GPU in the travelling salesman problem
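CN108369511A (zh) Instructions and logic for lane-based strided store operations
CN106462394B (zh) Dynamic load balancing of hardware threads in clustered processor cores using shared hardware resources, and related circuits, methods, and computer-readable media
CN108369509A (zh) Instructions and logic for lane-based strided scatter operations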
US11030714B2 (en) Wide key hash table for a graphics processing unit
US10691456B2 (en) Vector store instruction having instruction-specified byte count to be stored supporting big and little endian processing
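TW201723811A (zh) Techniques for sorting data and merging sorted data in an instruction set architecture
CN108369573A (zh) Instructions and logic for set-multiple-vector-elements operations
CN109478175A (zh) Shuffler circuit for lane shuffle in SIMD architecture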
US20240403621A1 (en) Processing sequential inputs using neural network accelerators
US10691453B2 (en) Vector load with instruction-specified byte count less than a vector size for big and little endian processing
US9824012B2 (en) Providing coherent merging of committed store queue entries in unordered store queues of block-based computer processors
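CN107209664A (zh) Reservation station having instructions that selectively use a special register as a source operand according to instruction bits
CN108369571A (zh) Instructions and logic for even and odd vector GET operations
CN104335167B (zh) Method and processor for processing computer instructions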
US9841979B2 (en) Method and apparatus for shuffling data using hierarchical shuffle units
CN112348182A (zh) Neural network maxout layer computing device
US12399849B2 (en) Data processing methods, apparatuses, electronic devices and computer-readable storage media
CN109375952A (zh) Method and apparatus for storing data
CN120723486B (zh) Method for obtaining a reduction result and computing device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2019-02-05

Termination date: 2021-06-26