CN107873091B - 用于滑动窗口运算的方法和设备 - Google Patents

用于滑动窗口运算的方法和设备 Download PDF

Info

Publication number
CN107873091B
CN107873091B CN201680039649.1A CN201680039649A CN107873091B CN 107873091 B CN107873091 B CN 107873091B CN 201680039649 A CN201680039649 A CN 201680039649A CN 107873091 B CN107873091 B CN 107873091B
Authority
CN
China
Prior art keywords
register
data element
input data
pass
data elements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201680039649.1A
Other languages
English (en)
Chinese (zh)
Other versions
CN107873091A (zh
Inventor
艾瑞克·马胡林
雅各布·帕维尔·戈拉布
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Publication of CN107873091A publication Critical patent/CN107873091A/zh
Application granted granted Critical
Publication of CN107873091B publication Critical patent/CN107873091B/zh
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/80Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/82Architectures of general purpose stored program computers data or demand driven
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001Arithmetic instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30021Compare instructions, e.g. Greater-Than, Equal-To, MINMAX
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30029Logical and Boolean instructions, e.g. XOR, NOT
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30105Register structure
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30105Register structure
    • G06F9/30112Register structure comprising data of variable length
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/30134Register stacks; shift registers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Executing Machine-Instructions (AREA)
  • Complex Calculations (AREA)
  • Advance Control (AREA)
CN201680039649.1A 2015-07-20 2016-07-11 用于滑动窗口运算的方法和设备 Active CN107873091B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/803,728 2015-07-20
US14/803,728 US10459731B2 (en) 2015-07-20 2015-07-20 Sliding window operation
PCT/US2016/041779 WO2017014979A1 (en) 2015-07-20 2016-07-11 Simd sliding window operation

Publications (2)

Publication Number Publication Date
CN107873091A CN107873091A (zh) 2018-04-03
CN107873091B true CN107873091B (zh) 2021-05-28

Family

ID=56511938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680039649.1A Active CN107873091B (zh) 2015-07-20 2016-07-11 用于滑动窗口运算的方法和设备

Country Status (8)

Country Link
US (1) US10459731B2 (enExample)
EP (1) EP3326061B1 (enExample)
JP (1) JP6737869B2 (enExample)
KR (1) KR102092049B1 (enExample)
CN (1) CN107873091B (enExample)
BR (1) BR112018001183A2 (enExample)
CA (1) CA2990249A1 (enExample)
WO (1) WO2017014979A1 (enExample)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11803377B2 (en) * 2017-09-08 2023-10-31 Oracle International Corporation Efficient direct convolution using SIMD instructions
US11403727B2 (en) 2020-01-28 2022-08-02 Nxp Usa, Inc. System and method for convolving an image
CN113887695A (zh) * 2020-07-03 2022-01-04 北京君正集成电路股份有限公司 一种基于simd的卷积运算的并行优化方法
US11586442B2 (en) 2020-08-06 2023-02-21 Nxp Usa, Inc. System and method for convolving image with sparse kernels

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6009505A (en) * 1996-12-02 1999-12-28 Compaq Computer Corp. System and method for routing one operand to arithmetic logic units from fixed register slots and another operand from any register slot
CN102012893A (zh) * 2010-11-25 2011-04-13 中国人民解放军国防科学技术大学 一种可扩展向量运算簇
CN104699458A (zh) * 2015-03-30 2015-06-10 哈尔滨工业大学 定点向量处理器及其向量数据访存控制方法

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5743239A (en) * 1980-08-27 1982-03-11 Hitachi Ltd Data processor
US5933650A (en) * 1997-10-09 1999-08-03 Mips Technologies, Inc. Alignment and ordering of vector elements for single instruction multiple data processing
US6230253B1 (en) * 1998-03-31 2001-05-08 Intel Corporation Executing partial-width packed data instructions
EP1261912A2 (en) * 2000-03-08 2002-12-04 Sun Microsystems, Inc. Processing architecture having sub-word shuffling and opcode modification
DE10023319B4 (de) * 2000-05-12 2008-07-10 Man Roland Druckmaschinen Ag Verfahren zum Umspulen von Thermotransferband zum Bebildern von Druckformen
US7126991B1 (en) 2003-02-03 2006-10-24 Tibet MIMAR Method for programmable motion estimation in a SIMD processor
US7275147B2 (en) * 2003-03-31 2007-09-25 Hitachi, Ltd. Method and apparatus for data alignment and parsing in SIMD computer architecture
GB2409065B (en) 2003-12-09 2006-10-25 Advanced Risc Mach Ltd Multiplexing operations in SIMD processing
US7353244B2 (en) * 2004-04-16 2008-04-01 Marvell International Ltd. Dual-multiply-accumulator operation optimized for even and odd multisample calculations
US7933405B2 (en) 2005-04-08 2011-04-26 Icera Inc. Data access and permute unit
US7761694B2 (en) * 2006-06-30 2010-07-20 Intel Corporation Execution unit for performing shuffle and other operations
US20080100628A1 (en) 2006-10-31 2008-05-01 International Business Machines Corporation Single Precision Vector Permute Immediate with "Word" Vector Write Mask
GB2444744B (en) * 2006-12-12 2011-05-25 Advanced Risc Mach Ltd Apparatus and method for performing re-arrangement operations on data
WO2008077803A1 (en) 2006-12-22 2008-07-03 Telefonaktiebolaget L M Ericsson (Publ) Simd processor with reduction unit
CN101021832A (zh) * 2007-03-19 2007-08-22 中国人民解放军国防科学技术大学 支持局部寄存和条件执行的64位浮点整数融合运算群
CN100461095C (zh) * 2007-11-20 2009-02-11 浙江大学 一种支持多模式的媒体增强流水线乘法单元设计方法
JP2009282744A (ja) * 2008-05-22 2009-12-03 Toshiba Corp 演算器及び半導体集積回路装置
US9081501B2 (en) * 2010-01-08 2015-07-14 International Business Machines Corporation Multi-petascale highly efficient parallel supercomputer
US9588766B2 (en) * 2012-09-28 2017-03-07 Intel Corporation Accelerated interlane vector reduction instructions
US9424031B2 (en) 2013-03-13 2016-08-23 Intel Corporation Techniques for enabling bit-parallel wide string matching with a SIMD register

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6009505A (en) * 1996-12-02 1999-12-28 Compaq Computer Corp. System and method for routing one operand to arithmetic logic units from fixed register slots and another operand from any register slot
CN102012893A (zh) * 2010-11-25 2011-04-13 中国人民解放军国防科学技术大学 一种可扩展向量运算簇
CN104699458A (zh) * 2015-03-30 2015-06-10 哈尔滨工业大学 定点向量处理器及其向量数据访存控制方法

Also Published As

Publication number Publication date
CA2990249A1 (en) 2017-01-26
US20170024218A1 (en) 2017-01-26
CN107873091A (zh) 2018-04-03
JP6737869B2 (ja) 2020-08-12
KR20180030540A (ko) 2018-03-23
WO2017014979A1 (en) 2017-01-26
JP2018525730A (ja) 2018-09-06
US10459731B2 (en) 2019-10-29
KR102092049B1 (ko) 2020-04-20
EP3326061A1 (en) 2018-05-30
BR112018001183A2 (en) 2018-09-11
EP3326061B1 (en) 2022-05-25

Similar Documents

Publication Publication Date Title
EP3033670B1 (en) Vector accumulation method and apparatus
JP4750157B2 (ja) データを右方向平行シフトマージする方法及び装置
US9411726B2 (en) Low power computation architecture
CN112119375B (zh) 加载和复制子矢量值的系统和方法
CN107873091B (zh) 用于滑动窗口运算的方法和设备
JP7387017B2 (ja) アドレス生成方法及びユニット、深層学習処理器、チップ、電子機器並びにコンピュータプログラム
JP2005025718A (ja) Simd整数乗算上位丸めシフト
KR20140073553A (ko) Fifo 로드 명령
CN114746840B (zh) 用于乘法和累加操作的处理器单元
CN113052304A (zh) 用于具有部分读取/写入的脉动阵列的系统和方法
EP3326060B1 (en) Mixed-width simd operations having even-element and odd-element operations using register pair for wide data elements
US11288768B2 (en) Application processor including reconfigurable scaler and devices including the processor
US12242853B1 (en) Configurable vector compute engine
WO2022191859A1 (en) Vector processing using vector-specific data type
US12271732B1 (en) Configuration of a deep vector engine using an opcode table, control table, and datapath table
US12468507B1 (en) Reducing power consumption in integrated circuits

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant