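JP2016541057A - Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods - Google Patents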

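Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods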

Info

Publication number
JP2016541057A
Authority
JP
Japan
Prior art keywords
vector data
data sample
vector
input
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2016530912A
Other languages
English (en)
Japanese (ja)
Other versions
JP2016541057A5
Inventor
カーン、ラヘール
ムジャヒド、ファハド・アリ
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of JP2016541057A
Publication of JP2016541057A5

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8053 Vector processors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032 Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G06F9/38873 Iterative single instructions for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3893 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
    • G06F9/3895 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
    • G06F9/3897 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Advance Control (AREA)
  • Complex Calculations (AREA)
  • Executing Machine-Instructions (AREA)
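JP2016530912A 2013-11-15 2014-11-13 Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods Pending JP2016541057A (ja)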

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/082,081 US9977676B2 (en) 2013-11-15 2013-11-15 Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods
US14/082,081 2013-11-15
PCT/US2014/065412 WO2015073646A1 (en) 2013-11-15 2014-11-13 Vector processing engine employing reordering circuitry in data flow paths between vector data memory and execution units, and related method

Publications (2)

Publication Number Publication Date
JP2016541057A (ja) 2016-12-28
JP2016541057A5 2018-08-02

Family

ID=52023626

Family Applications (1)

Application Number Title Priority Date Filing Date
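JP2016530912A Pending JP2016541057A (ja) 2013-11-15 2014-11-13 Vector processing engines (VPEs) employing reordering circuitry in data flow paths between execution units and vector data memory to provide in-flight reordering of output vector data stored to vector data memory, and related vector processor systems and methods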

Country Status (6)

Country Link
US (1) US9977676B2
EP (1) EP3069233A1
JP (1) JP2016541057A
KR (1) KR20160085335A
CN (1) CN105765523B
WO (1) WO2015073646A1

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495154B2 (en) 2013-03-13 2016-11-15 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods
US9880845B2 (en) 2013-11-15 2018-01-30 Qualcomm Incorporated Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations, and related vector processor systems and methods
US9619227B2 (en) 2013-11-15 2017-04-11 Qualcomm Incorporated Vector processing engines (VPEs) employing tapped-delay line(s) for providing precision correlation / covariance vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
US9684509B2 (en) 2013-11-15 2017-06-20 Qualcomm Incorporated Vector processing engines (VPEs) employing merging circuitry in data flow paths between execution units and vector data memory to provide in-flight merging of output vector data stored to vector data memory, and related vector processing instructions, systems, and methods
US9792118B2 (en) 2013-11-15 2017-10-17 Qualcomm Incorporated Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
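KR102240728B1 (ko) * 2015-01-27 2021-04-16 Electronics and Telecommunications Research Institute Bit interleaver for an LDPC codeword having a length of 64800 and a code rate of 4/15 and 64-symbol mapping, and bit interleaving method using the same
KR102287614B1 (ko) * 2015-02-12 2021-08-10 Electronics and Telecommunications Research Institute Bit interleaver for an LDPC codeword having a length of 64800 and a code rate of 2/15 and 16-symbol mapping, and bit interleaving method using the same
KR102287630B1 (ko) * 2015-02-17 2021-08-10 Electronics and Telecommunications Research Institute Bit interleaver for an LDPC codeword having a length of 16200 and a code rate of 3/15 and 16-symbol mapping, and bit interleaving method using the same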
EP3699826A1 (en) * 2017-04-20 2020-08-26 Shanghai Cambricon Information Technology Co., Ltd Operation device and related products
KR102343652B1 (ko) * 2017-05-25 2021-12-24 Samsung Electronics Co., Ltd. Sequence alignment method of a vector processor
GB2569844B (en) 2017-10-20 2021-01-06 Graphcore Ltd Sending data off-chip
GB2569271B (en) 2017-10-20 2020-05-13 Graphcore Ltd Synchronization with a host processor
GB2569775B (en) 2017-10-20 2020-02-26 Graphcore Ltd Synchronization in a multi-tile, multi-chip processing arrangement
US11277455B2 (en) 2018-06-07 2022-03-15 Mellanox Technologies, Ltd. Streaming system
GB2575294B8 (en) 2018-07-04 2022-07-20 Graphcore Ltd Host Proxy On Gateway
US20200106828A1 (en) * 2018-10-02 2020-04-02 Mellanox Technologies, Ltd. Parallel Computation Network Device
GB2579412B (en) 2018-11-30 2020-12-23 Graphcore Ltd Gateway pull model
US11625393B2 (en) 2019-02-19 2023-04-11 Mellanox Technologies, Ltd. High performance computing system
EP3699770B1 (en) 2019-02-25 2025-05-21 Mellanox Technologies, Ltd. Collective communication system and methods
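CN110795687A (zh) * 2019-10-29 2020-02-14 Nanjing Ningqi Intelligent Computing Chip Research Institute Co., Ltd. Hierarchical partitioning system and method for an autocorrelation algorithm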
US11750699B2 (en) 2020-01-15 2023-09-05 Mellanox Technologies, Ltd. Small message aggregation
US11252027B2 (en) 2020-01-23 2022-02-15 Mellanox Technologies, Ltd. Network element supporting flexible data reduction operations
US11876885B2 (en) 2020-07-02 2024-01-16 Mellanox Technologies, Ltd. Clock queue with arming and/or self-arming features
US11556378B2 (en) 2020-12-14 2023-01-17 Mellanox Technologies, Ltd. Offloading execution of a multi-task parameter-dependent operation to a network device
US12309070B2 (en) 2022-04-07 2025-05-20 Nvidia Corporation In-network message aggregation for efficient small message transport
US11922237B1 (en) 2022-09-12 2024-03-05 Mellanox Technologies, Ltd. Single-step collective operations
US12489657B2 (en) 2023-08-17 2025-12-02 Mellanox Technologies, Ltd. In-network compute operation spreading

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008108220A (ja) * 2006-10-27 2008-05-08 Toshiba Corp Arithmetic device
US20080140750A1 (en) * 2006-12-12 2008-06-12 Arm Limited Apparatus and method for performing rearrangement and arithmetic operations on data
JP2012505455A (ja) * 2008-10-08 2012-03-01 ARM Limited Apparatus and method for performing SIMD multiply-accumulate operations
WO2013057856A1 (ja) * 2011-10-17 2013-04-25 Panasonic Corporation Adaptive equalizer

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5524256A (en) * 1993-05-07 1996-06-04 Apple Computer, Inc. Method and system for reordering bytes in a data stream
GB9509989D0 (en) 1995-05-17 1995-07-12 Sgs Thomson Microelectronics Manipulation of data
GB9801713D0 (en) * 1998-01-27 1998-03-25 Sgs Thomson Microelectronics Executing permutations
US7395538B1 (en) 2003-03-07 2008-07-01 Juniper Networks, Inc. Scalable packet processing systems and methods
GB2399900B (en) 2003-03-27 2005-10-05 Micron Technology Inc Data reording processor and method for use in an active memory device
US20050050303A1 (en) 2003-06-30 2005-03-03 Roni Rosner Hierarchical reorder buffers for controlling speculative execution in a multi-cluster system
GB2409063B (en) 2003-12-09 2006-07-12 Advanced Risc Mach Ltd Vector by scalar operations
US7933405B2 (en) 2005-04-08 2011-04-26 Icera Inc. Data access and permute unit
JP2007034731A (ja) 2005-07-27 2007-02-08 Toshiba Corp パイプラインプロセッサ
US8140932B2 (en) 2007-11-26 2012-03-20 Motorola Mobility, Inc. Data interleaving circuit and method for vectorized turbo decoder
US8078834B2 (en) 2008-01-09 2011-12-13 Analog Devices, Inc. Processor architectures for enhanced computational capability
US8868885B2 (en) 2010-11-18 2014-10-21 Ceva D.S.P. Ltd. On-the-fly permutation of vector elements for executing successive elemental instructions
US9275014B2 (en) 2013-03-13 2016-03-01 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode radix-2x butterfly vector processing circuits, and related vector processors, systems, and methods
US9495154B2 (en) 2013-03-13 2016-11-15 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode vector processing, and related vector processors, systems, and methods
US20140280407A1 (en) 2013-03-13 2014-09-18 Qualcomm Incorporated Vector processing carry-save accumulators employing redundant carry-save format to reduce carry propagation, and related vector processors, systems, and methods
US9619227B2 (en) 2013-11-15 2017-04-11 Qualcomm Incorporated Vector processing engines (VPEs) employing tapped-delay line(s) for providing precision correlation / covariance vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
US9880845B2 (en) 2013-11-15 2018-01-30 Qualcomm Incorporated Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format-converting of input vector data to execution units for vector processing operations, and related vector processor systems and methods
US20150143076A1 (en) 2013-11-15 2015-05-21 Qualcomm Incorporated VECTOR PROCESSING ENGINES (VPEs) EMPLOYING DESPREADING CIRCUITRY IN DATA FLOW PATHS BETWEEN EXECUTION UNITS AND VECTOR DATA MEMORY TO PROVIDE IN-FLIGHT DESPREADING OF SPREAD-SPECTRUM SEQUENCES, AND RELATED VECTOR PROCESSING INSTRUCTIONS, SYSTEMS, AND METHODS
US9684509B2 (en) 2013-11-15 2017-06-20 Qualcomm Incorporated Vector processing engines (VPEs) employing merging circuitry in data flow paths between execution units and vector data memory to provide in-flight merging of output vector data stored to vector data memory, and related vector processing instructions, systems, and methods
US9792118B2 (en) 2013-11-15 2017-10-17 Qualcomm Incorporated Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods

Also Published As

Publication number Publication date
US20150143085A1 (en) 2015-05-21
WO2015073646A1 (en) 2015-05-21
EP3069233A1 (en) 2016-09-21
US9977676B2 (en) 2018-05-22
CN105765523A (zh) 2016-07-13
KR20160085335A (ko) 2016-07-15
CN105765523B (zh) 2018-07-17

Similar Documents

Publication Publication Date Title
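JP6373991B2 (ja) Vector processing engines employing tapped-delay lines for filter vector processing operations, and related vector processing systems and methods
JP6339197B2 (ja) Vector processing engine with merging circuitry between execution units and vector data memory, and related method
CN105765523B (zh) Vector processing engine employing reordering circuitry in data flow paths between vector data memory and execution units, and related method
JP2016537724A (ja) Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format conversion of input vector data to execution units for vector processing operations, and related vector processing systems and methods
JP2016537725A (ja) Vector processing engine employing despreading circuitry in data flow paths between execution units and vector data memory, and related method
JP2016537723A (ja) Vector processing engines employing tapped-delay lines for filter vector processing operations, and related vector processing systems and methods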
JP6243000B2 (ja) マルチモードベクトル処理を提供するためのプログラム可能データ経路構成を有するベクトル処理エンジン、ならびに関連ベクトルプロセッサ、システム、および方法
EP2972988A2 (en) Vector processing engines having programmable data path configurations for providing multi-mode radix-2x butterfly vector processing circuits, and related vector processors, systems, and methods

Legal Events

Date Code Title Description
A521 Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20171017
A621 Written request for application examination; Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20171017
A521 Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20180625
A871 Explanation of circumstances concerning accelerated examination; Free format text: JAPANESE INTERMEDIATE CODE: A871; Effective date: 20180625
A975 Report on accelerated examination; Free format text: JAPANESE INTERMEDIATE CODE: A971005; Effective date: 20180719
A977 Report on retrieval; Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20180720
A131 Notification of reasons for refusal; Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20180731
A02 Decision of refusal; Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20190312