JP2019521445A - Shuffler circuit for lane shuffle in SIMD architecture - Google Patents

Shuffler circuit for lane shuffle in SIMD architecture

Info

Publication number
JP2019521445A
Authority
JP
Japan
Prior art keywords
processing
data
lane
lanes
processing lanes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2019500593A
Other languages
English (en)
Japanese (ja)
Other versions
JP2019521445A5 (en)
Inventor
Liang Han
Xiangdong Jin
Lin Chen
Yun Du
Alexei Vladimirovich Bourd
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Incorporated
Publication of JP2019521445A
Publication of JP2019521445A5
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4009 Coupling between buses with data restructuring
    • G06F13/4013 Coupling between buses with data restructuring with data re-ordering, e.g. Endian conversion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8053 Vector processors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032 Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3888 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Advance Control (AREA)
  • Image Processing (AREA)
  • Executing Machine-Instructions (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)
JP2019500593A 2016-07-13 2017-05-19 Shuffler circuit for lane shuffle in SIMD architecture Pending JP2019521445A (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15/209,057 2016-07-13
US15/209,057 US10592468B2 (en) 2016-07-13 2016-07-13 Shuffler circuit for lane shuffle in SIMD architecture
PCT/US2017/033663 WO2018013219A1 (en) 2016-07-13 2017-05-19 Shuffler circuit for lane shuffle in SIMD architecture

Publications (2)

Publication Number Publication Date
JP2019521445A (ja) 2019-07-25
JP2019521445A5 (en) 2020-06-18

Family

ID=58779363

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2019500593A Pending JP2019521445A (ja) 2016-07-13 2017-05-19 Shuffler circuit for lane shuffle in SIMD architecture

Country Status (6)

Country Link
US (1) US10592468B2 (en)
EP (1) EP3485385B1 (en)
JP (1) JP2019521445A (en)
KR (1) KR102118836B1 (en)
CN (1) CN109478175B (en)
WO (1) WO2018013219A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10957095B2 (en) * 2018-08-06 2021-03-23 Intel Corporation Programmable ray tracing with hardware acceleration on a graphics processor
US10963300B2 (en) * 2018-12-06 2021-03-30 Raytheon Company Accelerating dataflow signal processing applications across heterogeneous CPU/GPU systems
US11397624B2 (en) * 2019-01-22 2022-07-26 Arm Limited Execution of cross-lane operations in data processing systems
US11294672B2 (en) * 2019-08-22 2022-04-05 Apple Inc. Routing circuitry for permutation of single-instruction multiple-data operands
US11256518B2 (en) 2019-10-09 2022-02-22 Apple Inc. Datapath circuitry for math operations using SIMD pipelines
US20210349717A1 (en) * 2020-05-05 2021-11-11 Intel Corporation Compaction of diverged lanes for efficient use of alus
US20220197649A1 (en) * 2020-12-22 2022-06-23 Advanced Micro Devices, Inc. General purpose register hierarchy system and method
US11360897B1 (en) * 2021-04-15 2022-06-14 Qualcomm Incorporated Adaptive memory access management
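CN115793958A (zh) * 2021-09-10 2023-03-14 Tencent Technology (Shenzhen) Co., Ltd. Method for processing shuffled data, related apparatus, device, and storage medium
CN115061731B (zh) * 2022-06-23 2023-05-23 Moore Threads Intelligent Technology (Beijing) Co., Ltd. Shuffle circuit and method, and chip and integrated circuit device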

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007526536A (ja) * 2003-06-30 2007-09-13 Intel Corporation Method and apparatus for shuffling data
US20090172358A1 (en) * 2007-12-30 2009-07-02 Zeev Sperber In-lane vector shuffle instructions
US20130339664A1 (en) * 2011-12-23 2013-12-19 Elmoustapha Ould-Ahmed-Vall Instruction execution unit that broadcasts data values at different levels of granularity
US20140059323A1 (en) * 2012-08-23 2014-02-27 Qualcomm Incorporated Systems and methods of data extraction in a vector processor
JP2016510461A (ja) * 2013-01-23 2016-04-07 International Business Machines Corporation Computer program, computer system, and method for processing a VECTOR ELEMENT ROTATE AND INSERT UNDER MASK instruction

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2078912A1 (en) * 1992-01-07 1993-07-08 Robert Edward Cypher Hierarchical interconnection networks for parallel processing
US7343389B2 (en) 2002-05-02 2008-03-11 Intel Corporation Apparatus and method for SIMD modular multiplication
US9557994B2 (en) * 2004-07-13 2017-01-31 Arm Limited Data processing apparatus and method for performing N-way interleaving and de-interleaving operations where N is an odd plural number
US7761694B2 (en) * 2006-06-30 2010-07-20 Intel Corporation Execution unit for performing shuffle and other operations
GB2444744B (en) 2006-12-12 2011-05-25 Advanced Risc Mach Ltd Apparatus and method for performing re-arrangement operations on data
US9436469B2 (en) * 2011-12-15 2016-09-06 Intel Corporation Methods to optimize a program loop via vector instructions using a shuffle table and a mask store table
US9218182B2 (en) 2012-06-29 2015-12-22 Intel Corporation Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op)
US20140149480A1 (en) 2012-11-28 2014-05-29 Nvidia Corporation System, method, and computer program product for transposing a matrix
US9405539B2 (en) 2013-07-31 2016-08-02 Intel Corporation Providing vector sub-byte decompression functionality

Also Published As

Publication number Publication date
US10592468B2 (en) 2020-03-17
KR20190028426A (ko) 2019-03-18
KR102118836B1 (ko) 2020-06-03
WO2018013219A1 (en) 2018-01-18
BR112019000120A8 (pt) 2023-01-31
BR112019000120A2 (pt) 2019-04-09
CN109478175B (zh) 2022-07-12
CN109478175A (zh) 2019-03-15
US20180018299A1 (en) 2018-01-18
EP3485385B1 (en) 2020-04-22
EP3485385A1 (en) 2019-05-22

Similar Documents

Publication Publication Date Title
KR102118836B1 (ko) Shuffler circuit for lane shuffle in SIMD architecture
CN103221939B (zh) Method and apparatus for moving data
RU2636675C2 (ru) Multiple register memory access instructions, processors, methods, and systems
US10678541B2 (en) Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions
KR20100122493A (ko) Processor
US9513908B2 (en) Streaming memory transpose operations
US8572355B2 (en) Support for non-local returns in parallel thread SIMD engine
CN102279818A (zh) Vector data memory access control method supporting limited sharing, and vector memory
JP7507304B2 (ja) Erasing register data
US9632783B2 (en) Operand conflict resolution for reduced port general purpose register
JP2009516870A (ja) VLIW acceleration system using multi-state logic
US9639362B2 (en) Integrated circuit device and methods of performing bit manipulation therefor
CN117785287A (zh) Private memory mode sequential memory access in multi-threaded computing
JP5659772B2 (ja) Arithmetic processing device
US11775310B2 (en) Data processing system having distributed registers
KR20170007742A (ko) Utilizing pipeline registers as intermediate storage
US20250378617A1 (en) Shuffle accelerator for graphics processing unit
CN111831338B (zh) Per-lane dynamic indexing in temporary registers
BR112019000120B1 (pt) Shuffler circuit for lane shuffle in SIMD architecture
CN105404588B (zh) Processor and method of generating one or more addresses for a data store operation therein
WO2009136402A2 (en) Register file system and method thereof for enabling a substantially direct memory access
WO2009136401A2 (en) Improved processing unit implementing both a local register file system and spread register file system, and a method thereof

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20190116

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20200501

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20200501

A871 Explanation of circumstances concerning accelerated examination

Free format text: JAPANESE INTERMEDIATE CODE: A871

Effective date: 20200501

A975 Report on accelerated examination

Free format text: JAPANESE INTERMEDIATE CODE: A971005

Effective date: 20200729

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20200803

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20210315