KR102118836B1 - Shuffler circuit for lane shuffle in SIMD architecture - Google Patents
Shuffler circuit for lane shuffle in SIMD architecture
- Publication number
- KR102118836B1 (application number KR1020197000601A)
- Authority
- KR
- South Korea
- Prior art keywords
- data
- processing
- processing lanes
- lanes
- lane
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4009—Coupling between buses with data restructuring
- G06F13/4013—Coupling between buses with data restructuring with data re-ordering, e.g. Endian conversion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8053—Vector processors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30032—Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Advance Control (AREA)
- Image Processing (AREA)
- Executing Machine-Instructions (AREA)
- Time-Division Multiplex Systems (AREA)
- Synchronisation In Digital Transmission Systems (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/209,057 | 2016-07-13 | | |
| US15/209,057 US10592468B2 (en) | 2016-07-13 | 2016-07-13 | Shuffler circuit for lane shuffle in SIMD architecture |
| PCT/US2017/033663 WO2018013219A1 (en) | 2016-07-13 | 2017-05-19 | Shuffler circuit for lane shuffle in simd architecture |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR20190028426A (ko) | 2019-03-18 |
| KR102118836B1 (ko) | 2020-06-03 |
Family
ID=58779363
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020197000601A (Active), KR102118836B1 (ko) | 2016-07-13 | 2017-05-19 | Shuffler circuit for lane shuffle in SIMD architecture |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US10592468B2 (en) |
| EP (1) | EP3485385B1 (en) |
| JP (1) | JP2019521445A (en) |
| KR (1) | KR102118836B1 (en) |
| CN (1) | CN109478175B (en) |
| WO (1) | WO2018013219A1 (en) |
Families Citing this family (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10957095B2 (en) * | 2018-08-06 | 2021-03-23 | Intel Corporation | Programmable ray tracing with hardware acceleration on a graphics processor |
| US10963300B2 (en) * | 2018-12-06 | 2021-03-30 | Raytheon Company | Accelerating dataflow signal processing applications across heterogeneous CPU/GPU systems |
| US11397624B2 (en) * | 2019-01-22 | 2022-07-26 | Arm Limited | Execution of cross-lane operations in data processing systems |
| US11294672B2 (en) * | 2019-08-22 | 2022-04-05 | Apple Inc. | Routing circuitry for permutation of single-instruction multiple-data operands |
| US11256518B2 (en) | 2019-10-09 | 2022-02-22 | Apple Inc. | Datapath circuitry for math operations using SIMD pipelines |
| US20210349717A1 (en) * | 2020-05-05 | 2021-11-11 | Intel Corporation | Compaction of diverged lanes for efficient use of alus |
| US20220197649A1 (en) * | 2020-12-22 | 2022-06-23 | Advanced Micro Devices, Inc. | General purpose register hierarchy system and method |
| US11360897B1 (en) * | 2021-04-15 | 2022-06-14 | Qualcomm Incorporated | Adaptive memory access management |
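| CN115793958A (zh) * | 2021-09-10 | 2023-03-14 | Tencent Technology (Shenzhen) Co., Ltd. | Method for processing shuffled data, related apparatus, device, and storage medium |
| CN115061731B (zh) * | 2022-06-23 | 2023-05-23 | Moore Threads Intelligent Technology (Beijing) Co., Ltd. | Shuffle circuit and method, and chip and integrated circuit device |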
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040054877A1 (en) | 2001-10-29 | 2004-03-18 | Macy William W. | Method and apparatus for shuffling data |
| US20130339664A1 (en) | 2011-12-23 | 2013-12-19 | Elmoustapha Ould-Ahmed-Vall | Instruction execution unit that broadcasts data values at different levels of granularity |
| US20140059323A1 (en) | 2012-08-23 | 2014-02-27 | Qualcomm Incorporated | Systems and methods of data extraction in a vector processor |
| US20140208067A1 (en) | 2013-01-23 | 2014-07-24 | International Business Machines Corporation | Vector element rotate and insert under mask instruction |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2078912A1 (en) * | 1992-01-07 | 1993-07-08 | Robert Edward Cypher | Hierarchical interconnection networks for parallel processing |
| US7343389B2 (en) | 2002-05-02 | 2008-03-11 | Intel Corporation | Apparatus and method for SIMD modular multiplication |
| US9557994B2 (en) * | 2004-07-13 | 2017-01-31 | Arm Limited | Data processing apparatus and method for performing N-way interleaving and de-interleaving operations where N is an odd plural number |
| US7761694B2 (en) * | 2006-06-30 | 2010-07-20 | Intel Corporation | Execution unit for performing shuffle and other operations |
| GB2444744B (en) | 2006-12-12 | 2011-05-25 | Advanced Risc Mach Ltd | Apparatus and method for performing re-arrangement operations on data |
| US8078836B2 (en) | 2007-12-30 | 2011-12-13 | Intel Corporation | Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a common set of per-lane control bits |
| US9436469B2 (en) * | 2011-12-15 | 2016-09-06 | Intel Corporation | Methods to optimize a program loop via vector instructions using a shuffle table and a mask store table |
| US9218182B2 (en) | 2012-06-29 | 2015-12-22 | Intel Corporation | Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op) |
| US20140149480A1 (en) | 2012-11-28 | 2014-05-29 | Nvidia Corporation | System, method, and computer program product for transposing a matrix |
| US9405539B2 (en) | 2013-07-31 | 2016-08-02 | Intel Corporation | Providing vector sub-byte decompression functionality |
2016
- 2016-07-13: US, application US15/209,057, patent US10592468B2 (en), Active

2017
- 2017-05-19: EP, application EP17726175.7A, patent EP3485385B1 (en), Active
- 2017-05-19: KR, application KR1020197000601A, patent KR102118836B1 (ko), Active
- 2017-05-19: CN, application CN201780042845.9A, patent CN109478175B (zh), Active
- 2017-05-19: WO, application PCT/US2017/033663, publication WO2018013219A1 (en), Ceased
- 2017-05-19: JP, application JP2019500593A, publication JP2019521445A (ja), Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US10592468B2 (en) | 2020-03-17 |
| KR20190028426A (ko) | 2019-03-18 |
| WO2018013219A1 (en) | 2018-01-18 |
| BR112019000120A8 (pt) | 2023-01-31 |
| BR112019000120A2 (pt) | 2019-04-09 |
| CN109478175B (zh) | 2022-07-12 |
| CN109478175A (zh) | 2019-03-15 |
| US20180018299A1 (en) | 2018-01-18 |
| EP3485385B1 (en) | 2020-04-22 |
| JP2019521445A (ja) | 2019-07-25 |
| EP3485385A1 (en) | 2019-05-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR102118836B1 (ko) | | Shuffler circuit for lane shuffle in SIMD architecture |
| US8984043B2 (en) | | Multiplying and adding matrices |
| CA2693344C (en) | | Scheme for varying packing and linking in graphics systems |
| US9513908B2 (en) | | Streaming memory transpose operations |
| KR20100122493A (ko) | | Processor |
| US20140181466A1 (en) | | Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions |
| CN107533460B (zh) | | Packed finite impulse response (FIR) filter processors, methods, systems, and instructions |
| CN102279818A (zh) | | Vector data memory access control method supporting limited sharing, and vector memory |
| JP7507304B2 (ja) | | Clearing of register data |
| US20080244238A1 (en) | | Stream processing accelerator |
| US9632783B2 (en) | | Operand conflict resolution for reduced port general purpose register |
| US9350584B2 (en) | | Element selection unit and a method therein |
| US9569210B2 (en) | | Apparatus and method of execution unit for calculating multiple rounds of a skein hashing algorithm |
| CN104011617B (zh) | | Reconfigurable device for repositioning data within a data word |
| US12395187B2 (en) | | Computer architecture with data decompression support for neural network computing |
| JP5659772B2 (ja) | | Arithmetic processing device |
| KR101863483B1 (ko) | | Utilization of pipeline registers as intermediate storage |
| BR112019000120B1 (pt) | | Shuffler circuit for lane shuffle in SIMD architecture |
| CN111831338B (zh) | | Per-lane dynamic indexing in temporary registers |
| CN117150192A (zh) | | Sparse matrix-vector multiplication acceleration device with high bandwidth utilization |
| US20160070505A1 (en) | | Efficient loading and storing of data |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | St.27 status event code: A-0-1-A10-A15-nap-PA0105 |
| | PG1501 | Laying open of application | St.27 status event code: A-1-1-Q10-Q12-nap-PG1501 |
| | A201 | Request for examination | |
| | E13-X000 | Pre-grant limitation requested | St.27 status event code: A-2-3-E10-E13-lim-X000 |
| | P11-X000 | Amendment of application requested | St.27 status event code: A-2-2-P10-P11-nap-X000 |
| | P13-X000 | Application amended | St.27 status event code: A-2-2-P10-P13-nap-X000 |
| | PA0201 | Request for examination | St.27 status event code: A-1-2-D10-D11-exm-PA0201 |
| | PA0302 | Request for accelerated examination | St.27 status event code: A-1-2-D10-D17-exm-PA0302; St.27 status event code: A-1-2-D10-D16-exm-PA0302 |
| | E701 | Decision to grant or registration of patent right | |
| | PE0701 | Decision of registration | St.27 status event code: A-1-2-D10-D22-exm-PE0701 |
| | GRNT | Written decision to grant | |
| | PR0701 | Registration of establishment | St.27 status event code: A-2-4-F10-F11-exm-PR0701 |
| | PR1002 | Payment of registration fee | St.27 status event code: A-2-2-U10-U12-oth-PR1002; Fee payment year number: 1 |
| | PG1601 | Publication of registration | St.27 status event code: A-4-4-Q10-Q13-nap-PG1601 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 4 |
| | PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 5 |
| | P22-X000 | Classification modified | St.27 status event code: A-4-4-P10-P22-nap-X000 |