JP2017515228A - Techniques for serialized execution in a SIMD processing system - Google Patents
Techniques for serialized execution in a SIMD processing system
- Publication number
- JP2017515228A
- Authority
- JP
- Japan
- Prior art keywords
- instruction
- thread
- threads
- active
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30076—Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
- G06F9/3009—Thread control instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Advance Control (AREA)
- Executing Machine-Instructions (AREA)
- Hardware Redundancy (AREA)
- Devices For Executing Special Programs (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/268,215 | 2014-05-02 | | |
| US14/268,215, US10133572B2 (en) | 2014-05-02 | 2014-05-02 | Techniques for serialized execution in a SIMD processing system |
| PCT/US2015/025362, WO2015167777A1 (en) | 2014-05-02 | 2015-04-10 | Techniques for serialized execution in a SIMD processing system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2017515228A (ja) | 2017-06-08 |
| JP2017515228A5 | 2018-05-10 |
Family
ID=53039617
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2016563817A (pending; published as JP2017515228A (ja)) | 2014-05-02 | 2015-04-10 | Techniques for serialized execution in a SIMD processing system |
Country Status (8)
| Country | Link |
|---|---|
| US (1) | US10133572B2 |
| EP (1) | EP3137988B1 |
| JP (1) | JP2017515228A |
| KR (1) | KR20160148673A |
| CN (1) | CN106233248B |
| BR (1) | BR112016025511A2 |
| ES (1) | ES2834573T3 |
| WO (1) | WO2015167777A1 |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9898348B2 (en) * | 2014-10-22 | 2018-02-20 | International Business Machines Corporation | Resource mapping in multi-threaded central processor units |
| US9921838B2 (en) * | 2015-10-02 | 2018-03-20 | Mediatek Inc. | System and method for managing static divergence in a SIMD computing architecture |
| CN107534445B (zh) * | 2016-04-19 | 2020-03-10 | 华为技术有限公司 | Vector processing for split hash value computation |
| US10091904B2 (en) | 2016-07-22 | 2018-10-02 | Intel Corporation | Storage sled for data center |
| US10565017B2 (en) * | 2016-09-23 | 2020-02-18 | Samsung Electronics Co., Ltd. | Multi-thread processor and controlling method thereof |
| US10990409B2 (en) * | 2017-04-21 | 2021-04-27 | Intel Corporation | Control flow mechanism for execution of graphics processor instructions using active channel packing |
| CN108549583B (zh) * | 2018-04-17 | 2021-05-07 | 致云科技有限公司 | Big data processing method, apparatus, server, and readable storage medium |
| US12004257B2 (en) * | 2018-10-08 | 2024-06-04 | Interdigital Patent Holdings, Inc. | Device discovery and connectivity in a cellular network |
| US12314760B2 (en) * | 2021-09-27 | 2025-05-27 | Advanced Micro Devices, Inc. | Garbage collecting wavefront |
Citations (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2009230756A (ja) * | 2008-03-24 | 2009-10-08 | Nvidia Corp | Indirect function call instructions in a synchronous parallel thread processor |
| US7634637B1 (en) * | 2005-12-16 | 2009-12-15 | Nvidia Corporation | Execution of parallel groups of threads with per-instruction serialization |
| US20110078690A1 (en) * | 2009-09-28 | 2011-03-31 | Brian Fahs | Opcode-Specified Predicatable Warp Post-Synchronization |
| WO2012155010A1 (en) * | 2011-05-11 | 2012-11-15 | Advanced Micro Devices, Inc. | Automatic load balancing for heterogeneous cores |
| WO2012158753A1 (en) * | 2011-05-16 | 2012-11-22 | Advanced Micro Devices, Inc. | Automatic kernel migration for heterogeneous cores |
| WO2014025480A1 (en) * | 2012-08-08 | 2014-02-13 | Qualcomm Incorporated | Selectively activating a resume check operation in a multi-threaded processing system |
| US20140075160A1 (en) * | 2012-09-10 | 2014-03-13 | Nvidia Corporation | System and method for synchronizing threads in a divergent region of code |
| WO2014039206A1 (en) * | 2012-09-10 | 2014-03-13 | Qualcomm Incorporated | Executing subroutines in a multi-threaded processing system |
| JP2014146335A (ja) * | 2013-01-28 | 2014-08-14 | Samsung Electronics Co Ltd | Multi-mode supporting processor and processing method supporting multi-mode |
| JP2015036983A (ja) * | 2013-08-13 | 2015-02-23 | Samsung Electronics Co., Ltd. | Multi-thread execution processor and operating method thereof |
| JP2016532180A (ja) * | 2013-10-01 | 2016-10-13 | Qualcomm Incorporated | GPU divergence barrier |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6947047B1 (en) | 2001-09-20 | 2005-09-20 | Nvidia Corporation | Method and system for programmable pipelined graphics processing with branching instructions |
| US7895328B2 (en) | 2002-12-13 | 2011-02-22 | International Business Machines Corporation | System and method for context-based serialization of messages in a parallel execution environment |
| WO2005072307A2 (en) | 2004-01-22 | 2005-08-11 | University Of Washington | Wavescalar architecture having a wave order memory |
| US7590830B2 (en) * | 2004-05-28 | 2009-09-15 | Sun Microsystems, Inc. | Method and structure for concurrent branch prediction in a processor |
| GB2437837A (en) | 2005-02-25 | 2007-11-07 | Clearspeed Technology Plc | Microprocessor architecture |
| US7761697B1 (en) * | 2005-07-13 | 2010-07-20 | Nvidia Corporation | Processing an indirect branch instruction in a SIMD architecture |
| US8176265B2 (en) | 2006-10-30 | 2012-05-08 | Nvidia Corporation | Shared single-access memory with management of multiple parallel requests |
| US10152329B2 (en) | 2012-02-09 | 2018-12-11 | Nvidia Corporation | Pre-scheduled replays of divergent operations |
2014
- 2014-05-02: US, application US14/268,215, published as US10133572B2 (en), status: Active

2015
- 2015-04-10: BR, application BR112016025511A, published as BR112016025511A2 (pt), status: Not active (IP Right Cessation)
- 2015-04-10: CN, application CN201580021777.9A, published as CN106233248B (zh), status: Active
- 2015-04-10: WO, application PCT/US2015/025362, published as WO2015167777A1 (en), status: Not active (Ceased)
- 2015-04-10: ES, application ES15719929T, published as ES2834573T3 (es), status: Active
- 2015-04-10: KR, application KR1020167033480A, published as KR20160148673A (ko), status: Not active (Withdrawn)
- 2015-04-10: JP, application JP2016563817A, published as JP2017515228A (ja), status: Active (Pending)
- 2015-04-10: EP, application EP15719929.0A, published as EP3137988B1 (en), status: Active
Also Published As
| Publication number | Publication date |
|---|---|
| US20150317157A1 (en) | 2015-11-05 |
| EP3137988A1 (en) | 2017-03-08 |
| KR20160148673A (ko) | 2016-12-26 |
| US10133572B2 (en) | 2018-11-20 |
| CN106233248A (zh) | 2016-12-14 |
| ES2834573T3 (es) | 2021-06-17 |
| CN106233248B (zh) | 2018-11-13 |
| EP3137988B1 (en) | 2020-09-02 |
| WO2015167777A1 (en) | 2015-11-05 |
| BR112016025511A2 (pt) | 2017-08-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN106233248B (zh) | | Method and apparatus for executing divergent operations on a multi-threaded processor |
| CN104603749B (zh) | | Method and apparatus for controlling divergent branch instructions in a single instruction, multiple data (SIMD) processing system |
| JP5701487B2 (ja) | | Indirect function call instructions in a synchronous parallel thread processor |
| US9256429B2 (en) | | Selectively activating a resume check operation in a multi-threaded processing system |
| US10706494B2 (en) | | Uniform predicates in shaders for graphics processing units |
| KR20040016829A (ko) | | Method, apparatus, and system for exception handling in a pipelined processor |
| WO2013036341A1 (en) | | Techniques for handling divergent threads in a multi-threaded processing system |
| US8572355B2 (en) | | Support for non-local returns in parallel thread SIMD engine |
| WO2017204910A1 (en) | | Per-instance preamble for graphics processing |
| KR102495792B1 (ko) | | Variable wavefront size |
| BR112018016913B1 (pt) | | Method and apparatus for data processing, and computer-readable memory |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2016-11-04 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2018-03-20 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2018-03-20 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2018-11-21 | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 |
| 2019-01-11 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2019-08-05 | A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02 |