CN106233248B - Method and apparatus for performing divergent operations on a multi-threaded processor - Google Patents

Method and apparatus for performing divergent operations on a multi-threaded processor

Info

Publication number
CN106233248B
Authority
CN
China
Prior art keywords
instruction
threads
thread
operations
instructions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580021777.9A
Other languages
English (en)
Chinese (zh)
Other versions
CN106233248A (zh)
Inventor
Andrew Evan Gruber
Lin Chen
Yun Du
Alexei Vladimirovich Bourd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN106233248A
Application granted
Publication of CN106233248B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30076 Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/3009 Thread control instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3888 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)
  • Hardware Redundancy (AREA)
  • Devices For Executing Special Programs (AREA)
CN201580021777.9A 2014-05-02 2015-04-10 Method and apparatus for performing divergent operations on a multi-threaded processor Active CN106233248B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/268,215 2014-05-02
US14/268,215 US10133572B2 (en) 2014-05-02 2014-05-02 Techniques for serialized execution in a SIMD processing system
PCT/US2015/025362 WO2015167777A1 (en) 2014-05-02 2015-04-10 Techniques for serialized execution in a simd processing system

Publications (2)

Publication Number Publication Date
CN106233248A (zh) 2016-12-14
CN106233248B (zh) 2018-11-13

Family

ID=53039617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580021777.9A Active CN106233248B (zh) 2014-05-02 2015-04-10 Method and apparatus for performing divergent operations on a multi-threaded processor

Country Status (8)

Country Link
US (1) US10133572B2
EP (1) EP3137988B1
JP (1) JP2017515228A
KR (1) KR20160148673A
CN (1) CN106233248B
BR (1) BR112016025511A2
ES (1) ES2834573T3
WO (1) WO2015167777A1

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9898348B2 (en) 2014-10-22 2018-02-20 International Business Machines Corporation Resource mapping in multi-threaded central processor units
US9921838B2 (en) * 2015-10-02 2018-03-20 Mediatek Inc. System and method for managing static divergence in a SIMD computing architecture
CN107534445B (zh) * 2016-04-19 2020-03-10 Huawei Technologies Co., Ltd. Vector processing for segmented hash value calculation
US10091904B2 (en) 2016-07-22 2018-10-02 Intel Corporation Storage sled for data center
US10565017B2 (en) * 2016-09-23 2020-02-18 Samsung Electronics Co., Ltd. Multi-thread processor and controlling method thereof
US10990409B2 (en) * 2017-04-21 2021-04-27 Intel Corporation Control flow mechanism for execution of graphics processor instructions using active channel packing
CN108549583B (zh) * 2018-04-17 2021-05-07 致云科技有限公司 Big data processing method, apparatus, server, and readable storage medium
US12004257B2 (en) * 2018-10-08 2024-06-04 Interdigital Patent Holdings, Inc. Device discovery and connectivity in a cellular network
US12314760B2 (en) * 2021-09-27 2025-05-27 Advanced Micro Devices, Inc. Garbage collecting wavefront

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6947047B1 (en) * 2001-09-20 2005-09-20 Nvidia Corporation Method and system for programmable pipelined graphics processing with branching instructions
US20050268075A1 (en) * 2004-05-28 2005-12-01 Sun Microsystems, Inc. Multiple branch predictions
CN101124569A (zh) * 2005-02-25 2008-02-13 ClearSpeed Technology Microprocessor architecture
US7634637B1 (en) * 2005-12-16 2009-12-15 Nvidia Corporation Execution of parallel groups of threads with per-instruction serialization
US20110078690A1 (en) * 2009-09-28 2011-03-31 Brian Fahs Opcode-Specified Predicatable Warp Post-Synchronization
US20140047223A1 (en) * 2012-08-08 2014-02-13 Lin Chen Selectively activating a resume check operation in a multi-threaded processing system
US20140075160A1 (en) * 2012-09-10 2014-03-13 Nvidia Corporation System and method for synchronizing threads in a divergent region of code
US20140075165A1 (en) * 2012-09-10 2014-03-13 Qualcomm Incorporated Executing subroutines in a multi-threaded processing system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7895328B2 (en) 2002-12-13 2011-02-22 International Business Machines Corporation System and method for context-based serialization of messages in a parallel execution environment
WO2005072307A2 (en) 2004-01-22 2005-08-11 University Of Washington Wavescalar architecture having a wave order memory
US7761697B1 (en) * 2005-07-13 2010-07-20 Nvidia Corporation Processing an indirect branch instruction in a SIMD architecture
US8176265B2 (en) 2006-10-30 2012-05-08 Nvidia Corporation Shared single-access memory with management of multiple parallel requests
US8312254B2 (en) * 2008-03-24 2012-11-13 Nvidia Corporation Indirect function call instructions in a synchronous parallel thread processor
US8782645B2 (en) * 2011-05-11 2014-07-15 Advanced Micro Devices, Inc. Automatic load balancing for heterogeneous cores
US8683468B2 (en) * 2011-05-16 2014-03-25 Advanced Micro Devices, Inc. Automatic kernel migration for heterogeneous cores
US10152329B2 (en) 2012-02-09 2018-12-11 Nvidia Corporation Pre-scheduled replays of divergent operations
KR101603752B1 (ko) * 2013-01-28 2016-03-28 삼성전자주식회사 멀티 모드 지원 프로세서 및 그 프로세서에서 멀티 모드를 지원하는 방법
KR20150019349A (ko) * 2013-08-13 2015-02-25 삼성전자주식회사 다중 쓰레드 실행 프로세서 및 이의 동작 방법
US9652284B2 (en) * 2013-10-01 2017-05-16 Qualcomm Incorporated GPU divergence barrier

Also Published As

Publication number Publication date
EP3137988B1 (en) 2020-09-02
US20150317157A1 (en) 2015-11-05
JP2017515228A (ja) 2017-06-08
WO2015167777A1 (en) 2015-11-05
US10133572B2 (en) 2018-11-20
ES2834573T3 (es) 2021-06-17
KR20160148673A (ko) 2016-12-26
EP3137988A1 (en) 2017-03-08
CN106233248A (zh) 2016-12-14
BR112016025511A2 (pt) 2017-08-15

Similar Documents

Publication Publication Date Title
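CN106233248B (zh) Method and apparatus for performing divergent operations on a multi-threaded processor
JP7087029B2 (ja) Improved function callback mechanism between a central processing unit (CPU) and an auxiliary processor
JP5701487B2 (ja) Indirect function call instructions in a synchronous parallel thread processor
CN105453045B (zh) Barrier synchronization using dynamic width calculation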
US9430807B2 (en) Execution model for heterogeneous computing
JP6073479B2 (ja) Executing subroutines in a multi-threaded processing system
JP2004516546A (ja) Exception handling in a pipelined processor
US10706494B2 (en) Uniform predicates in shaders for graphics processing units
JP6077117B2 (ja) Selectively activating a resume check operation in a multi-threaded processing system
US8572355B2 (en) Support for non-local returns in parallel thread SIMD engine
US11132196B2 (en) Apparatus and method for managing address collisions when performing vector operations
US9251022B2 (en) System level architecture verification for transaction execution in a multi-processing environment
CN108628639B (zh) Processor and instruction scheduling method
US20210096877A1 (en) Collapsing bubbles in a processing unit pipeline
BR112018016913B1 (pt) Method and apparatus for data processing, and computer-readable memory

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant