CN104583941B - Method and apparatus for selectively activating a resume check operation in a multi-threaded processing system - Google Patents

Method and apparatus for selectively activating a resume check operation in a multi-threaded processing system

Info

Publication number
CN104583941B
Authority
CN
China
Prior art keywords
instruction
recovery
counter value
inspection operation
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201380041768.7A
Other languages
English (en)
Chinese (zh)
Other versions
CN104583941A (zh)
Inventor
陈琳
杜云
安德鲁·格鲁贝尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN104583941A
Application granted
Publication of CN104583941B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30072 Arrangements for executing specific machine instructions to perform conditional operations, e.g. using predicates or guards
    • G06F9/3005 Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30076 Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/3009 Thread control instructions
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/323 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for indirect branch instructions
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G06F9/3888 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Executing Machine-Instructions (AREA)
  • Debugging And Monitoring (AREA)
  • Advance Control (AREA)
  • Power Sources (AREA)
CN201380041768.7A 2012-08-08 2013-07-08 Method and apparatus for selectively activating a resume check operation in a multi-threaded processing system Active CN104583941B (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201261680990P 2012-08-08 2012-08-08
US61/680,990 2012-08-08
US13/624,657 US9256429B2 (en) 2012-08-08 2012-09-21 Selectively activating a resume check operation in a multi-threaded processing system
US13/624,657 2012-09-21
PCT/US2013/049599 WO2014025480A1 (en) 2012-08-08 2013-07-08 Selectively activating a resume check operation in a multi-threaded processing system

Publications (2)

Publication Number Publication Date
CN104583941A (zh) 2015-04-29
CN104583941B (zh) 2017-05-10

Family

ID=50067110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380041768.7A Active CN104583941B (zh) 2012-08-08 2013-07-08 Method and apparatus for selectively activating a resume check operation in a multi-threaded processing system

Country Status (4)

Country Link
US (1) US9256429B2 (en)
JP (1) JP6077117B2 (en)
CN (1) CN104583941B (en)
WO (1) WO2014025480A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8832417B2 (en) 2011-09-07 2014-09-09 Qualcomm Incorporated Program flow control for multiple divergent SIMD threads using a minimum resume counter
US9229721B2 (en) * 2012-09-10 2016-01-05 Qualcomm Incorporated Executing subroutines in a multi-threaded processing system
US9582321B2 (en) * 2013-11-08 2017-02-28 Swarm64 As System and method of data processing
US10133572B2 (en) * 2014-05-02 2018-11-20 Qualcomm Incorporated Techniques for serialized execution in a SIMD processing system
US9928076B2 (en) * 2014-09-26 2018-03-27 Intel Corporation Method and apparatus for unstructured control flow for SIMD execution engine
US9983884B2 (en) * 2014-09-26 2018-05-29 Intel Corporation Method and apparatus for SIMD structured branching
GB2563589B (en) * 2017-06-16 2019-06-12 Imagination Tech Ltd Scheduling tasks
CN110231909B (zh) * 2019-05-15 2021-03-05 广州视源电子科技股份有限公司 Method and apparatus for processing a writing operation
CN111930425B (zh) * 2020-06-23 2022-06-10 联宝(合肥)电子科技有限公司 Data control method and apparatus, and computer-readable storage medium

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4435758A (en) 1980-03-10 1984-03-06 International Business Machines Corporation Method for conditional branch execution in SIMD vector processors
GB2273377A (en) 1992-12-11 1994-06-15 Hughes Aircraft Co Multiple masks for array processors
US5689677A (en) 1995-06-05 1997-11-18 Macmillan; David C. Circuit for enhancing performance of a computer for personal use
US6889319B1 (en) 1999-12-09 2005-05-03 Intel Corporation Method and apparatus for entering and exiting multiple threads within a multithreaded processor
US6947047B1 (en) 2001-09-20 2005-09-20 Nvidia Corporation Method and system for programmable pipelined graphics processing with branching instructions
EP1341080A1 (fr) * 2002-02-26 2003-09-03 Koninklijke Philips Electronics N.V. Program instruction processing system
US20050050305A1 (en) 2003-08-28 2005-03-03 Kissell Kevin D. Integrated mechanism for suspension and deallocation of computational threads of execution in a processor
US7477255B1 (en) 2004-04-12 2009-01-13 Nvidia Corporation System and method for synchronizing divergent samples in a programmable graphics processing unit
US7890735B2 (en) 2004-08-30 2011-02-15 Texas Instruments Incorporated Multi-threading processors, integrated circuit devices, systems, and processes of operation and manufacture
US7761697B1 (en) 2005-07-13 2010-07-20 Nvidia Corporation Processing an indirect branch instruction in a SIMD architecture
US7543136B1 (en) 2005-07-13 2009-06-02 Nvidia Corporation System and method for managing divergent threads using synchronization tokens and program instructions that include set-synchronization bits
US7788468B1 (en) 2005-12-15 2010-08-31 Nvidia Corporation Synchronization of threads in a cooperative thread array
JP2007272353A (ja) 2006-03-30 2007-10-18 Nec Electronics Corp Processor device and compound condition processing method
US7617384B1 (en) 2006-11-06 2009-11-10 Nvidia Corporation Structured programming control flow using a disable mask in a SIMD architecture
US8312254B2 (en) * 2008-03-24 2012-11-13 Nvidia Corporation Indirect function call instructions in a synchronous parallel thread processor
US8615646B2 (en) 2009-09-24 2013-12-24 Nvidia Corporation Unanimous branch instructions in a parallel thread processor
US8539204B2 (en) 2009-09-25 2013-09-17 Nvidia Corporation Cooperative thread array reduction and scan operations
US8850436B2 (en) 2009-09-28 2014-09-30 Nvidia Corporation Opcode-specified predicatable warp post-synchronization
US10360039B2 (en) * 2009-09-28 2019-07-23 Nvidia Corporation Predicted instruction execution in parallel processors with reduced per-thread state information including choosing a minimum or maximum of two operands based on a predicate value
US20110219221A1 (en) * 2010-03-03 2011-09-08 Kevin Skadron Dynamic warp subdivision for integrated branch and memory latency divergence tolerance
US8832417B2 (en) * 2011-09-07 2014-09-09 Qualcomm Incorporated Program flow control for multiple divergent SIMD threads using a minimum resume counter
US9229721B2 (en) 2012-09-10 2016-01-05 Qualcomm Incorporated Executing subroutines in a multi-threaded processing system

Also Published As

Publication number Publication date
CN104583941A (zh) 2015-04-29
JP2015531124A (ja) 2015-10-29
US9256429B2 (en) 2016-02-09
JP6077117B2 (ja) 2017-02-08
WO2014025480A1 (en) 2014-02-13
US20140047223A1 (en) 2014-02-13

Similar Documents

Publication Publication Date Title
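CN104583941B (zh) Method and apparatus for selectively activating a resume check operation in a multi-threaded processing system
CN107810477A (zh) Reuse of decoded instructions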
US8082420B2 (en) Method and apparatus for executing instructions
Fang et al. swdnn: A library for accelerating deep learning applications on sunway taihulight
US20220197645A1 (en) Repeat Instruction for Loading and/or Executing Code in a Claimable Repeat Cache a Specified Number of Times
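CN105279016B (zh) Thread pause processors, methods, systems, and instructions
CN105760265B (zh) Instructions and logic to test transactional execution status
CN104603749A (zh) Executing subroutines in a multi-threaded processing system
CN103218209B (zh) Method and apparatus for controlling branch prediction logic
CN106257411B (zh) Single-instruction multiple-thread computing system and method thereof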
US8615646B2 (en) Unanimous branch instructions in a parallel thread processor
CN103810035B (zh) Intelligent context management
KR101594502B1 (ko) Move elimination system and method with a bypass multiple instantiation table
CN105579967A (zh) GPU divergence barrier
WO2017223006A1 (en) Load-store queue for multiple processor cores
CN108027731A (zh) Debug support for block-based processors
CA2986061A1 (en) Block-based architecture with parallel execution of successive blocks
EP3350687B1 (en) Store nullification in the target field
CN118502903A (zh) Warp scheduling method based on a general-purpose graphics processing unit, and storage medium
EP3350688A1 (en) Write nullification
US20110078418A1 (en) Support for Non-Local Returns in Parallel Thread SIMD Engine
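CN109074256A (zh) Apparatus and method for managing address conflicts when performing vector operations
CN104049937A (zh) Chaining between exposed vector pipelines
CN103365628A (zh) Method and system for executing instructions optimized at predecode time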
EP3815002A2 (en) Method and system for opportunistic load balancing in neural networks using metadata

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant