TW200516492A - Dynamically shared group completion table between multiple threads

Info

Publication number
TW200516492A
Authority
TW
Taiwan
Prior art keywords
thread
gct
completion
dynamically shared
simultaneous
Application number
TW093110731A
Other languages
Chinese (zh)
Other versions
TWI299465B (en)
Inventor
William E Burky
Peter J Klim
Hung Q Le
Original Assignee
IBM
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-04-21
Filing date
2004-04-16
Publication date
2005-05-16
Application filed by IBM
Publication of TW200516492A
Application granted
Publication of TWI299465B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
        • G06F9/30181 Instruction operation extension or modification
        • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
            • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
                • G06F9/3838 Dependency mechanisms, e.g. register scoreboarding
                    • G06F9/384 Register renaming
                • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
            • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
                • G06F9/3858 Result writeback, i.e. updating the architectural state or memory

Abstract

A simultaneous multithreading (SMT) system has a dynamically shared group completion table (GCT). SMT performance is improved by configuring the GCT so that an instruction group from each thread can complete simultaneously. The GCT has a read port for each thread into the completion table instruction/address array, allowing simultaneous updating on completion. The forward link array also has a read port for each thread, used to find that thread's next instruction group upon completion. The backward link array has a write port for each thread, so that the backward links of all threads can be updated simultaneously. The GCT has independent pointer management for each thread. The threads commit their renamed result registers simultaneously and update their outstanding load and store tag usage simultaneously.
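
The structure described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class name SharedGCT, the methods dispatch and complete_cycle, the free list, and the table size are all hypothetical, and the model assumes that the oldest group of each thread is already finished and ready to complete. It shows entries from one shared pool being linked into per-thread lists through forward and backward link arrays, with independent head and tail pointers per thread, so that one group per thread can complete in the same modeled cycle.

```python
class SharedGCT:
    """Toy model of a group completion table shared by several threads.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, size, num_threads=2):
        self.groups = [None] * size        # instruction/address array (group payload)
        self.fwd = [None] * size           # forward link: next younger group, same thread
        self.bwd = [None] * size           # backward link: next older group, same thread
        self.free = list(range(size))      # unallocated entries, shared by all threads
        self.head = [None] * num_threads   # per-thread pointer to the oldest group
        self.tail = [None] * num_threads   # per-thread pointer to the youngest group

    def dispatch(self, thread, group):
        """Allocate a shared entry for `group`; return its index, or None if full."""
        if not self.free:
            return None
        idx = self.free.pop(0)
        self.groups[idx] = group
        self.fwd[idx] = None
        self.bwd[idx] = self.tail[thread]
        if self.tail[thread] is None:
            self.head[thread] = idx            # first group of this thread
        else:
            self.fwd[self.tail[thread]] = idx  # link the previous youngest to the new entry
        self.tail[thread] = idx
        return idx

    def complete_cycle(self):
        """Complete the oldest group of every thread in one modeled cycle."""
        # Read phase: one read per thread of its head entry and forward link,
        # standing in for the per-thread read ports.
        reads = [(t, idx, self.groups[idx], self.fwd[idx])
                 for t, idx in enumerate(self.head) if idx is not None]
        # Update phase: each thread advances its own pointers independently.
        completed = {}
        for t, idx, group, nxt in reads:
            completed[t] = group
            self.head[t] = nxt
            if nxt is None:
                self.tail[t] = None    # the thread has no groups left
            else:
                self.bwd[nxt] = None   # per-thread backward link write
            self.groups[idx] = self.fwd[idx] = self.bwd[idx] = None
            self.free.append(idx)      # the entry returns to the shared pool
        return completed


if __name__ == "__main__":
    gct = SharedGCT(size=4)
    for thread, group in [(0, "A0"), (1, "B0"), (0, "A1"), (0, "A2")]:
        gct.dispatch(thread, group)
    print(gct.complete_cycle())  # {0: 'A0', 1: 'B0'}
```

In this model thread 0 can hold three of the four entries while thread 1 holds one, which is the sense in which the table is dynamically shared rather than statically split between threads.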

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/422,654 US7472258B2 (en) 2003-04-21 2003-04-21 Dynamically shared group completion table between multiple threads

Publications (2)

Publication Number Publication Date
TW200516492A (en) 2005-05-16
TWI299465B (en) 2008-08-01

Family

ID=33159423

Family Applications (1)

Application Number Title Priority Date Filing Date
TW093110731A TWI299465B (en) 2003-04-21 2004-04-16 Simultaneous multithread processor and method therefor

Country Status (4)

Country Link
US (1) US7472258B2 (en)
JP (1) JP3927546B2 (en)
CN (1) CN1304943C (en)
TW (1) TWI299465B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI416408B (en) * 2009-07-15 2013-11-21 Via Tech Inc A microprocessor and information storage method thereof
US9921873B2 (en) 2012-01-31 2018-03-20 Nvidia Corporation Controlling work distribution for processing tasks

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4276201B2 (en) * 2005-03-31 2009-06-10 富士通株式会社 Billing processing apparatus for SMT processor, billing processing method, and billing processing program
US20070204139A1 (en) * 2006-02-28 2007-08-30 Mips Technologies, Inc. Compact linked-list-based multi-threaded instruction graduation buffer
US9069547B2 (en) 2006-09-22 2015-06-30 Intel Corporation Instruction and logic for processing text strings
WO2008077283A1 (en) * 2006-12-27 2008-07-03 Intel Corporation Pointer renaming in workqueuing execution model
JP2008191856A (en) * 2007-02-02 2008-08-21 Nec Computertechno Ltd Information processing system
JP4706030B2 (en) * 2007-06-19 2011-06-22 富士通株式会社 Cache control apparatus and control method
WO2008155804A1 (en) 2007-06-20 2008-12-24 Fujitsu Limited Simultaneous multithreaded instruction completion controller
WO2008155845A1 (en) * 2007-06-20 2008-12-24 Fujitsu Limited Processor
GB201001621D0 (en) * 2010-02-01 2010-03-17 Univ Catholique Louvain A tile-based processor architecture model for high efficiency embedded homogenous multicore platforms
TWI486966B (en) * 2010-02-04 2015-06-01 Phison Electronics Corp Flash memory storage device, controller thereof, and programming management method thereof
US8521998B2 (en) * 2010-06-04 2013-08-27 International Business Machines Corporation Instruction tracking system for processors
US9207995B2 (en) 2010-11-03 2015-12-08 International Business Machines Corporation Mechanism to speed-up multithreaded execution by register file write port reallocation
US8825915B2 (en) * 2012-03-12 2014-09-02 International Business Machines Corporation Input/output port rotation in a storage area network device
US10095526B2 (en) * 2012-10-12 2018-10-09 Nvidia Corporation Technique for improving performance in multi-threaded processing units
KR102177871B1 (en) * 2013-12-20 2020-11-12 삼성전자주식회사 Function unit for supporting multithreading, processor comprising the same, and operating method thereof
US9218185B2 (en) * 2014-03-27 2015-12-22 International Business Machines Corporation Multithreading capability information retrieval
CN105224416B (en) * 2014-05-28 2018-08-21 联发科技(新加坡)私人有限公司 Restorative procedure and related electronic device
US9672045B2 (en) * 2014-09-30 2017-06-06 International Business Machines Corporation Checkpoints for a simultaneous multithreading processor
US20180239532A1 (en) * 2017-02-23 2018-08-23 Western Digital Technologies, Inc. Techniques for performing a non-blocking control sync operation
US10599431B2 (en) 2017-07-17 2020-03-24 International Business Machines Corporation Managing backend resources via frontend steering or stalls
US10802829B2 (en) 2017-11-30 2020-10-13 International Business Machines Corporation Scalable dependency matrix with wake-up columns for long latency instructions in an out-of-order processor
US10572264B2 (en) * 2017-11-30 2020-02-25 International Business Machines Corporation Completing coalesced global completion table entries in an out-of-order processor
US10901744B2 (en) 2017-11-30 2021-01-26 International Business Machines Corporation Buffered instruction dispatching to an issue queue
US10942747B2 (en) 2017-11-30 2021-03-09 International Business Machines Corporation Head and tail pointer manipulation in a first-in-first-out issue queue
US10884753B2 (en) 2017-11-30 2021-01-05 International Business Machines Corporation Issue queue with dynamic shifting between ports
US10929140B2 (en) 2017-11-30 2021-02-23 International Business Machines Corporation Scalable dependency matrix with a single summary bit in an out-of-order processor
US10922087B2 (en) 2017-11-30 2021-02-16 International Business Machines Corporation Block based allocation and deallocation of issue queue entries
US10564976B2 (en) 2017-11-30 2020-02-18 International Business Machines Corporation Scalable dependency matrix with multiple summary bits in an out-of-order processor
US10564979B2 (en) * 2017-11-30 2020-02-18 International Business Machines Corporation Coalescing global completion table entries in an out-of-order processor
US10725786B2 (en) 2018-08-23 2020-07-28 International Business Machines Corporation Completion mechanism for a microprocessor instruction completion table
CN109831434B (en) * 2019-01-31 2021-03-02 西安微电子技术研究所 Multi-protocol communication exchange controller based on user-defined exchange strategy
US10977041B2 (en) 2019-02-27 2021-04-13 International Business Machines Corporation Offset-based mechanism for storage in global completion tables
US11163581B2 (en) * 2019-10-21 2021-11-02 Arm Limited Online instruction tagging
CN111708622B (en) * 2020-05-28 2022-06-10 山东云海国创云计算装备产业创新中心有限公司 Instruction group scheduling method, architecture, equipment and storage medium
CN112214243B (en) * 2020-10-21 2022-05-27 上海壁仞智能科技有限公司 Apparatus and method for configuring cooperative thread bundle in vector operation system
CN112230901B (en) * 2020-10-29 2023-06-20 厦门市易联众易惠科技有限公司 Network programming framework system and method based on asynchronous IO model
US11816349B2 (en) 2021-11-03 2023-11-14 Western Digital Technologies, Inc. Reduce command latency using block pre-erase
CN115269008B (en) * 2022-09-29 2023-02-28 苏州浪潮智能科技有限公司 Data processing method, device, medium and electronic equipment

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0644089A (en) 1992-05-18 1994-02-18 Matsushita Electric Ind Co Ltd Information processor
JPH06332700A (en) 1993-05-25 1994-12-02 Matsushita Electric Ind Co Ltd Information processor
US5574935A (en) * 1993-12-29 1996-11-12 Intel Corporation Superscalar processor with a multi-port reorder buffer
US5724565A (en) 1995-02-03 1998-03-03 International Business Machines Corporation Method and system for processing first and second sets of instructions by first and second types of processing systems
US5841999A (en) * 1996-04-17 1998-11-24 International Business Machines Corporation Information handling system having a register remap structure using a content addressable table
US6219728B1 (en) * 1996-04-22 2001-04-17 Nortel Networks Limited Method and apparatus for allocating shared memory resources among a plurality of queues each having a threshold value therefor
JP2882475B2 (en) * 1996-07-12 1999-04-12 日本電気株式会社 Thread execution method
US5922057A (en) * 1997-01-10 1999-07-13 Lsi Logic Corporation Method for multiprocessor system of controlling a dynamically expandable shared queue in which ownership of a queue entry by a processor is indicated by a semaphore
US6772324B2 (en) * 1997-12-17 2004-08-03 Intel Corporation Processor having multiple program counters and trace buffers outside an execution pipeline
US6134645A (en) * 1998-06-01 2000-10-17 International Business Machines Corporation Instruction completion logic distributed among execution units for improving completion efficiency
JP2000047887A (en) 1998-07-30 2000-02-18 Toshiba Corp Speculative multi-thread processing method and its device
JP3604029B2 (en) 1999-01-12 2004-12-22 日本電気株式会社 Multi-thread processor
US6594755B1 (en) * 2000-01-04 2003-07-15 National Semiconductor Corporation System and method for interleaved execution of multiple independent threads
US6854075B2 (en) * 2000-04-19 2005-02-08 Hewlett-Packard Development Company, L.P. Simultaneous and redundantly threaded processor store instruction comparator
US6681345B1 (en) 2000-08-15 2004-01-20 International Business Machines Corporation Field protection against thread loss in a multithreaded computer processor
US6931639B1 (en) * 2000-08-24 2005-08-16 International Business Machines Corporation Method for implementing a variable-partitioned queue for simultaneous multithreaded processors

Also Published As

Publication number Publication date
US7472258B2 (en) 2008-12-30
US20040210743A1 (en) 2004-10-21
CN1304943C (en) 2007-03-14
JP3927546B2 (en) 2007-06-13
JP2004326738A (en) 2004-11-18
TWI299465B (en) 2008-08-01
CN1542607A (en) 2004-11-03

Similar Documents

Publication Publication Date Title
TW200516492A (en) Dynamically shared group completion table between multiple threads
US7941648B2 (en) Methods and apparatus for dynamic instruction controlled reconfigurable register file
US20080126757A1 (en) Cellular engine for a data processing system
EP1121636B1 (en) Multiplier-accumulator configuration for efficient scheduling in a digital signal processor
US5175862A (en) Method and apparatus for a special purpose arithmetic boolean unit
US5794003A (en) Instruction cache associative crossbar switch system
CN101794214B (en) Register renaming system using multi-block physical register mapping table and method thereof
US20080034235A1 (en) Reconfigurable Signal Processor
WO2000022508A2 (en) Forwarding paths and operand sharing in a digital signal processor
CN100555216C (en) A kind of data processing method and processor
CN103221935A (en) Method and apparatus for moving data from a SIMD register file to general purpose register file
US20070294515A1 (en) Register file bit and method for fast context switch
MY122682A (en) System and method for performing context switching and rescheduling of a processor
US20200341772A1 (en) Efficient Architectures For Deep Learning Algorithms
US20080162824A1 (en) Orthogonal Data Memory
US20080209164A1 (en) Microprocessor Architectures
JP2006040254A (en) Reconfigurable circuit and processor
US20110241744A1 (en) Latch-based implementation of a register file for a multi-threaded processor
GB2383868A (en) Cache dynamically configured for simultaneous accesses by multiple computing engines
CN101313290B (en) Performing an N-bit write access to an MxN-bit-only peripheral
CN101387951B (en) Public subcircuit processors with instructions of different kinds
CN101930355B (en) Register circuit realizing grouping addressing and read write control method for register files
CN112074810B (en) Parallel processing apparatus
CN103186502A (en) Register file organization for sharing process contexts of processors
CN101930356B (en) Method for group addressing and read-write controlling of register file for floating-point coprocessor

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees