TW200516492A - Dynamically shared group completion table between multiple threads
- Publication number
- TW200516492A
- Authority
- TW
- Taiwan
- Prior art keywords
- thread
- gct
- completion
- dynamically shared
- simultaneous
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3838—Dependency mechanisms, e.g. register scoreboarding
- G06F9/384—Register renaming
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
- G06F9/3854—Instruction completion, e.g. retiring, committing or graduating
- G06F9/3858—Result writeback, i.e. updating the architectural state or memory
Abstract
A simultaneous multithreading (SMT) system has a dynamically shared group completion table (GCT). SMT performance is improved by configuring the GCT so that an instruction group from each thread can complete simultaneously. The completion table instruction/address array of the GCT has a read port for each thread, allowing simultaneous updating on completion. The forward link array also has a read port for each thread, used to find each thread's next instruction group upon completion. The backward link array has a write port for each thread, so that the backward links of all threads can be updated simultaneously. The GCT has independent pointer management for each thread. The threads simultaneously commit their renamed result registers and simultaneously update their outstanding load and store tag usage.
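The linked-list bookkeeping described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class `SharedGCT`, its methods `dispatch`, `finish`, and `complete_cycle`, and the `Entry` fields are hypothetical names chosen for illustration, and the per-thread read and write ports are approximated by handling every thread's oldest group within one call.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    thread: int
    group: str                  # stand-in for the instruction/address data
    finished: bool = False      # True once every instruction in the group has executed
    fwd: Optional[int] = None   # forward link: next younger group of the same thread
    bwd: Optional[int] = None   # backward link: next older group of the same thread


class SharedGCT:
    """Toy model: one entry pool shared by all threads, one linked list per thread."""

    def __init__(self, size: int, threads: int = 2) -> None:
        self.entries: List[Optional[Entry]] = [None] * size
        self.free: List[int] = list(range(size))
        # Independent pointer management: each thread has its own head and tail.
        self.head: Dict[int, Optional[int]] = {t: None for t in range(threads)}
        self.tail: Dict[int, Optional[int]] = {t: None for t in range(threads)}

    def dispatch(self, thread: int, group: str) -> Optional[int]:
        """Allocate any free entry to the thread; return None if the table is full."""
        if not self.free:
            return None
        idx = self.free.pop(0)
        old_tail = self.tail[thread]
        self.entries[idx] = Entry(thread, group, bwd=old_tail)
        if old_tail is None:
            self.head[thread] = idx
        else:
            self.entries[old_tail].fwd = idx
        self.tail[thread] = idx
        return idx

    def finish(self, idx: int) -> None:
        """Mark a group as fully executed and therefore eligible to complete."""
        self.entries[idx].finished = True

    def complete_cycle(self) -> List[str]:
        """Complete the oldest finished group of every thread in the same cycle."""
        completed: List[str] = []
        for thread, idx in list(self.head.items()):
            if idx is None or not self.entries[idx].finished:
                continue
            entry = self.entries[idx]
            nxt = entry.fwd                     # forward-link read for this thread
            self.head[thread] = nxt
            if nxt is None:
                self.tail[thread] = None
            else:
                self.entries[nxt].bwd = None    # backward-link write for this thread
            self.entries[idx] = None
            self.free.append(idx)               # entry returns to the shared pool
            completed.append(entry.group)
        return completed


gct = SharedGCT(size=4)
a0 = gct.dispatch(0, "T0-G0")
b0 = gct.dispatch(1, "T1-G0")
a1 = gct.dispatch(0, "T0-G1")
gct.finish(a0)
gct.finish(b0)
print(gct.complete_cycle())  # ['T0-G0', 'T1-G0']
```

In this model the entries are not partitioned between threads: whichever thread dispatches next takes the next free slot, and the forward and backward links keep each thread's groups in program order.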
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/422,654 US7472258B2 (en) | 2003-04-21 | 2003-04-21 | Dynamically shared group completion table between multiple threads |
Publications (2)
Publication Number | Publication Date |
---|---|
TW200516492A (en) | 2005-05-16 |
TWI299465B (en) | 2008-08-01 |
Family
ID=33159423
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW093110731A TWI299465B (en) | 2003-04-21 | 2004-04-16 | Simultaneous multithread processor and method therefor |
Country Status (4)
Country | Link |
---|---|
US (1) | US7472258B2 (en) |
JP (1) | JP3927546B2 (en) |
CN (1) | CN1304943C (en) |
TW (1) | TWI299465B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI416408B (en) * | 2009-07-15 | 2013-11-21 | Via Tech Inc | A microprocessor and information storage method thereof |
US9921873B2 (en) | 2012-01-31 | 2018-03-20 | Nvidia Corporation | Controlling work distribution for processing tasks |
Families Citing this family (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4276201B2 (en) * | 2005-03-31 | 2009-06-10 | 富士通株式会社 | Billing processing apparatus for SMT processor, billing processing method, and billing processing program |
US20070204139A1 (en) * | 2006-02-28 | 2007-08-30 | Mips Technologies, Inc. | Compact linked-list-based multi-threaded instruction graduation buffer |
US9069547B2 (en) | 2006-09-22 | 2015-06-30 | Intel Corporation | Instruction and logic for processing text strings |
WO2008077283A1 (en) * | 2006-12-27 | 2008-07-03 | Intel Corporation | Pointer renaming in workqueuing execution model |
JP2008191856A (en) * | 2007-02-02 | 2008-08-21 | Nec Computertechno Ltd | Information processing system |
JP4706030B2 (en) * | 2007-06-19 | 2011-06-22 | 富士通株式会社 | Cache control apparatus and control method |
WO2008155804A1 (en) | 2007-06-20 | 2008-12-24 | Fujitsu Limited | Simultaneous multithreaded instruction completion controller |
WO2008155845A1 (en) * | 2007-06-20 | 2008-12-24 | Fujitsu Limited | Processor |
GB201001621D0 (en) * | 2010-02-01 | 2010-03-17 | Univ Catholique Louvain | A tile-based processor architecture model for high efficiency embedded homogenous multicore platforms |
TWI486966B (en) * | 2010-02-04 | 2015-06-01 | Phison Electronics Corp | Flash memory storage device, controller thereof, and programming management method thereof |
US8521998B2 (en) * | 2010-06-04 | 2013-08-27 | International Business Machines Corporation | Instruction tracking system for processors |
US9207995B2 (en) | 2010-11-03 | 2015-12-08 | International Business Machines Corporation | Mechanism to speed-up multithreaded execution by register file write port reallocation |
US8825915B2 (en) * | 2012-03-12 | 2014-09-02 | International Business Machines Corporation | Input/output port rotation in a storage area network device |
US10095526B2 (en) * | 2012-10-12 | 2018-10-09 | Nvidia Corporation | Technique for improving performance in multi-threaded processing units |
KR102177871B1 (en) * | 2013-12-20 | 2020-11-12 | 삼성전자주식회사 | Function unit for supporting multithreading, processor comprising the same, and operating method thereof |
US9218185B2 (en) * | 2014-03-27 | 2015-12-22 | International Business Machines Corporation | Multithreading capability information retrieval |
CN105224416B (en) * | 2014-05-28 | 2018-08-21 | 联发科技(新加坡)私人有限公司 | Restorative procedure and related electronic device |
US9672045B2 (en) * | 2014-09-30 | 2017-06-06 | International Business Machines Corporation | Checkpoints for a simultaneous multithreading processor |
US20180239532A1 (en) * | 2017-02-23 | 2018-08-23 | Western Digital Technologies, Inc. | Techniques for performing a non-blocking control sync operation |
US10599431B2 (en) | 2017-07-17 | 2020-03-24 | International Business Machines Corporation | Managing backend resources via frontend steering or stalls |
US10802829B2 (en) | 2017-11-30 | 2020-10-13 | International Business Machines Corporation | Scalable dependency matrix with wake-up columns for long latency instructions in an out-of-order processor |
US10572264B2 (en) * | 2017-11-30 | 2020-02-25 | International Business Machines Corporation | Completing coalesced global completion table entries in an out-of-order processor |
US10901744B2 (en) | 2017-11-30 | 2021-01-26 | International Business Machines Corporation | Buffered instruction dispatching to an issue queue |
US10942747B2 (en) | 2017-11-30 | 2021-03-09 | International Business Machines Corporation | Head and tail pointer manipulation in a first-in-first-out issue queue |
US10884753B2 (en) | 2017-11-30 | 2021-01-05 | International Business Machines Corporation | Issue queue with dynamic shifting between ports |
US10929140B2 (en) | 2017-11-30 | 2021-02-23 | International Business Machines Corporation | Scalable dependency matrix with a single summary bit in an out-of-order processor |
US10922087B2 (en) | 2017-11-30 | 2021-02-16 | International Business Machines Corporation | Block based allocation and deallocation of issue queue entries |
US10564976B2 (en) | 2017-11-30 | 2020-02-18 | International Business Machines Corporation | Scalable dependency matrix with multiple summary bits in an out-of-order processor |
US10564979B2 (en) * | 2017-11-30 | 2020-02-18 | International Business Machines Corporation | Coalescing global completion table entries in an out-of-order processor |
US10725786B2 (en) | 2018-08-23 | 2020-07-28 | International Business Machines Corporation | Completion mechanism for a microprocessor instruction completion table |
CN109831434B (en) * | 2019-01-31 | 2021-03-02 | 西安微电子技术研究所 | Multi-protocol communication exchange controller based on user-defined exchange strategy |
US10977041B2 (en) | 2019-02-27 | 2021-04-13 | International Business Machines Corporation | Offset-based mechanism for storage in global completion tables |
US11163581B2 (en) * | 2019-10-21 | 2021-11-02 | Arm Limited | Online instruction tagging |
CN111708622B (en) * | 2020-05-28 | 2022-06-10 | 山东云海国创云计算装备产业创新中心有限公司 | Instruction group scheduling method, architecture, equipment and storage medium |
CN112214243B (en) * | 2020-10-21 | 2022-05-27 | 上海壁仞智能科技有限公司 | Apparatus and method for configuring cooperative thread bundle in vector operation system |
CN112230901B (en) * | 2020-10-29 | 2023-06-20 | 厦门市易联众易惠科技有限公司 | Network programming framework system and method based on asynchronous IO model |
US11816349B2 (en) | 2021-11-03 | 2023-11-14 | Western Digital Technologies, Inc. | Reduce command latency using block pre-erase |
CN115269008B (en) * | 2022-09-29 | 2023-02-28 | 苏州浪潮智能科技有限公司 | Data processing method, device, medium and electronic equipment |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0644089A (en) | 1992-05-18 | 1994-02-18 | Matsushita Electric Ind Co Ltd | Information processor |
JPH06332700A (en) | 1993-05-25 | 1994-12-02 | Matsushita Electric Ind Co Ltd | Information processor |
US5574935A (en) * | 1993-12-29 | 1996-11-12 | Intel Corporation | Superscalar processor with a multi-port reorder buffer |
US5724565A (en) | 1995-02-03 | 1998-03-03 | International Business Machines Corporation | Method and system for processing first and second sets of instructions by first and second types of processing systems |
US5841999A (en) * | 1996-04-17 | 1998-11-24 | International Business Machines Corporation | Information handling system having a register remap structure using a content addressable table |
US6219728B1 (en) * | 1996-04-22 | 2001-04-17 | Nortel Networks Limited | Method and apparatus for allocating shared memory resources among a plurality of queues each having a threshold value therefor |
JP2882475B2 (en) * | 1996-07-12 | 1999-04-12 | 日本電気株式会社 | Thread execution method |
US5922057A (en) * | 1997-01-10 | 1999-07-13 | Lsi Logic Corporation | Method for multiprocessor system of controlling a dynamically expandable shared queue in which ownership of a queue entry by a processor is indicated by a semaphore |
US6772324B2 (en) * | 1997-12-17 | 2004-08-03 | Intel Corporation | Processor having multiple program counters and trace buffers outside an execution pipeline |
US6134645A (en) * | 1998-06-01 | 2000-10-17 | International Business Machines Corporation | Instruction completion logic distributed among execution units for improving completion efficiency |
JP2000047887A (en) | 1998-07-30 | 2000-02-18 | Toshiba Corp | Speculative multi-thread processing method and its device |
JP3604029B2 (en) | 1999-01-12 | 2004-12-22 | 日本電気株式会社 | Multi-thread processor |
US6594755B1 (en) * | 2000-01-04 | 2003-07-15 | National Semiconductor Corporation | System and method for interleaved execution of multiple independent threads |
US6854075B2 (en) * | 2000-04-19 | 2005-02-08 | Hewlett-Packard Development Company, L.P. | Simultaneous and redundantly threaded processor store instruction comparator |
US6681345B1 (en) | 2000-08-15 | 2004-01-20 | International Business Machines Corporation | Field protection against thread loss in a multithreaded computer processor |
US6931639B1 (en) * | 2000-08-24 | 2005-08-16 | International Business Machines Corporation | Method for implementing a variable-partitioned queue for simultaneous multithreaded processors |
2003
- 2003-04-21 US US10/422,654, patent US7472258B2: not active (Expired - Fee Related)
2004
- 2004-03-08 JP JP2004064719A, patent JP3927546B2: not active (Expired - Fee Related)
- 2004-04-15 CN CNB2004100348872A, patent CN1304943C: not active (Expired - Fee Related)
- 2004-04-16 TW TW093110731A, patent TWI299465B: not active (IP Right Cessation)
Also Published As
Publication number | Publication date |
---|---|
US7472258B2 (en) | 2008-12-30 |
US20040210743A1 (en) | 2004-10-21 |
CN1304943C (en) | 2007-03-14 |
JP3927546B2 (en) | 2007-06-13 |
JP2004326738A (en) | 2004-11-18 |
TWI299465B (en) | 2008-08-01 |
CN1542607A (en) | 2004-11-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
TW200516492A (en) | Dynamically shared group completion table between multiple threads | |
US7941648B2 (en) | Methods and apparatus for dynamic instruction controlled reconfigurable register file | |
US20080126757A1 (en) | Cellular engine for a data processing system | |
EP1121636B1 (en) | Multiplier-accumulator configuration for efficient scheduling in a digital signal processor | |
US5175862A (en) | Method and apparatus for a special purpose arithmetic boolean unit | |
US5794003A (en) | Instruction cache associative crossbar switch system | |
CN101794214B (en) | Register renaming system using multi-block physical register mapping table and method thereof | |
US20080034235A1 (en) | Reconfigurable Signal Processor | |
WO2000022508A2 (en) | Forwarding paths and operand sharing in a digital signal processor | |
CN100555216C (en) | A kind of data processing method and processor | |
CN103221935A (en) | Method and apparatus for moving data from a SIMD register file to general purpose register file | |
US20070294515A1 (en) | Register file bit and method for fast context switch | |
MY122682A (en) | System and method for performing context switching and rescheduling of a processor | |
US20200341772A1 (en) | Efficient Architectures For Deep Learning Algorithms | |
US20080162824A1 (en) | Orthogonal Data Memory | |
US20080209164A1 (en) | Microprocessor Architectures | |
JP2006040254A (en) | Reconfigurable circuit and processor | |
US20110241744A1 (en) | Latch-based implementation of a register file for a multi-threaded processor | |
GB2383868A (en) | Cache dynamically configured for simultaneous accesses by multiple computing engines | |
CN101313290B (en) | Performing an N-bit write access to an MxN-bit-only peripheral | |
CN101387951B (en) | Public subcircuit processors with instructions of different kinds | |
CN101930355B (en) | Register circuit realizing grouping addressing and read write control method for register files | |
CN112074810B (en) | Parallel processing apparatus | |
CN103186502A (en) | Register file organization for sharing process contexts of processors | |
CN101930356B (en) | Method for group addressing and read-write controlling of register file for floating-point coprocessor | |
Legal Events
Code | Title |
---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |