TW200421180A - Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating - Google Patents


Info

Publication number
TW200421180A
Authority
TW
Taiwan
Prior art keywords
processor
synchronous multi
level
circuit
fiber
Prior art date
Application number
TW093103698A
Other languages
Chinese (zh)
Other versions
TWI261198B (en)
Inventor
Gi-Ho Park
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/631,601 (US7152170B2)
Application filed by Samsung Electronics Co Ltd
Publication of TW200421180A
Application granted
Publication of TWI261198B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30189 Instruction operation extension or modification according to execution mode, e.g. mode flag
    • A HUMAN NECESSITIES
    • A41 WEARING APPAREL
    • A41D OUTERWEAR; PROTECTIVE GARMENTS; ACCESSORIES
    • A41D19/00 Gloves
    • A41D19/015 Protective gloves
    • A41D19/01547 Protective gloves with grip improving means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824 Operand accessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Textile Engineering (AREA)
  • Multimedia (AREA)
  • Power Sources (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Processing circuits that are associated with the operation of threads in an SMT processor can be configured to operate at different performance levels based on the number of threads currently operated by the SMT processor. For example, in some embodiments according to the invention, processing circuits that are associated with the operation of a thread in the SMT processor, such as a floating point unit or a data cache, can operate in either a high power mode or a low power mode based on the number of threads currently operated by the SMT processor. Furthermore, as the number of threads operated by the SMT processor increases, the performance levels of the processing circuits can be decreased, thereby providing the architectural benefits of the SMT processor while allowing a reduction in the amount of power consumed by the processing circuits associated with the threads. Related computer program products and methods are also disclosed.

Description


Technical Field

The present invention relates to the general architecture of computer processors and, more particularly, to simultaneous multi-threading computer processors, related computer program products, and methods of operating the same.

Background

Simultaneous Multi-Threading (SMT) is a processor architecture that uses hardware multi-threading to allow multiple threads to each issue instructions during every cycle. Unlike other hardware multi-threaded architectures, in which only a single hardware context (that is, a thread) is active in any given cycle, an SMT architecture allows all thread contexts to compete for and share processor resources simultaneously.

An SMT processor can make use of cycles that would otherwise be wasted, executing instructions that reduce the effect of long-latency operations in the processor. Moreover, as the number of threads increases, overall performance increases, so the power consumed by the SMT processor may also increase.

FIG. 1 is a block diagram of a conventional SMT processor. The operation of the conventional SMT processor of FIG. 1 is disclosed in Dean M. Tullsen, Susan J. Eggers, Henry M. Levy, Jack L. Lo, Rebecca L. Stamm, et al., "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor," The 23rd Annual International Symposium on Computer Architecture, pp. 191-202, 1996.

That paper is incorporated herein by reference. The architecture and operation of conventional SMT processors are well known to those skilled in the art and are therefore not described further here.

Summary of the Invention

Embodiments of the invention provide processing circuits, computer program products, and/or methods for operating at different performance levels based on the number of threads operated by an SMT processor. For example, in some embodiments according to the invention, a processing circuit associated with the operation of a thread in the SMT processor, such as a floating point unit or a data cache, can operate in a high power mode or a low power mode based on the number of threads currently operated by the SMT processor. Furthermore, as the number of threads operated by the SMT processor increases, the performance levels of the processing circuits can be decreased, thereby providing the architectural benefits of the SMT processor while reducing the power consumed by the processing circuits associated with the threads. The processor may also operate at the same power but with higher performance, or at lower power but with higher performance, compared with a conventional SMT processor.

In some embodiments according to the invention, the processing circuit is configured to operate at a first performance level when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, and to operate at a second performance level when that number is greater than the threshold value.

In some embodiments according to the invention, a performance level control circuit is configured to provide a performance level to the processing circuit based on the number of threads currently operated by the SMT processor. In some embodiments, the performance level provided to the processing circuit is raised to a first performance level when the number of threads currently operated is less than or equal to a first threshold value, and the performance level provided to at least one processing circuit is lowered to a second performance level, below the first, when that number exceeds the first threshold value. In some embodiments, when the number of threads currently operated exceeds a second threshold value that is greater than the first threshold value, the performance level is lowered to a third performance level that is below the second performance level.

The invention provides embodiments with various kinds of performance level. For example, in some embodiments the processing circuit can be a cache memory circuit that includes a tag memory and a data memory. When the cache memory circuit operates at the first performance level, the data memory provides cached data that is accessed concurrently with the tag memory. When the cache memory circuit operates at a second performance level that is below the first, the data memory provides cached data in response to a hit in the tag memory.

In some embodiments according to the invention, the cache memory can be at least one of a data cache memory, used to store data, and an instruction cache memory, used to store the instructions that operate on the associated data.

In some embodiments according to the invention, the data memory is further configured, when operating at the second performance level, not to provide cached data in response to a miss in the tag memory.

In some embodiments according to the invention, the processing circuit can be a floating point unit. In some embodiments, the floating point unit is a first floating point unit that is configured to operate at a first performance level when the number of threads currently operated by the SMT processor is less than or equal to a threshold value. The SMT processor further includes a second floating point circuit that is configured to operate at a second performance level when the number of threads currently operated is greater than the threshold value.

In some embodiments according to the invention, a performance level control circuit is configured to increase or decrease the number of threads it records as currently operated by the SMT processor in response to threads being created or completed in the SMT processor, respectively.

In some embodiments according to the invention, a second processing circuit is configured to operate at a second performance level, below the first performance level, in response to the number of threads currently operated by the SMT processor having increased to the threshold value.

In some embodiments according to the invention, the performance level control circuit is configured to decrease the performance level provided to at least one processing circuit when a new thread is added that raises the number of threads currently operated by the SMT processor from less than or equal to the threshold value to greater than the threshold value.

In some embodiments according to the invention, the performance level control circuit is configured to lower the performance level of the processing circuits to one of a plurality of descending performance levels as the number of threads currently operated by the SMT processor exceeds successive ascending threshold values.

In some embodiments according to the invention, when the number of threads currently operated by the SMT processor increases from less than or equal to the threshold value to greater than the threshold value, the performance level control circuit is configured to maintain a first performance level for a first processing circuit and to provide a second performance level, below the first, to a second processing circuit.
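The mapping from ascending threshold values to descending performance levels can be sketched in software as follows. This is only an illustrative model, not the circuit the patent describes: the function name `select_performance_level`, the threshold values 2 and 4, and the level labels are all assumptions chosen for the example.

```python
from bisect import bisect_left


def select_performance_level(thread_count, thresholds, levels):
    """Return the level for thread_count, given ascending thresholds.

    levels[0] applies while thread_count <= thresholds[0]; each threshold
    that is exceeded moves the result one step down the list of levels.
    """
    if len(levels) != len(thresholds) + 1:
        raise ValueError("need exactly one more level than thresholds")
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly ascending")
    # bisect_left counts the thresholds strictly below thread_count,
    # i.e. the number of thresholds the thread count has exceeded.
    return levels[bisect_left(thresholds, thread_count)]


# Hypothetical configuration: two thresholds, three descending levels.
THRESHOLDS = [2, 4]
LEVELS = ["high", "medium", "low"]

for n in range(7):
    print(n, select_performance_level(n, THRESHOLDS, LEVELS))
```

With a single threshold and two levels, the same function reduces to the first/second performance level rule: the first level while the thread count is less than or equal to the threshold, the second level once it is greater.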

In other embodiments according to the invention, the performance level control circuit is configured to provide a performance level to the processing circuits in the SMT processor based on the number of threads currently operated by the SMT processor.

In other embodiments according to the invention, a circuit can be configured to allocate processing circuits associated with the SMT processor to a thread operated in the SMT processor when the thread is created. The performance level control circuit provides one of the performance levels to the processing circuits according to the number of threads currently operated by the SMT processor.

In other embodiments according to the invention, a cache memory associated with the SMT processor includes a tag memory and a data memory. Depending on the number of threads currently operated by the SMT processor, the data memory is accessed either concurrently with the tag memory or after the tag memory has been accessed.

To make the above and other objects, features, and advantages of the invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Detailed Description of the Embodiments

Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. Like reference numbers refer to like elements throughout. Other features and/or advantages of the invention can be fully understood from the description and practice of the invention.

Although the terms first and second are used herein to describe various elements, those skilled in the art will understand that the elements are not limited by these terms, which serve only to distinguish one element from another. Thus, a first element discussed here could be termed a second element in another section, and likewise a second element could be termed a first element, without departing from the spirit of the invention.

Those skilled in the art will also understand that the invention can be embodied as circuits and/or computer program products. Accordingly, the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines hardware and software. The invention can also take the form of a computer program product on a computer-readable storage medium that has computer program code embodied in the medium. Any suitable computer-readable medium can be used, including hard disks, CD-ROMs, optical storage devices, and magnetic storage devices.

Computer program code ("code") for carrying out operations of the invention can be written in an object-oriented programming language such as JAVA, Smalltalk, or C++, in JavaScript, Visual Basic, TSQL, or Perl, or in various other programming languages. Software embodiments of the invention do not depend on implementation in any particular programming language. Portions of the code can execute entirely on one or more systems used by an intermediary server.

The code can execute entirely on one or more computer systems, or partly on a server and partly on a client within a client device, or as a proxy server at an intermediate point in a communications network. In the latter case, the client device may be connected to a server over a local area network (LAN) or a wide area network (WAN) such as the Internet, or the connection may be made through the Internet (for example, via an Internet Service Provider). The invention can be implemented with various protocols running over different types of computer networks.

The operation of the invention is described in detail below with reference to block diagrams and flowcharts of methods, systems, and computer program products according to embodiments of the invention. Those skilled in the art will understand that each block of the block diagrams and flowcharts, and combinations of those blocks, can be implemented by computer program instructions. These computer program instructions can be provided to an SMT processor circuit, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, implement the functions specified in the block diagram and/or flowchart block or blocks.

The computer program instructions can also be stored in a computer-readable memory that directs a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instructions that implement the functions specified in the block diagram and/or flowchart block or blocks.

The computer program instructions can also be loaded onto an SMT processor circuit or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus, producing a computer-implemented process in which the instructions that execute on the computer or other programmable apparatus implement the functions specified in the block diagram and/or flowchart block or blocks.

Embodiments of the invention provide processing circuits that are associated with the operation of threads in an SMT processor and are configured to operate at different performance levels based on the number of threads currently operated by the SMT processor. Those skilled in the art will understand that the different performance levels can involve different circuit operating speeds and/or different levels of precision. In some embodiments according to the invention, the processing circuits can operate at different clock rates and/or use different types of circuits (such as different types of CMOS devices) to provide the different performance levels. For example, in some embodiments a processing circuit associated with the operation of a thread, such as a floating point unit or a data cache, can operate in a high power mode (for example, at a high clock rate) or in a low power mode (at a low clock rate) based on the number of threads currently operated by the SMT processor. Furthermore, as the number of threads increases, the performance levels of the processing circuits can be decreased, providing the architectural benefits of the SMT processor while reducing the power consumed by the processing circuits associated with the threads.

Those skilled in the art will understand that embodiments of the invention can exploit thread-level parallelism, that is, the use of multiple threads to execute multiple processes in parallel. As used herein, the term "thread" can refer to a separate process having associated instructions and data. A thread can represent a process that is part of a parallel computer program made up of multiple processes. A thread can also represent an independent computer program that operates separately from other programs. Each thread can have an associated state defined, for example, by its instructions, data, program counter, and/or registers.

The state associated with a thread can include enough information for the thread to be executed by a processor.

In some embodiments according to the invention, a performance level control circuit is configured to provide performance levels to the processing circuits allocated to the threads operated by the SMT processor. For example, the performance level control circuit can provide a first performance level to a processing circuit so that it operates in a high power mode, or a second performance level so that it operates in a low power mode. In other embodiments according to the invention, the performance level control circuit also provides intermediate performance levels (that is, other performance levels between high power and low power).

In some embodiments according to the invention, the processing circuit can be a cache memory that includes a tag memory and a data memory. When the cache memory operates at the first performance level (that is, in the high power mode), the tag memory and the data memory are accessed concurrently, regardless of whether the tag memory access results in a hit.

Because the hit rate of the tag memory can be quite high, accessing the data memory at the same time can provide better performance. Alternatively, the cache memory can operate at the second performance level (that is, in the low power mode), in which the data memory is accessed only in response to a hit in the tag memory. As a result, when a tag miss occurs, some of the power consumption associated with accessing the data memory is avoided. In addition, when a tag hit occurs, the data memory access may be offset in time from the tag memory access.
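The difference between the two cache operating modes can be illustrated with a small software model. This is a minimal sketch under stated assumptions: a direct-mapped cache with four lines, a hypothetical class name `ToyCache`, and a counter of data-array reads standing in for the power spent on data memory accesses. None of these details come from the patent, and the model does not capture the timing offset between the tag and data accesses.

```python
class ToyCache:
    """Direct-mapped cache model that counts reads of the data array."""

    def __init__(self, num_lines=4):
        self.num_lines = num_lines
        self.tags = [None] * num_lines   # tag memory
        self.data = [None] * num_lines   # data memory
        self.data_reads = 0              # proxy for data-array energy

    def fill(self, address, value):
        """Install a value for address (no replacement policy modelled)."""
        index = address % self.num_lines
        self.tags[index] = address // self.num_lines
        self.data[index] = value

    def read(self, address, low_power):
        """Return the cached value on a hit, or None on a miss."""
        index = address % self.num_lines
        tag = address // self.num_lines
        if not low_power:
            # High power mode: the data array is read alongside the tag
            # array on every access, whether or not the tag matches.
            self.data_reads += 1
            candidate = self.data[index]
            return candidate if self.tags[index] == tag else None
        # Low power mode: check the tag first and read the data array
        # only when the tag comparison reports a hit.
        if self.tags[index] != tag:
            return None
        self.data_reads += 1
        return self.data[index]


fast, frugal = ToyCache(), ToyCache()
for cache in (fast, frugal):
    cache.fill(5, "a")

for address in (5, 9):  # address 9 maps to the same line but misses
    fast.read(address, low_power=False)
    frugal.read(address, low_power=True)

print(fast.data_reads, frugal.data_reads)
```

In this model, the high power cache reads the data array on both accesses, while the low power cache skips the read on the miss, which is the saving the passage above attributes to the second performance level.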
In other embodiments according to the invention, the processing circuit associated with a thread operated by the SMT processor can be an instruction cache or another type of processing circuit, such as a floating point circuit or an integer/load-store circuit. Each processing circuit can operate at a different performance level. For example, in some embodiments according to the invention, the cache memory, the floating point circuits, and the integer/load-store circuits can all operate concurrently at different performance levels.

In other embodiments according to the invention, processing circuits of the same type (for example, floating point circuits or integer/load-store circuits) can be divided into different performance classes, so that some circuits are dedicated to operating at the first performance level and others are dedicated to operating at the second performance level. For example, in some embodiments some of the floating point circuits allocated to threads in the SMT processor are configured to operate in the high power mode, while the other floating point circuits allocated to threads are configured to operate in the low power mode.

FIG. 2 is a block diagram of an SMT processor according to embodiments of the invention. Referring to FIG. 2, when a new thread is created in an SMT processor 200, a thread management circuit 205 allocates a set of processing circuits to the new thread. The allocated processing circuits can include a program counter 215, a set of floating point registers 245, and a set of integer registers 250. Other processing circuits can also be allocated to the new thread. Those skilled in the art will understand that when a thread completes, the processing circuits allocated to it are released so that they can be allocated to threads created later.

In operation, a fetch circuit 210 fetches an instruction from an instruction cache 220 at a location provided by the allocated program counter 215 and outputs it to a decoder 225. The decoder 225 outputs the decoded instruction to a register renaming circuit 230. Depending on the type of instruction output by the register renaming circuit 230, the renamed instruction is sent to a floating point instruction queue 235 or to an integer instruction queue 240: a floating point instruction is loaded into the floating point instruction queue 235, and an integer instruction is loaded into the integer instruction queue 240.

Instructions output from the floating point instruction queue 235 or the integer instruction queue 240 are loaded into the associated registers for execution by the corresponding floating point circuit 255 or integer/load-store circuit 260.
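The routing step performed after register renaming can be pictured with a short sketch. It only models the choice between the two queues by instruction type. The function name `dispatch`, the `"fp"`/`"int"` type labels, and the sample instruction strings are hypothetical and are not taken from the patent.

```python
from collections import deque


def dispatch(renamed_instructions):
    """Route (kind, text) pairs into a floating point or integer queue."""
    fp_queue, int_queue = deque(), deque()
    for kind, text in renamed_instructions:
        if kind == "fp":
            fp_queue.append(text)    # floating point instruction queue
        elif kind == "int":
            int_queue.append(text)   # integer instruction queue
        else:
            raise ValueError(f"unknown instruction type: {kind}")
    return fp_queue, int_queue


# Hypothetical renamed instruction stream.
stream = [("int", "add r1, r2, r3"), ("fp", "fadd f1, f2, f3"), ("int", "load r4, 0(r1)")]
fp_q, int_q = dispatch(stream)
print(list(fp_q))
print(list(int_q))
```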

200421180 五、發明說明(12) 々曰令會從浮點指令狩列2 3 5,彳查、* $ , ^ 在浮點暫存号245中的Π :送至一組洋點暫存器“5。 ΐ Λ可存取儲存在資料快取265中的浮點資肖,以使ί 二路255 (從浮點暫存器245)執行指令時,可參考儲 存在―貝料快取2 6 5中的資料。 哼储 川整令會從整數指令件列240,傳送至整數暫存器 250中^整數载户入八一儲存電路260可存取儲存在整數暫存器 26。亦可存取該些指令。整數/載人-儲存電路 中的敕Λ入 / 以使得儲存在整數暫存器25〇 二的正數&令’可參考儲存在資料快取m中的整數資 供Λ據Λ發明實施例’絮管理電路205對資料快取265提 2^'二人較明確、地說,效能水平可控制資料快取 ,效旎水平或第二效能水平(也就是在高功率模 模式)下操作。舉例而言,絮管理電路205可 #二,A ^ =水平,以使得資料快取265在高功率模式下 低Λ瘟二:提供第二效能水平,以使得資料快取265可在 265的=下操作。熟習相關技藝者當知雖然資料快取 明疋士以第一效能水平或第二效能水平在此做說 據本發明的部分實施例中,亦可使用其他 效能水平。 第3圖係顯示一個根據本發明實施例的絮管理電路的方 塊圖。請參考第3圖所示,絮管理電路3〇5從作業系統 (operating system),或從與在同步多絮處理器中所建立 13145pif.ptd 第17頁 五、發明說明(13) 的一絮相關之一個却 包括—個絮配置電跋Γ,電路接收資訊。絮管理電路305 用來將根據本發明的^ read allocation circuit)330, 所建立的絮使用。的處理電路,配置給由同步多絮處理器200421180 V. Description of the invention (12) The 々 令 will list 2 3 5 from the floating-point instruction, check, * $, ^ Π in the floating-point temporary number 245: Send to a group of foreign-point temporary registers " 5. ΐ Λ can access the floating-point asset stored in the data cache 265, so that when the two-path 255 (from the floating-point register 245) executes the instruction, it can refer to the storage in ―Shell Cache 2 6 The data in 5. The hum Chuchuan order will be sent from the integer instruction list 240 to the integer register 250. The integer is stored in the Bayi storage circuit 260 and can be stored in the integer register 26. It can also be stored Take these instructions. Integer / man-in-storage circuit 入 Λ 入 / so that the positive number stored in the integer register 2502 & order can refer to the integer data stored in the data cache m for data Λ Embodiment of the invention 'The management circuit 205 mentions the data cache 265 2 ^' The two are more specific, that is, the performance level can control the data cache, the efficiency level or the second efficiency level (that is, in the high-power mode) ). 
For example, the thread management circuit 205 can provide the first performance level so that the data cache 265 operates in the high-power mode, or the second performance level so that the data cache 265 operates in the low-power mode. Those skilled in the art will appreciate that although the data cache 265 is described here as operating at the first or the second performance level, other performance levels may be used in some embodiments of the invention.

FIG. 3 is a block diagram of a thread management circuit according to embodiments of the invention. Referring to FIG. 3, the thread management circuit 305 receives information from an operating system, or from a thread generation circuit associated with a thread created in the simultaneous multi-threading (SMT) processor. The thread management circuit 305 includes a thread allocation circuit 330, which allocates the processing circuits according to the invention to the threads created by the SMT processor.

The thread management circuit 305 further includes a performance level control circuit 340, which provides a performance level to the processing circuits associated with the threads created by the SMT processor. The performance level control circuit 340 can provide the performance level based on the number of threads currently operated by the SMT processor. More specifically, as the number of threads currently operated by the SMT processor increases, the performance level control circuit 340 can provide progressively lower performance levels to the processing circuits associated with those threads. By responding to the creation and completion of threads in the SMT processor, the performance level control circuit 340 can increment or decrement an internal count and thereby determine the number of threads currently operated.

Those skilled in the art will appreciate that the performance level provided to the processing circuits may have a default value, such as the first performance level (the high-power mode). As threads are added, the performance level provided to the processing circuits can then be lowered, which reduces both the performance and the power consumption of the processing circuits.
Those skilled in the art will appreciate that the performance level can be provided to the processing circuits over a signal line that carries a signal having at least two states, one for the first performance level and one for the second. For example, immediately after the SMT processor is initialized, the number of threads it operates is zero and the performance level provided to the processing circuits takes its default value, the first performance level (high-power mode). As threads are added and their total eventually exceeds a threshold, the performance level can be changed to the second performance level by changing the state of the signal that indicates which performance level is to be used.

FIG. 4 is a block diagram of a performance level control circuit according to embodiments of the invention. Referring to FIG. 4, a counter circuit 405 receives information from the operating system or from the thread generation circuit described with reference to FIG. 3, in order to determine the number of threads currently operated by the SMT processor. For example, if the received information relates to the creation of a new thread and the counter circuit 405 indicates that the SMT processor has already started four threads, the counter circuit 405 increments its count to reflect that the SMT processor now operates five threads.
The counter circuit 405 provides the number of threads currently operated by the SMT processor to a comparator circuit 410. A threshold is provided to the comparator circuit 410 together with that number. The threshold may indicate the number of threads above which the performance level is to be lowered. Accordingly, while the number of threads currently operated by the SMT processor is less than or equal to the threshold, the performance level provided to the processing circuits is kept at the first performance level of the high-power mode. When the number of threads exceeds the threshold, the performance level can be lowered to reduce the power consumption of the SMT processor.

FIG. 5 is a flowchart illustrating the operation of a performance level control circuit according to embodiments of the invention. Referring to FIG. 5, immediately after the SMT processor is initialized, the number N of threads it currently operates is zero (block 500). As threads are created and completed in the SMT processor, N is incremented or decremented (block 505). For example, when the SMT processor currently operates four threads, N is 4. When a new thread is created, N increases to 5, and when one of those threads subsequently completes, N returns to 4.
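The counter and comparator arrangement of FIG. 4, together with the counting steps of blocks 500 and 505, can be pictured with a small software model. The following is only a sketch under stated assumptions: the class name, the method names, the numeric encoding of the two performance levels and the threshold value of 4 are all hypothetical and do not come from the specification, which describes hardware circuits rather than software.

```python
from dataclasses import dataclass

HIGH_POWER = 1  # first performance level (assumed encoding)
LOW_POWER = 2   # second performance level (assumed encoding)


@dataclass
class PerformanceLevelControl:
    """Software stand-in for counter circuit 405 feeding comparator circuit 410."""
    threshold: int          # thread count above which the level is lowered
    thread_count: int = 0   # N, zero right after initialization (block 500)

    def thread_created(self) -> None:
        # Block 505: a newly created thread increments N.
        self.thread_count += 1

    def thread_completed(self) -> None:
        # Block 505: a completed thread decrements N.
        if self.thread_count == 0:
            raise ValueError("no thread is currently operating")
        self.thread_count -= 1

    def level(self) -> int:
        # Comparator: first level while N <= threshold, second level otherwise.
        return HIGH_POWER if self.thread_count <= self.threshold else LOW_POWER


ctrl = PerformanceLevelControl(threshold=4)  # threshold value is an assumption
for _ in range(4):
    ctrl.thread_created()
print(ctrl.level())   # four threads: at or below the threshold
ctrl.thread_created()
print(ctrl.level())   # five threads: above the threshold
ctrl.thread_completed()
print(ctrl.level())   # back to four threads
```

With the assumed threshold of 4, the model stays at the first level for four threads, drops to the second level when a fifth thread is created, and returns to the first level when one thread completes.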

The number of threads currently operated by the SMT processor is compared with a threshold (block 510). If the number of threads is less than or equal to the threshold, the performance level control circuit provides the first performance level to the processing circuits allocated to the threads (block 515). For example, if the processing circuit allocated to a thread is the cache memory described with reference to FIG. 2, the cache memory can access its tag memory and its data memory at the same time (that is, it operates in the high-power mode). If, on the other hand, the number of threads currently operated by the SMT processor is greater than the threshold (block 510), the performance level control circuit provides the second performance level to the processing circuits associated with the threads (block 520). For example, in the embodiment described above with reference to FIG. 2, the cache memory can operate at the second performance level so that the data memory is accessed only in response to a hit in the tag memory (that is, it operates in the low-power mode).

FIG. 6 is a block diagram of a cache memory according to embodiments of the invention. Referring to FIG. 6, a tag memory 610 is configured to store the addresses of the data stored in a data memory 620.
The tag memory 610 is accessed using the address associated with the data to be processed by the SMT processor. A tag comparison circuit 630 compares the entries in the tag memory 610 with that address to determine whether the data required by the SMT processor is stored in the data memory 620. If the tag comparison circuit 630 determines that the required data is stored in the data memory 620, a tag hit occurs; otherwise, a tag miss occurs. If a tag hit occurs, an output enable circuit 650 enables the data to be output from the data memory 620.

According to embodiments of the invention, the performance level provided by the performance level control circuit controls how the tag memory 610 and the data memory 620 operate. More specifically, if the first performance level is provided to the cache memory, a data memory enable circuit 640 enables the data memory 620 to be accessed at the same time as the tag memory 610, regardless of whether a tag hit occurs. Conversely, if the second performance level is provided to the cache memory, the data memory enable circuit 640 does not allow the data memory 620 to be accessed unless there is a tag hit.

Therefore, in embodiments according to the invention, the tag memory 610 and the data memory 620 can be accessed at the same time in the high-power mode, which provides better performance, whereas in the low-power mode the data memory 620 is accessed only when the tag memory 610 indicates a tag hit, which reduces the power consumption of the cache memory.

FIG. 7 is a block diagram of an instruction cache according to embodiments of the invention. Referring to FIG. 7, a thread management circuit 700 allocates an instruction cache 722 to a new thread. The performance level control circuit included in the thread management circuit 300 provides a performance level to the instruction cache 722 to control how the instruction cache 722 operates.
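The two access modes described for the cache memory of FIG. 6 can be illustrated with a toy model that counts how often the data memory is activated. This is a minimal sketch, not the circuit itself: the direct-mapped organization, the class and method names, and the use of an activation counter as a stand-in for power are all assumptions made for illustration.

```python
HIGH_POWER = 1  # first performance level: tag and data accessed together
LOW_POWER = 2   # second performance level: data accessed only after a tag hit


class CacheModel:
    """Toy direct-mapped cache that counts data-memory activations."""

    def __init__(self, num_lines: int) -> None:
        self.num_lines = num_lines
        self.tags = [None] * num_lines   # stands in for tag memory 610
        self.data = [None] * num_lines   # stands in for data memory 620
        self.data_activations = 0        # rough proxy for data-memory power

    def fill(self, address: int, value: int) -> None:
        index = address % self.num_lines
        self.tags[index] = address // self.num_lines
        self.data[index] = value

    def read(self, address: int, level: int):
        index = address % self.num_lines
        tag = address // self.num_lines
        speculative = None
        if level == HIGH_POWER:
            # Data memory is enabled in parallel with the tag lookup.
            self.data_activations += 1
            speculative = self.data[index]
        hit = self.tags[index] == tag    # tag comparison (circuit 630)
        if not hit:
            return None                  # output stays disabled on a miss
        if level == LOW_POWER:
            # Data memory is enabled only once the hit is known.
            self.data_activations += 1
            return self.data[index]
        return speculative


cache = CacheModel(num_lines=4)
cache.fill(5, 42)
for level in (HIGH_POWER, LOW_POWER):
    cache.data_activations = 0
    cache.read(5, level)   # hit
    cache.read(9, level)   # miss: same index, different tag
    print(level, cache.data_activations)
```

In this model one hit followed by one miss activates the data memory twice at the first performance level and once at the second, which mirrors the trade-off in the text: the parallel access saves latency on hits, while the serial access avoids data-memory activity on misses.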

More specifically, the instruction cache 722 can operate in the high-power mode in response to the first performance level, and is also configured to operate in the low-power mode in response to the second performance level. As described above with reference to FIG. 5, the first or the second performance level can be provided to the instruction cache 722 based on the number of threads currently processed by the SMT processor. In addition, the instruction cache 722 can operate at different performance levels in a manner similar to that described above with reference to FIG. 6, in which the data memory 620 is accessed in the low-power mode only in response to a hit. For example, when it has been determined that the next memory access is to the same cache line, a different performance level can be provided to the instruction cache to allow it to perform a direct access. This type of restriction can be avoided with a direct-addressed cache that permits reading the tag random access memory (RAM); this approach can also avoid performing a tag comparison.
In addition, in a direct-addressed cache, the translation of a virtual address into a physical address can also be avoided.

FIG. 8 is a block diagram of separate processing circuits having different performance levels according to embodiments of the invention. Referring to FIG. 8, a first floating-point circuit 805 is configured to operate at a first performance level, and a second floating-point circuit 815 is configured to operate at a second performance level that is lower than the first. In other words, the first floating-point circuit 805 is suited to the high-power mode and the second floating-point circuit 815 is suited to the low-power mode.

A first integer/load-store circuit 810 is configured to perform processing at the first performance level, and a second integer/load-store circuit 820 is configured to perform processing at the second performance level. A thread management circuit 800 is configured to provide two different performance levels. More specifically, the first performance level is provided to the first floating-point circuit 805 and the first integer/load-store circuit 810, and the second performance level provided by the thread management circuit 800 is provided to the second floating-point circuit 815 and the second integer/load-store circuit 820. The first floating-point circuit 805 and the first integer/load-store circuit 810 can be allocated to threads that operate at the first performance level, and the second floating-point circuit 815 and the second integer/load-store circuit 820 can be allocated to threads that operate at the second performance level. Those skilled in the art will appreciate that the thread management circuit 800 can provide the first and second performance levels separately or at the same time. It will also be understood that more than two floating-point circuits and integer/load-store circuits can be provided when other performance levels are required.
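One way to picture the arrangement of FIG. 8 is as two sets of execution circuits, with the set handed to a thread chosen by performance level. The sketch below assumes a simple lookup table; the dictionary keys, the function name and the numeric encoding of the levels are hypothetical, and only the reference numerals 805, 810, 815 and 820 come from the description.

```python
HIGH_POWER = 1  # first performance level (assumed encoding)
LOW_POWER = 2   # second performance level (assumed encoding)

# Reference numerals from FIG. 8, grouped by the performance level each serves.
CIRCUITS_BY_LEVEL = {
    HIGH_POWER: {"floating_point": 805, "integer_load_store": 810},
    LOW_POWER: {"floating_point": 815, "integer_load_store": 820},
}


def allocate_circuits(level: int) -> dict:
    """Return the execution circuits a thread running at `level` would use."""
    try:
        return dict(CIRCUITS_BY_LEVEL[level])
    except KeyError:
        raise ValueError(f"unsupported performance level: {level}") from None


print(allocate_circuits(HIGH_POWER))
print(allocate_circuits(LOW_POWER))
```

Supporting a further performance level in this model would amount to adding another entry to the table, which corresponds to providing more than two floating-point and integer/load-store circuits.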
According to embodiments of the invention, when the number of threads currently operated by the SMT processor is less than or equal to a threshold, the first performance level is provided to the first floating-point circuit 805 and the first integer/load-store circuit 810. When the number of threads currently operated by the SMT processor exceeds the threshold, the second performance level is provided to the second floating-point circuit 815 and the second integer/load-store circuit 820. Therefore, once the number of threads exceeds the threshold, all threads (both the existing threads and newly created ones) can use the second floating-point circuit 815 and the second integer/load-store circuit 820, which reduces the power consumption of the SMT processor.

Those skilled in the art will appreciate that the floating-point circuits and the integer/load-store circuits can operate at different clock rates that provide different performance levels, and/or can use different circuit types (for example, different types of CMOS devices). For example, in some embodiments according to the invention, the floating-point circuits associated with the operation of threads in the SMT processor can operate in a high-power mode at a high clock rate or in a low-power mode at a low clock rate, depending on the number of threads currently operated by the SMT processor.

FIG. 9 is a block diagram of an embodiment of an SMT processor that includes processing circuits responsive to different performance levels provided by a thread management circuit 900. The thread management circuit 900 provides three separate performance levels to an instruction cache 930, a data cache 965, first and second floating-point circuits 905, 915, and first and second integer/load-store circuits 910, 920. Those skilled in the art will appreciate that the first and second floating-point circuits 905, 915 and the first and second integer/load-store circuits 910, 920 can respond to the performance level provided to them in the manner described with reference to FIG. 8. In addition, the data cache 965 and the instruction cache 930 can operate in the manner described above with reference to FIG. 2 and FIG. 7, respectively.

Different performance levels can therefore be provided to different processing circuits so that they operate at different performance levels, which allows a finer trade-off between performance and power consumption. For example, the instruction cache 930 can operate at the first performance level while the data cache 965, the first and second floating-point circuits 905, 915 and the first and second integer/load-store circuits 910, 920 operate at the second performance level. Other combinations of performance levels can also be used.

FIG. 10 is a block diagram illustrating the operation of an embodiment of the performance level control circuit included in the thread management circuit 900 shown in FIG. 9. Specifically, the performance level control circuit includes the following components. A counter 1000 increments or decrements its count in response to threads being created and completed in the SMT processor. First, second and third registers 1015, 1020, 1025 each store a respective threshold that corresponds to a number of threads operated by the SMT processor. The first register 1015, which stores the first threshold, is electrically connected to a first comparator circuit 1030. The second register 1020, which stores the second threshold, is electrically connected to a second comparator circuit 1035. The third register 1025, which stores the third threshold, is electrically connected to a third comparator circuit 1040.
Each of the comparator circuits 1030, 1035, 1040 compares the number of threads currently operated by the SMT processor with the threshold stored in its associated register. If the first comparator circuit 1030 determines that the number of threads is greater than the first threshold stored in the first register 1015, the first comparator circuit 1030 generates the performance level 1045 that is output to the data cache 965. Therefore, when the number of threads currently operated by the SMT processor exceeds the threshold stored in the first register 1015, the performance level of the data cache 965 changes from the first performance level to the second performance level (that is, from the high-power mode to the low-power mode).

If the second comparator circuit 1035 determines that the number of threads is greater than the threshold stored in the second register 1020, the second comparator circuit 1035 generates the performance level 1050 that is output to the instruction cache 930. Therefore, when the number of threads currently operated by the SMT processor exceeds the threshold stored in the second register 1020, the performance level of the instruction cache 930 changes from the first performance level to the second performance level (that is, from the high-power mode to the low-power mode).

If the third comparator circuit 1040 determines that the number of threads is greater than the threshold stored in the third register 1025, the third comparator circuit 1040 generates the performance level 1055 that is output to the first and second floating-point circuits 905, 915 and to the first and second integer/load-store circuits 910, 920.
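The three register and comparator pairs of FIG. 10 can be modeled as one thread count compared against three independent thresholds, each producing its own performance level. The following is a hedged sketch: the class name, the dictionary keys and in particular the threshold values 2, 4 and 6 are assumptions chosen for illustration, since the description does not give numeric thresholds.

```python
from dataclasses import dataclass, field

HIGH_POWER = 1  # first performance level (assumed encoding)
LOW_POWER = 2   # second performance level (assumed encoding)


@dataclass
class MultiThresholdControl:
    """Counter 1000 with three threshold registers and three comparators."""
    thresholds: dict = field(default_factory=lambda: {
        "data_cache": 2,          # register 1015 -> level 1045 (assumed value)
        "instruction_cache": 4,   # register 1020 -> level 1050 (assumed value)
        "execution_units": 6,     # register 1025 -> level 1055 (assumed value)
    })
    thread_count: int = 0

    def levels(self) -> dict:
        # Each comparator lowers its own output once the count exceeds its threshold.
        return {name: LOW_POWER if self.thread_count > limit else HIGH_POWER
                for name, limit in self.thresholds.items()}


ctrl = MultiThresholdControl()
for n in (0, 3, 5, 7):
    ctrl.thread_count = n
    print(n, ctrl.levels())
```

With ascending thresholds, as assumed here, the data cache is the first circuit to drop to the low-power mode as threads are added, followed by the instruction cache and finally the floating-point and integer/load-store circuits.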
Therefore, when the number of threads currently operated by the SMT processor exceeds the threshold stored in the third register 1025, the performance level of these processing circuits changes from the first performance level to the second performance level (that is, from the high-power mode to the low-power mode). Those skilled in the art will appreciate that the floating-point circuits and integer/load-store circuits that receive the performance level 1055 operate in the manner described above with reference to FIG. 8.

FIG. 11 is a flowchart illustrating method embodiments of the performance level control circuit shown in FIG. 10. Referring to FIG. 11, immediately after the SMT processor is initialized, the number of threads it currently operates is zero (block 1100). As the SMT processor creates new threads or threads complete, the number of threads is incremented or decremented to provide a number N that represents the threads currently operated by the SMT processor (block 1105).

If the number of threads currently operated by the SMT processor is less than or equal to the first threshold (block 1110), all processing circuits continue to operate at the first (high) performance level (block 1115). If, on the other hand, the number of threads exceeds the first threshold (block 1110), the processing circuits that receive the performance level 1045 begin to operate at the second performance level (that is, in the low-power mode) (block 1120).

If the number of threads currently operated by the SMT processor is less than or equal to the second threshold (block 1125), the processing circuits that receive the performance level 1045 (as described above) continue to operate at the second performance level, while the processing circuits that receive the performance level 1050 and the performance level 1055 continue to operate at the first performance level. If the number of threads exceeds the second threshold, the processing circuits that receive the performance level 1050, together with those that receive the performance level 1045, begin (and continue) to operate at the second performance level (block 1135), while the processing circuits that receive the performance level 1055 operate at the first performance level.

If the number of threads currently operated by the SMT processor is less than or equal to the third threshold (block 1140), the processing circuits that receive the performance level 1055 continue to operate at the first performance level, while the other processing circuits continue to operate at the second performance level (block 1145). If the number of threads exceeds the third threshold (block 1140), the processing circuits that receive the performance level 1055 begin (and continue) to operate at the second performance level (that is, in the low-power mode) (block 1150).

As described above, embodiments according to the invention can provide a plurality of processing circuits that are associated with the operation of threads in an SMT processor and that are configured to operate at different performance levels depending on the number of threads currently operated by the SMT processor. For example, in some embodiments according to the invention, processing circuits associated with the operation of threads in the SMT processor, such as a floating-point unit or a data cache, can operate in a high-power mode or in a low-power mode based on the number of threads currently operated by the SMT processor.

In addition, as the number of threads currently operated by the SMT processor increases, the performance level of the processing circuits can be lowered. This reduces the power consumed by the processing circuits associated with the threads while retaining the architectural advantages of the SMT processor.
For example, in some embodiments according to the invention, the processing circuits can operate at different clock rates that provide different performance levels, and/or can use different circuit types (for example, different types of CMOS devices). For example, in some embodiments according to the invention, processing circuits associated with the operation of threads in the SMT processor, such as a floating-point unit or a data cache, can operate in a high-power mode at a high clock rate or in a low-power mode at a low clock rate, depending on the number of threads currently operated by the SMT processor.

Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention, and the scope of protection of the invention is therefore defined by the appended claims.

Brief Description of the Drawings

FIG. 1 is a block diagram of a conventional SMT processor architecture.
FIG. 2 is a block diagram of an SMT processor according to embodiments of the invention.
FIG. 3 is a block diagram of a thread management circuit according to embodiments of the invention.
FIG. 4 is a block diagram of a performance level control circuit according to embodiments of the invention.
FIG. 5 is a flowchart illustrating a performance level control circuit according to embodiments of the invention.
FIG. 6 is a block diagram of a cache memory according to embodiments of the invention.
FIG. 7 is a block diagram of an SMT processor according to embodiments of the invention.
FIG. 8 is a block diagram of an SMT processor according to embodiments of the invention.
FIG. 9 is a block diagram of an SMT processor according to embodiments of the invention.
FIG. 10 is a block diagram of a performance level control circuit according to embodiments of the invention.
FIG. 11 is a flowchart illustrating a performance level control circuit according to embodiments of the invention.

Description of reference numerals:
200: SMT processor
205: thread management circuit
210: fetch circuit
215: program counter
220: instruction cache
225: decoder
230: register renaming circuit
235: floating-point instruction queue
240: integer instruction queue
245: floating-point registers
250: integer registers
255: floating-point circuit
260: integer/load-store circuit
265: data cache
300: thread management circuit
305: thread management circuit
330: thread allocation circuit
340: performance level control circuit
405: counter circuit
410: comparator circuit
500 to 520: flowchart steps
610: tag memory
620: data memory
630: tag comparison circuit

640: data memory enable circuit
650: output enable circuit
700: thread management circuit
722: instruction cache
800: thread management circuit
805: first floating point circuit
810: first integer/load-store circuit
815: second floating point circuit
820: second integer/load-store circuit
900: thread management circuit
905: first floating point circuit
910: first integer/load-store circuit
915: second floating point circuit
920: second integer/load-store circuit
930: instruction cache
965: data cache
1000: counter
1015: first register
1020: second register
1025: third register
1030: first comparator circuit
1035: second comparator circuit
1040: third comparator circuit
1100 to 1150: flow steps
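The last entries in the list (counter 1000, registers 1015 to 1025, comparator circuits 1030 to 1040) suggest a thread count that is compared against three stored values. The following is only a minimal behavioural sketch of that idea in Python, not the circuit of the patent: it assumes the registers hold increasing thresholds, and the class name, method names, and threshold values (2, 4, 6) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PerformanceLevelController:
    """Toy model of a thread counter compared against stored thresholds.

    Assumption: the threshold values below are invented for illustration;
    the source text gives no concrete numbers.
    """

    thresholds: List[int] = field(default_factory=lambda: [2, 4, 6])
    thread_count: int = 0

    def thread_created(self) -> int:
        """Increment the count when a new thread starts; return the new level."""
        self.thread_count += 1
        return self.level()

    def thread_terminated(self) -> int:
        """Decrement the count when a thread ends; return the new level."""
        if self.thread_count == 0:
            raise ValueError("no active thread to terminate")
        self.thread_count -= 1
        return self.level()

    def level(self) -> int:
        """Return 0 for the highest performance level.

        Each threshold that the current thread count exceeds lowers the
        level by one step (one comparison per threshold).
        """
        return sum(1 for t in self.thresholds if self.thread_count > t)
```

In this sketch a count at or below the first threshold keeps the highest level, and every further threshold that is exceeded selects the next lower level, which mirrors the "plurality of increasing threshold values" wording of the claims.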


Claims (1)

1. A simultaneous multi-threading (SMT) processor, comprising:
at least one processing circuit associated with operation of the SMT processor and configured to operate at different performance levels based on a number of threads currently operated by the SMT processor. [Claim 1 is heavily garbled in the source; this wording follows the parallel independent claims 17, 32 and 37.]

2. The SMT processor of claim 1, wherein when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, the at least one processing circuit is configured to operate at a first performance level; and
wherein when the number of threads currently operated by the SMT processor is greater than the threshold value, the at least one processing circuit is configured to operate at a second performance level.

3. The SMT processor of claim 1, further comprising:
a performance level control circuit configured to provide a performance level to the at least one processing circuit based on the number of threads currently operated by the SMT processor.

4. The SMT processor of claim 3, wherein when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, the performance level control circuit raises the performance level provided to the at least one processing circuit to a first performance level; and
wherein when the number of threads currently operated by the SMT processor exceeds the threshold value, the performance level control circuit lowers the performance level provided to the at least one processing circuit to a second performance level lower than the first performance level.

5. The SMT processor of claim 4, wherein the threshold value comprises a first threshold value, and when the number of threads currently operated by the SMT processor exceeds a second threshold value higher than the first threshold value, the performance level control circuit further lowers the performance level provided to the at least one processing circuit to a third performance level lower than the second performance level.

6. The SMT processor of claim 1, wherein the at least one processing circuit comprises a cache memory circuit, the cache memory circuit comprising a tag memory and a data memory, the data memory being configured to provide cached data concurrently with access to the tag memory when the cache memory circuit operates at a first performance level; and
wherein the data memory is configured to provide the cached data in response to a hit in the tag memory when the cache memory circuit operates at a second performance level lower than the first performance level.

7. The SMT processor of claim 6, wherein the cache memory comprises at least one of a data cache memory configured to store data operated on by a plurality of instructions, and an instruction cache memory configured to store the instructions that operate on the associated data.

8. The SMT processor of claim 6, wherein, when operating at the second performance level, the data memory [illegible in the source; presumably "does not"] provide cached data in response to a miss in the tag memory.

9. The SMT processor of claim 1, wherein the at least one processing circuit comprises a floating point unit.

10. The SMT processor of claim 9, wherein the floating point unit comprises a first floating point unit configured to operate at a first performance level when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, the SMT processor further comprising:
a second floating point unit configured to operate at a second performance level lower than the first performance level when the number is greater than the threshold value.

11. The SMT processor of claim 1, wherein the at least one processing circuit comprises an integer register.

12. The SMT processor of claim 2, wherein the performance level control circuit is configured to increase or decrease, respectively, the number of threads currently operated by the SMT processor in response to changes in the threads [partly illegible in the source] in the SMT processor.

13. The SMT processor of claim 1, wherein the at least one processing circuit comprises a first processing circuit configured to operate at a first performance level at the number of threads currently operated by the SMT processor [partly illegible in the source], the SMT processor further comprising a second processing circuit configured to operate at a second performance level lower than the first performance level when the number of threads operated increases to greater than the threshold value.

14. The SMT processor of claim [number illegible in the source], wherein the performance level control circuit is configured to lower a performance level provided to the at least one processing circuit in response to creation of a new thread that increases the number of threads currently operated by the SMT processor from less than or equal to a threshold value to greater than the threshold value.

15. The SMT processor of claim 2, wherein the performance level control circuit is configured to lower a performance level provided to the at least one processing circuit to one of a plurality of decreasing performance levels each time the number of threads currently operated by the SMT processor exceeds one of a plurality of increasing threshold values.

16. The SMT processor of claim 2, wherein the performance level control circuit is configured, in response to the number of threads currently operated by the SMT processor rising from less than or equal to a threshold value to greater than the threshold value, to maintain a first performance level provided to a first processing circuit and to provide to a second processing circuit a second performance level lower than the first performance level.

17. A simultaneous multi-threading (SMT) processor, comprising:
a performance level control circuit configured to provide a performance level to a plurality of processing circuits in the SMT processor based on a number of threads currently operated by the SMT processor.

18. The SMT processor of claim 17, wherein the performance level control circuit is further configured to increase the number of threads currently operated by the SMT processor in response to creation of a new thread, to provide a new number of operating threads, and is configured to provide a performance level to the processing circuits based on the new number of threads currently operated by the SMT processor.

19. The SMT processor of claim 17, wherein when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, the performance level control circuit raises the performance level provided to the processing circuits to a first performance level; and
wherein when the number of threads currently operated by the SMT processor exceeds the threshold value, the performance level control circuit lowers the performance level provided to the processing circuits to a second performance level lower than the first performance level.

20. The SMT processor of claim 19, wherein the performance level control circuit is configured, in response to the number of threads currently operated by the SMT processor rising from less than or equal to a threshold value to greater than the threshold value, to maintain the first performance level provided to a first processing circuit and to provide to a second processing circuit a second performance level lower than the first performance level.

21. The SMT processor of claim 17, wherein the processing circuits comprise at least one of a floating point unit and a data cache memory.

22. The SMT processor of claim 17, wherein when the number of threads currently operated by the SMT processor is less than or equal to a threshold value, the processing circuits are configured to operate at a first performance level; and
wherein when the number of threads currently operated by the SMT processor is greater than the threshold value, the processing circuits are configured to operate at a second performance level.

23. A simultaneous multi-threading (SMT) processor, comprising:
a thread management circuit configured to allocate a plurality of processing circuits associated with the SMT processor to threads operated in the SMT processor when the threads are created; and
a performance level control circuit configured to provide one of a plurality of performance levels to the processing circuits based on a result of comparing a number of threads currently operated by the SMT processor with at least one threshold value.

24. The SMT processor of claim 23, wherein the at least one threshold value comprises a threshold number of threads operated by the SMT processor.

25. The SMT processor of claim 23, wherein when the number of threads currently operated by the SMT processor is less than or equal to the at least one threshold value, the performance level control circuit raises a performance level provided to the processing circuits to a first performance level; and
wherein when the number of threads currently operated by the SMT processor exceeds the at least one threshold value, the performance level control circuit lowers the performance level provided to the processing circuits to a second performance level lower than the first performance level.

26. The SMT processor of claim 23, wherein the performance level control circuit is configured to lower a performance level provided to the processing circuits in response to creation of a new thread that increases the number of threads currently operated by the SMT processor from less than or equal to the at least one threshold value to greater than the at least one threshold value.

27. The SMT processor of claim 22, wherein the performance level control circuit is configured to lower the performance level provided to the processing circuits to one of a plurality of decreasing performance levels each time the number of threads currently operated by the SMT processor exceeds one of a plurality of increasing threshold values.

28. The SMT processor of claim 23, wherein the performance level control circuit is configured, in response to the number of threads currently operated by the SMT processor rising from less than or equal to the at least one threshold value to greater than the at least one threshold value, to maintain a first performance level provided to a first processing circuit and to provide to a second processing circuit a second performance level lower than the first performance level.

29. A cache memory associated with a simultaneous multi-threading (SMT) processor, wherein the cache memory comprises a tag memory and a data memory, and, depending on a number of threads currently operated by the SMT processor, either the tag memory and the data memory are accessed concurrently, or the tag memory is accessed first and the data memory is accessed afterwards.

30. The cache memory of claim 29, wherein the tag memory and the data memory are accessed concurrently in response to the number of threads currently operated by the SMT processor being less than or equal to a threshold value.

31. The cache memory of claim 29, wherein when the number of threads currently operated by the SMT processor is greater than a threshold value, the data memory is accessed in response to a hit in the tag memory.

32. A method of operating a simultaneous multi-threading (SMT) processor, comprising:
providing a performance level to at least one processing circuit based on a number of threads currently operated by the SMT processor.

33. The method of claim 32, further comprising, after the providing step:
comparing the number of threads currently operated by the SMT processor with a threshold value to provide the performance level to the at least one processing circuit.

34. The method of claim 32, further comprising, after the comparing step:
increasing the number of threads currently operated by the SMT processor in response to a new thread being started in the SMT processor; and
decreasing the number of threads currently operated by the SMT processor in response to a thread ending in the SMT processor.

35. The method of claim 34, wherein the providing step further comprises:
providing a first performance level to the at least one processing circuit if the number of threads currently operated by the SMT processor is less than or equal to the threshold value; and
providing a second performance level, lower than the first performance level, to the at least one processing circuit if the number of threads currently operated by the SMT processor exceeds the threshold value.

36. The method of claim 35, further comprising:
providing reduced performance levels to the processing circuits associated with new threads that increase the number of threads currently operated by the SMT processor so as to exceed additional, increasing threshold values.

37. A simultaneous multi-threading (SMT) processor, comprising:
means for providing a performance level to at least one processing circuit based on a number of threads currently operated by the SMT processor.

38. The SMT processor of claim 37, further comprising:
means for comparing the number of threads currently operated by the SMT processor with a threshold value to provide the performance level to the at least one processing circuit.

39. The SMT processor of claim 37, further comprising:
means for increasing the number of threads currently operated by the SMT processor in response to a new thread being started in the SMT processor; and
means for decreasing the number of threads currently operated by the SMT processor in response to a thread ending in the SMT processor.

40. The SMT processor of claim [number illegible in the source], wherein the means for providing comprises:
means for providing a first performance level to the at least one processing circuit when the number of threads currently operated by the SMT processor is less than or equal to the threshold value; and
means for providing a second performance level, lower than the first performance level, to the at least one processing circuit when the number of threads currently operated by the SMT processor exceeds the threshold value.

41. The SMT processor of claim 40, further comprising:
means for providing reduced performance levels to the processing circuits associated with new threads that increase the number of threads currently operated by the SMT processor so as to exceed additional, increasing threshold values.

42. A computer program product for operating a simultaneous multi-threading (SMT) processor, comprising:
a computer readable medium having computer readable program code embodied therein, the computer readable program product comprising:
computer readable program code configured to provide a performance level to at least one processing circuit in the SMT processor based on a number of threads currently operated by the SMT processor.

43. The computer program product of claim 42, further comprising:
computer readable program code configured to compare the number of threads currently operated by the SMT processor with a threshold value to provide the performance level to the at least one processing circuit.

44. The computer program product of claim 42, further comprising:
computer readable program code configured to increase the number of threads currently operated by the SMT processor in response to a new thread being started in the SMT processor; and
computer readable program code configured to decrease the number of threads currently operated by the SMT processor in response to a thread ending in the SMT processor.

45. The computer program product of claim 42, wherein the computer readable program code configured to provide comprises:
computer readable program code configured to provide a first performance level to the at least one processing circuit when the number of threads currently operated by the SMT processor is less than or equal to the threshold value; and
computer readable program code configured to provide a second performance level, lower than the first performance level, to the at least one processing circuit when the number of threads currently operated by the SMT processor exceeds the threshold value.

46. The computer program product of claim 43, further comprising:
computer readable program code configured to provide reduced performance levels to the processing circuits associated with new threads that increase the number of threads currently operated by the SMT processor so as to exceed additional, increasing threshold values.
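The two cache access orders of claims 29 to 31 can be illustrated with a small software model. This is a hedged sketch only: the direct-mapped organisation, the class `ModeSwitchedCache`, its method names, the set count, and the threshold of 2 are assumptions made for illustration and do not come from the patent. The `data_reads` counter stands in for how often the data memory is activated.

```python
from typing import Dict, Optional


class ModeSwitchedCache:
    """Toy direct-mapped cache whose tag/data access order depends on thread count."""

    def __init__(self, num_sets: int = 4, threshold: int = 2) -> None:
        self.num_sets = num_sets        # assumed cache size
        self.threshold = threshold      # assumed thread-count threshold
        self.tags: Dict[int, int] = {}  # set index -> stored tag
        self.data: Dict[int, str] = {}  # set index -> cached value
        self.data_reads = 0             # number of data-memory activations

    def fill(self, address: int, value: str) -> None:
        """Install a value for the given address."""
        index, tag = address % self.num_sets, address // self.num_sets
        self.tags[index] = tag
        self.data[index] = value

    def read(self, address: int, active_threads: int) -> Optional[str]:
        """Return the cached value on a hit, or None on a miss."""
        index, tag = address % self.num_sets, address // self.num_sets
        hit = self.tags.get(index) == tag
        if active_threads <= self.threshold:
            # Concurrent mode: the data memory is read alongside the tag
            # lookup, so it is activated whether or not the tag matches.
            self.data_reads += 1
            candidate = self.data.get(index)
            return candidate if hit else None
        # Sequential mode: the tag memory is checked first and the data
        # memory is activated only after a tag hit.
        if not hit:
            return None
        self.data_reads += 1
        return self.data[index]
```

In this model a miss costs a data-memory activation only in the concurrent mode, which is the kind of saving the sequential order would offer when many threads are active.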
TW093103698A 2003-02-20 2004-02-17 Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating TWI261198B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20030010759 2003-02-20
US10/631,601 US7152170B2 (en) 2003-02-20 2003-07-31 Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating

Publications (2)

Publication Number Publication Date
TW200421180A true TW200421180A (en) 2004-10-16
TWI261198B TWI261198B (en) 2006-09-01

Family

ID=32044744

Family Applications (1)

Application Number Title Priority Date Filing Date
TW093103698A TWI261198B (en) 2003-02-20 2004-02-17 Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating

Country Status (5)

Country Link
JP (1) JP4439288B2 (en)
KR (1) KR100594256B1 (en)
CN (1) CN100394381C (en)
GB (1) GB2398660B (en)
TW (1) TWI261198B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4574493B2 (en) * 2005-08-22 2010-11-04 キヤノン株式会社 Processor system and multi-thread processor
JP4687685B2 (en) * 2007-04-24 2011-05-25 株式会社デンソー Electronic control device for engine control and microcomputer
EP2159700A4 (en) * 2007-06-19 2011-07-20 Fujitsu Ltd Cache controller and control method
EP2423808B1 (en) * 2007-06-20 2014-05-14 Fujitsu Limited Arithmetic device
US9529727B2 (en) * 2014-05-27 2016-12-27 Qualcomm Incorporated Reconfigurable fetch pipeline
CN105808444B (en) * 2015-01-19 2019-01-01 东芝存储器株式会社 The control method of storage device and nonvolatile memory
WO2018018492A1 (en) * 2016-07-28 2018-02-01 张升泽 Method and system of allocating current in plurality of intervals in interior of multi-core chip
WO2018018494A1 (en) * 2016-07-28 2018-02-01 张升泽 Method and system for allocating power based on multi-zone allocation
CN112631960B (en) * 2021-03-05 2021-06-04 四川科道芯国智能技术股份有限公司 Method for expanding cache memory

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5218704A (en) * 1989-10-30 1993-06-08 Texas Instruments Real-time power conservation for portable computers
US5396635A (en) * 1990-06-01 1995-03-07 Vadem Corporation Power conservation apparatus having multiple power reduction levels dependent upon the activity of the computer system
JP3100241B2 (en) * 1992-10-09 2000-10-16 ダイヤセミコンシステムズ株式会社 Microprocessor drive controller
JP3461535B2 (en) * 1993-06-30 2003-10-27 株式会社日立国際電気 Wireless terminal device and control method therefor
US5630142A (en) * 1994-09-07 1997-05-13 International Business Machines Corporation Multifunction power switch and feedback led for suspend systems
US6073159A (en) 1996-12-31 2000-06-06 Compaq Computer Corporation Thread properties attribute vector based thread selection in multithreading processor
US5835705A (en) * 1997-03-11 1998-11-10 International Business Machines Corporation Method and system for performance per-thread monitoring in a multithreaded processor
US6272616B1 (en) * 1998-06-17 2001-08-07 Agere Systems Guardian Corp. Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths
US6493741B1 (en) * 1999-10-01 2002-12-10 Compaq Information Technologies Group, L.P. Method and apparatus to quiesce a portion of a simultaneous multithreaded central processing unit
US7051329B1 (en) * 1999-12-28 2006-05-23 Intel Corporation Method and apparatus for managing resources in a multithreaded processor
US7487505B2 (en) * 2001-08-27 2009-02-03 Intel Corporation Multithreaded microprocessor with register allocation based on number of active threads
US6711447B1 (en) * 2003-01-22 2004-03-23 Intel Corporation Modulating CPU frequency and voltage in a multi-core CPU architecture

Also Published As

Publication number Publication date
JP2004252987A (en) 2004-09-09
CN100394381C (en) 2008-06-11
KR20040075287A (en) 2004-08-27
GB0403738D0 (en) 2004-03-24
JP4439288B2 (en) 2010-03-24
GB2398660A (en) 2004-08-25
TWI261198B (en) 2006-09-01
KR100594256B1 (en) 2006-06-30
CN1534463A (en) 2004-10-06
GB2398660B (en) 2005-09-07

Similar Documents

Publication Publication Date Title
US8661199B2 (en) Efficient level two memory banking to improve performance for multiple source traffic and enable deeper pipelining of accesses by reducing bank stalls
TWI628594B (en) User-level fork and join processors, methods, systems, and instructions
JP4472339B2 (en) Multi-core multi-thread processor
US11068264B2 (en) Processors, methods, systems, and instructions to load multiple data elements to destination storage locations other than packed data registers
Zhou et al. Scalable, high performance ethernet forwarding with cuckooswitch
US9606797B2 (en) Compressing execution cycles for divergent execution in a single instruction multiple data (SIMD) processor
US8751771B2 (en) Efficient implementation of arrays of structures on SIMT and SIMD architectures
US8639882B2 (en) Methods and apparatus for source operand collector caching
US9223578B2 (en) Coalescing memory barrier operations across multiple parallel threads
US9189242B2 (en) Credit-based streaming multiprocessor warp scheduling
BR112019010679A2 (en) systems, methods and apparatus for heterogeneous computing
Gwennap Sandy Bridge spans generations
US20230060900A1 (en) Method and apparatus for performing reduction operations on a plurality of associated data element values
US7152170B2 (en) Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating
US20140208075A1 (en) Systems and method for unblocking a pipeline with spontaneous load deferral and conversion to prefetch
EP3588297A1 (en) System, apparatus and method for barrier synchronization in a multi-threaded processor
US9069664B2 (en) Unified streaming multiprocessor memory
US8539130B2 (en) Virtual channels for effective packet transfer
TW201209709A (en) Multiprocessor system-on-a-chip for machine vision algorithms
US20230114164A1 (en) Atomic handling for disaggregated 3d structured socs
KR20200002606A (en) Apparatus and method for coherent, accelerated conversion between data representations
US9594395B2 (en) Clock routing techniques
US20190042432A1 (en) Reducing cache line collisions
TW200421180A (en) Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating
US9965395B2 (en) Memory attribute sharing between differing cache levels of multilevel cache