TWI512448B - Instruction for enabling a processor wait state - Google Patents
- Publication number: TWI512448B (application TW099136477A)
- Authority: TW (Taiwan)
- Prior art keywords: processor, core, low power, value, power state
Classifications
- G06F1/32: Means for saving power
- G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206: Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3228: Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
- G06F1/3234: Power saving characterised by the action undertaken
- G06F1/3293: Power saving characterised by the action undertaken by switching to a less power-consuming processor, e.g. sub-CPU
- G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30083: Power or thermal control instructions
- G06F9/3009: Thread control instructions
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
- Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Description
The present invention relates to an instruction for enabling a processor wait state.
As processor technology evolves, processors with more cores are becoming available. To execute software efficiently, these cores may be assigned to different threads of a single application. Such an arrangement is referred to as cooperative threaded software. In modern cooperative threaded software, it is common for one thread to wait for another thread to complete. Conventionally, the processor on which the waiting thread is running consumes active power while it waits. Furthermore, the wait time may be indeterminate, so the processor may not know how long it will have to wait.
Another mechanism for making a core wait is to place the core in a wait state, such as a low power state. To do so, an operating system (OS) is invoked. The OS may execute a pair of instructions, referred to as a MONITOR instruction and an MWAIT instruction. Note that these instructions are not available to application-level software. Instead, they can be used only at the OS privilege level, to set up an address range to be monitored and to place the processor in a low power state until the monitored address range is updated. However, entering the OS to execute these instructions carries considerable overhead. This overhead takes the form of high latency, and it adds further complexity, because OS scheduling issues may result in the waiting thread not being the next thread scheduled when it exits the wait state.
According to one embodiment of the present invention, a processor is provided that comprises: a core including decode logic to receive and decode an instruction from a first application, the instruction specifying an identification of a location to be monitored and a timer value, the core further including a timer coupled to the decode logic to perform a count with respect to the timer value; and a power management unit coupled to the core to determine a type of low power state for the processor based at least in part on the timer value, the power management unit causing the processor, responsive to that determination, to enter the low power state without involvement of an operating system (OS) if a value of the monitored location does not equal a target value and the timer value has not been exceeded.
FIG. 1 is a flow diagram of a method in accordance with one embodiment of the present invention.
FIG. 2 is a flow diagram of a test for a target value that can be performed in accordance with one embodiment of the present invention.
FIG. 3 is a block diagram of a processor core in accordance with one embodiment of the present invention.
FIG. 4 is a block diagram of a processor in accordance with one embodiment of the present invention.
FIG. 5 is a block diagram of a processor in accordance with another embodiment of the present invention.
FIG. 6 is a flow diagram of the interaction between cooperative threads in accordance with one embodiment of the present invention.
FIG. 7 is a block diagram of a system in accordance with one embodiment of the present invention.
In various embodiments, a user-level instruction (that is, an application-level instruction) may be provided and used to allow an application to wait for one or more conditions to occur. While the application is waiting, the processor on which it is executing (e.g., a core of a multi-core processor) may be placed in a low power state, or may switch to executing another thread. Although the scope of the present invention is not limited in this regard, the conditions on which the processor may wait can include detection of a value, the timeout of a timer, or receipt of an interrupt signal, for example from another processor.
In this way, an application can wait for one or more operations to occur, for example in another thread, without yielding to an operating system (OS) or other supervisor software. Furthermore, based on instruction information provided with the instruction, the wait can occur in a time-bounded manner, so that the processor can select an appropriate low power state to enter. In other words, the processor's own control logic can determine an appropriate low power state based on the instruction information provided and on various calculations performed in the processor. The overhead of involving the OS to enter a low power state can thus be avoided. Note that the processor need not be waiting on another peer processor; it may instead wait on a co-processor, such as a floating point co-processor or another fixed function device.
In various embodiments, a user-level instruction may have various information associated with it, including a location to be monitored, a value to look for, and a timeout value. Although the scope of the present invention is not limited in this regard, for ease of discussion this user-level instruction is referred to as a processor wait instruction. Different flavors of such a user-level instruction may be provided, each indicating, for example, a wait for a specific value, a set of values, or a range, or coupling the wait with an operation, such as incrementing a counter when the value becomes true.
In general, a processor may cause various actions to occur in response to a processor wait instruction. The instruction may include, or be associated with, the following instruction information: a source field, which indicates the location of a value to be tested; a timeout or deadline timer value, which indicates the point in time at which the wait state should end if the value being tested for has not been attained; and a result field, which indicates the value sought. In other implementations, a destination or mask field may be present in addition to these fields, in which case the source value is masked and the result is tested against a predetermined value (e.g., whether the masked result is non-zero).
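The instruction information listed above can be pictured as a small record. The following is a minimal sketch only; the class name `WaitInstructionInfo`, its field names, and the `condition_met` helper are hypothetical and are not defined by the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WaitInstructionInfo:
    """Hypothetical container for the fields the description lists."""
    source_address: int          # source field: location of the value to test
    deadline: int                # timeout or deadline timer value
    target_value: int            # result field: the value sought
    mask: Optional[int] = None   # optional destination/mask field

    def condition_met(self, loaded: int) -> bool:
        """Return True if the loaded value satisfies the wait condition."""
        if self.mask is not None:
            # Masked variant: the condition is a non-zero masked result.
            return (loaded & self.mask) != 0
        # Plain variant: the loaded value must equal the value sought.
        return loaded == self.target_value
```

In the plain variant the comparison uses `target_value`; in the masked variant `target_value` is ignored and only the masked result matters, which mirrors the two forms the text distinguishes.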
As described above, the processor may perform various operations in response to this instruction. In general, these operations may include testing whether the value of the monitored location is a target value (e.g., performing a Boolean operation to test for a "true" condition), and testing whether the deadline timer value has been reached. If either of these conditions is met (e.g., is true), or if an interrupt is received from another entity, the instruction may be completed. Otherwise, a mechanism may be initiated to monitor the location to see whether the value changes, and at that point a wait state may be entered. In this wait state, the processor may enter a low power state or may begin executing another hardware thread. If a low power state is desired, the processor may select an appropriate low power state based at least in part on the length of time remaining until the deadline timer expires. The low power state may then be entered, and the processor may remain in it until awakened by one of the conditions described above. Although described in terms of this general operation, it is to be understood that in different implementations the various features and operations may occur in different ways.
Referring now to FIG. 1, a flow diagram of a method in accordance with one embodiment of the present invention is shown. As shown in FIG. 1, method 100 may be performed by a processor executing a user-level instruction that handles a processor wait operation. As seen, method 100 may begin by decoding a received instruction (block 110). As one example, the instruction may be a user-level instruction provided by an application, such as an application implemented with multiple threads, each including instructions that may have some interdependency in executing a cooperative threaded application. After decoding the instruction, the processor may load a memory value into a cache and a register (block 120). More specifically, a source operand of the instruction may identify a location, such as a location in memory, from which a value is to be obtained. This value may be loaded into a cache, such as a low-level cache associated with the core executing the instruction, for example a private cache. The value may also be stored in a register of the core; as one example, this register may be a general-purpose register of the thread's logical processor. Control then passes to block 130, where a deadline may be calculated in response to the instruction information. More specifically, this deadline may be the time for which the wait state should last if a condition has not been met (e.g., a desired value has not been updated). In one embodiment, the instruction format may include information that provides a deadline timer value. To determine the appropriate time until this deadline is reached, in some implementations the received deadline timer value may be compared with a current time counter value present in the processor, such as a timestamp counter (TSC) value. In some embodiments, this difference may be loaded into a deadline timer, which may be implemented using a counter or a register. In one embodiment, the deadline timer may be a countdown timer: the difference between the deadline and the current TSC value is computed, and the countdown timer runs for that many cycles. When the TSC value exceeds the deadline, resumption of the processor is triggered. In other words, as discussed below, when the deadline timer has been decremented to zero, the wait state may be terminated if it is still in progress at that time. In a register implementation, a comparator may compare the value of the TSC counter with the deadline on every cycle.
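The two deadline mechanisms described above (a countdown timer loaded with the difference, and a comparator checking the TSC against the deadline each cycle) can be sketched as follows. This is an illustrative model only; the names are hypothetical, and treating "TSC equal to deadline" as expired is an assumption made so that both variants agree.

```python
def remaining_cycles(deadline_tsc: int, current_tsc: int) -> int:
    """Cycles left until the deadline, clamped at zero if already passed."""
    return max(deadline_tsc - current_tsc, 0)


class CountdownDeadline:
    """Countdown variant: load the difference, then decrement to zero."""

    def __init__(self, deadline_tsc: int, current_tsc: int) -> None:
        self.remaining = remaining_cycles(deadline_tsc, current_tsc)

    def tick(self, cycles: int = 1) -> None:
        # Decrement, never going below zero.
        self.remaining = max(self.remaining - cycles, 0)

    def expired(self) -> bool:
        return self.remaining == 0


def comparator_expired(deadline_tsc: int, current_tsc: int) -> bool:
    """Register variant: compare the TSC with the deadline each cycle.

    Assumption: the deadline counts as reached when the TSC equals it.
    """
    return current_tsc >= deadline_tsc
```

The countdown form needs only a decrementer once the difference is loaded, while the comparator form keeps the absolute deadline and needs access to the running TSC value.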
The above operations thus set up the various structures that will be accessed and tested during the wait state, and a wait state may then be entered. This wait state is generally part of loop 155, which may be executed repeatedly until one of several conditions occurs. As seen, it may be determined whether a target value from the instruction information matches the value stored in the register (decision block 140). In an implementation in which the instruction information includes the target value, the data obtained from memory and stored in the register may be tested to determine whether its value matches this target value. If so, the condition has been met, and control passes to block 195, where execution of the wait instruction may be completed. Completion of the instruction may additionally set various flags or other values to indicate the reason for exiting the wait state. Once the instruction has completed, the thread that requested the wait state may continue its operation.
If instead it is determined at decision block 140 that the condition has not been met, control passes to decision block 150, where it may be determined whether the deadline has occurred. If so, the instruction completes as described above. Otherwise, control passes to decision block 160, where it may be determined whether another hardware component is seeking to wake the processor. If so, the instruction completes as described above. Otherwise, control passes to block 170, where a low power state may be determined based at least in part on the deadline timer value. In other words, the processor itself may determine an appropriate low power state, without OS involvement, based on the time remaining before the deadline occurs. To make this determination, in some embodiments logic in an uncore of the processor may be used. This logic may include, or be associated with, a table that relates various low power states to deadline timer values, as described below. Based on the determination at block 170, the processor may enter a low power state (block 180). In the low power state, various structures of the processor may be placed in a low power state, including both the core on which the instruction is executing and other components. The particular structures placed in a low power state, and the level of that low power state, may vary by implementation. Note that if the loop is traversed again because an updated value was not the target value, a new low power state may be determined from the updated deadline timer value, because entering a certain low power state (e.g., a deep sleep state) may be inappropriate if only limited time remains.
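A table-driven selection of the kind described above might look like the sketch below. The state names and cycle thresholds are invented for illustration; the patent does not give concrete values, and a real table would depend on the entry and exit latencies of each state.

```python
# Hypothetical table: (minimum remaining cycles, state), deepest state first.
STATE_TABLE = [
    (1_000_000, "deep-sleep"),
    (10_000, "sleep"),
    (0, "light-wait"),
]


def select_low_power_state(remaining_cycles: int) -> str:
    """Pick the deepest state whose threshold fits the remaining time."""
    for min_cycles, state in STATE_TABLE:
        if remaining_cycles >= min_cycles:
            return state
    # Only reachable for negative input, which the table does not cover.
    raise ValueError("remaining_cycles must be non-negative")
```

Calling the function again on each pass through the loop, with the updated remaining time, reproduces the point made in the text: a wait that started in a deep state may resume in a shallower one as the deadline approaches.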
Various events may occur that cause the core to exit the low power state. Notably, the low power state may be exited if the cached data (i.e., the data corresponding to the monitored location) has been updated (decision block 190). If so, control returns to decision block 140. Similarly, if the deadline expires or a wake-up signal is received from another hardware component, control may pass from the low power state to one of decision blocks 150 and 160. Although shown with this high-level implementation in the embodiment of FIG. 1, it is to be understood that the scope of the present invention is not limited in this regard.
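The control flow of method 100 can be summarized with a small software model. This is a sketch under stated assumptions, not the hardware behavior: the function `run_wait`, its parameters, and the event tuples are hypothetical, and the low power state itself is reduced to "consume the next wake-up event".

```python
from typing import Callable, Iterable, Tuple


def run_wait(read_value: Callable[[], int], target: int, deadline_tsc: int,
             wakeups: Iterable[Tuple[int, bool]], start_tsc: int = 0) -> str:
    """Model loop 155 and return the reason the wait completed.

    Each item of `wakeups` is (tsc_at_wakeup, external_wake_signal) and
    stands for one exit from the low power state.
    """
    tsc, wake_signal = start_tsc, False
    events = iter(wakeups)
    while True:
        if read_value() == target:      # decision block 140
            return "target"
        if tsc >= deadline_tsc:         # decision block 150
            return "deadline"
        if wake_signal:                 # decision block 160
            return "wake-signal"
        # Blocks 170/180: a low power state would be chosen and entered here.
        try:
            tsc, wake_signal = next(events)   # an exit event, e.g. block 190
        except StopIteration:
            return "no-more-events"     # model ran out of simulated events
```

The returned string plays the role of the flags that, per the text, may be set on completion to indicate why the wait state was exited.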
In other implementations, a masked test for a target value may occur. In other words, the user-level instruction may implicitly indicate the target value sought. As one example, this target value may be a non-zero result of a mask operation between a source value obtained from memory and a mask value present in a source/destination operand of the instruction. In one embodiment, the user-level instruction may be a load, mask, wait if zero (LDMWZ) instruction of a processor ISA. In one embodiment, the instruction may take the form LDMWZ r32/64, M32/64. In this format, the first operand (r32/64) may store a mask, and the second operand (M32/64) may identify a source value (i.e., the monitored location). A timeout value may in turn be stored in a third register. For example, the deadline may reside in an implicit register. In particular, the EDX:EAX registers may be used, which are the same register pair written when the TSC counter is read. In general, the instruction performs a non-busy poll of a semaphore value and enters a low power wait state if the semaphore is not available. In different implementations, both bit-wise semaphores and counting semaphores may be handled, where zero indicates that nothing is waiting. The timeout value may indicate the length of time, measured in TSC cycles, that the processor should wait for a non-zero result before unconditionally resuming operation. In one embodiment, software may be provided, via a memory-mapped register (e.g., a configuration and status register (CSR)), with information about which physical processors are in a low power state.
In this embodiment, the LDMWZ instruction loads data from the source memory location, masks it with the source/destination value, and tests whether the resulting value is zero. If the masked value is not zero, the value loaded from memory is placed, unmasked, in the source/destination register. Otherwise, the processor enters a low power wait state. Note that this low power state may or may not correspond to a currently defined low power state, such as a so-called C-state according to the Advanced Configuration and Power Interface (ACPI) specification, version 4 (published June 16, 2009). The processor may remain in the low power state until the specified time interval expires, an external exception is signaled (e.g., a general interrupt (INTR), non-maskable interrupt (NMI), or system management interrupt (SMI)), or the source memory location is written with a value that is non-zero when masked. As part of entering this wait state, the processor may clear a memory-mapped register (CSR) bit that indicates that the processor is currently waiting.
Upon exit from the wait state because the monitored location was written with a value that yields a non-zero result when masked, the zero indicator of a flag register may be cleared and the unmasked value that was read may be placed in the destination register. If expiration of the timer causes the exit from the low power state, the zero indicator of the flag register may be set, allowing software to detect that condition. If the exit occurs because of an external exception, the state of the processor and memory is such that the instruction is not considered to have executed. Thus, on return to the normal execution flow, the same LDMWZ instruction is executed again.
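The LDMWZ behavior described above can be modeled in software as follows. This is a hedged sketch: the function `ldmwz`, the `LdmwzResult` type, and the event encoding are hypothetical. It assumes a single zero indicator that is cleared on a non-zero masked result and set on timeout, that an immediately non-zero result is flagged the same way as one seen after waiting, and that the destination register still holds the mask when no value is written to it.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class LdmwzResult(NamedTuple):
    dest: int                   # contents of the source/destination register
    zero_flag: Optional[bool]   # None means flags untouched (not executed)
    reason: str


def ldmwz(mask: int, initial: int,
          events: Iterable[Tuple[str, Optional[int]]] = ()) -> LdmwzResult:
    """Model load, mask, wait-if-zero for one monitored location.

    `events` holds ("write", value), ("timeout", None) or ("exception", None).
    """
    value = initial
    if value & mask:
        # Non-zero masked result: return the unmasked value, no wait needed.
        return LdmwzResult(value, False, "nonzero")
    for kind, payload in events:        # processor is in the wait state
        if kind == "write":
            value = payload
            if value & mask:
                return LdmwzResult(value, False, "nonzero")
            # Masked result still zero: keep waiting.
        elif kind == "timeout":
            return LdmwzResult(mask, True, "timeout")
        elif kind == "exception":
            # Instruction is treated as not executed and will be re-run.
            return LdmwzResult(mask, None, "exception")
    raise RuntimeError("still waiting: no wake event supplied")
```

Software using such an instruction would check the zero indicator after it completes: cleared means the destination holds a usable value, set means the wait timed out.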
Referring now to FIG. 2, a flow diagram of a test for a target value that can be performed in accordance with another embodiment of the present invention is shown. As shown in FIG. 2, method 200 may begin by loading source data into a first register (block 210). This source data may be masked with a mask present in a second register (block 220). In various embodiments, the first and second registers may be specified by an instruction and may correspond to the locations used to store the source data and the destination data, respectively. It may then be determined whether the result of the mask operation is zero (decision block 230). If so, the desired condition has not been met, and the processor may enter a low power state (block 240). Otherwise, the source data may be stored in the second register (block 250), and execution of the instruction completes (block 260).
While in the wait state, if the target location is updated, as determined at decision block 265, control returns to block 220 to perform the mask operation again. If it is determined that another condition has occurred during the wait state (as determined at decision block 270), control passes to block 260 to complete the instruction. Although the embodiment of FIG. 2 is shown with this high-level implementation, it is to be understood that the scope of the present invention is not limited in this regard.
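The flow of method 200 can be traced with a small software model. This is a sketch under stated assumptions rather than a description of the hardware: the function name `method_200`, the event tuples, and the choice to treat an exhausted event list as "another condition" are all hypothetical.

```python
from typing import Iterable, Optional, Tuple

# ("update", new_value) models a write to the monitored location;
# ("other", None) models any other wake condition, such as a timer expiry.
Event = Tuple[str, Optional[int]]


def method_200(initial: int, mask: int, events: Iterable[Event]) -> Tuple[Optional[int], str]:
    source = initial                          # block 210: load the source data
    pending = iter(events)
    while True:
        if (source & mask) != 0:              # blocks 220 and 230: mask, then test for zero
            return source, "condition met"    # blocks 250 and 260: store the data, complete
        # Block 240: the masked result was zero, so wait in the low power state.
        kind, value = next(pending, ("other", None))
        if kind == "update":                  # decision block 265: target location updated
            source = value                    # re-read, then loop back to block 220
        else:                                 # decision block 270: another condition occurred
            return None, "other condition"    # block 260: complete without a match
```

For example, `method_200(0, 0b0100, [("update", 0b0001), ("update", 0b0110)])` ignores the first update, because bit 2 is still clear, and completes on the second.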
Referring now to FIG. 3, a block diagram shows a processor core in accordance with an embodiment of the present invention. As shown in FIG. 3, processor core 300 can be a multi-stage pipelined out-of-order processor. Processor core 300 is shown in a relatively simplified view in FIG. 3 to illustrate various features used in connection with a processor wait state in accordance with an embodiment of the present invention.
As shown in FIG. 3, core 300 includes front end unit 310, which can be used to fetch instructions to be executed and prepare them for later use in the processor. For example, front end unit 310 can include a fetch unit 301, an instruction cache 303, and an instruction decoder 305. In some implementations, front end unit 310 can further include a trace cache, along with microcode storage and micro-operation storage. Fetch unit 301 can fetch macro-instructions, e.g., from memory or from instruction cache 303, and feed them to instruction decoder 305 to decode them into primitives, i.e., micro-operations for execution by the processor. One such instruction to be handled in front end unit 310 can be a user-level processor wait instruction in accordance with an embodiment of the present invention. This instruction can enable the front end unit to access various micro-operations so that the operations associated with the wait instruction, such as those described above, can be performed.
Coupled between front end unit 310 and execution units 320 is an out-of-order (OOO) engine 315 that can be used to receive the micro-instructions and prepare them for execution. More specifically, OOO engine 315 can include various buffers to reorder the micro-instruction flow and allocate the various resources needed for execution, as well as to provide renaming of logical registers onto storage locations within various register files, such as register file 330 and extended register file 335. Register file 330 can include separate register files for integer and floating point operations. Extended register file 335 can provide storage for vector-sized units, e.g., 256 or 512 bits per register.
Various resources can be present in execution units 320, including, for example, various integer, floating point, and single instruction multiple data (SIMD) logic units, among other specialized hardware. For example, the execution units can include one or more arithmetic logic units (ALUs) 322. In addition, wakeup logic 324 in accordance with an embodiment of the present invention can be present. This wakeup logic can be used to perform some of the operations involved in carrying out a processor wait mode in response to a user-level instruction. As discussed further below, additional logic for handling such wait states can be present in another part of a processor, such as an uncore. Also shown in FIG. 3 is a set of timers 326. The timers relevant to the discussion here include a TSC timer and a deadline timer, which can be set with a value corresponding to a deadline before which the processor will leave the wait state if no other condition has been met. Wakeup logic 324 can initiate certain operations when the deadline timer reaches a predetermined count value (which in some embodiments can be a countdown to zero). Results can be provided to retirement logic, namely a reorder buffer (ROB) 340.
More specifically, ROB 340 can include various arrays and logic to receive information associated with executed instructions. This information is then examined by ROB 340 to determine whether the instructions can be validly retired and their result data committed to the architectural state of the processor, or whether one or more exceptions occurred that prevent proper retirement of the instructions. Of course, ROB 340 can handle other operations associated with retirement. In the context of a processor wait instruction in accordance with an embodiment of the present invention, retirement can cause ROB 340 to set the state of one or more indicators of a flag register or other status register, which can indicate the reason that a processor exited a wait state.
As shown in FIG. 3, ROB 340 is coupled to a cache 350 which, in one embodiment, can be a low-level cache (e.g., an L1 cache), although the scope of the present invention is not limited in this regard. Execution units 320 can likewise be directly coupled to cache 350. As seen, cache 350 includes a monitor engine 352, which can be configured to monitor a particular cache line, i.e., the monitored location, and to provide feedback to wakeup logic 324 (and/or to uncore components) when the value is updated, when the cache coherency state of the line changes, and/or when the line is lost. Monitor engine 352 acquires a given line and holds it in the shared state. If monitor engine 352 ever loses the line from the shared state, it wakes the processor. From cache 350, data communication can occur with higher-level caches, system memory, and so forth. Although the embodiment of FIG. 3 is shown with this high-level implementation, it is to be understood that the scope of the present invention is not limited in this regard.
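The three wake triggers attributed to the monitor engine can be modeled as follows. This is a rough sketch and not the actual cache logic: the class `MonitorEngine`, the `LineState` values, and the grouping of every non-shared coherency state into a single `OTHER` state are assumptions made for illustration.

```python
from enum import Enum
from typing import List


class LineState(Enum):
    SHARED = "shared"    # state in which the engine holds the monitored line
    OTHER = "other"      # any other coherency state (assumed grouping)
    EVICTED = "evicted"  # the line is no longer present in the cache


class MonitorEngine:
    def __init__(self, value: int) -> None:
        self.value = value
        self.state = LineState.SHARED

    def observe(self, new_value: int, new_state: LineState) -> List[str]:
        """Return the reasons to notify the wakeup logic; an empty list means keep waiting."""
        reasons = []
        if new_state is LineState.EVICTED:
            reasons.append("line lost")
        else:
            if new_state is not self.state:
                reasons.append("coherence state changed")
            if new_value != self.value:
                reasons.append("value updated")
            self.value = new_value
        self.state = new_state
        return reasons
```

Any non-empty result corresponds to feedback sent to the wakeup logic; whether the waiting instruction then completes still depends on the masked test.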
Referring now to FIG. 4, a block diagram shows a processor in accordance with an embodiment of the present invention. As shown in FIG. 4, processor 400 can be a multicore processor including a plurality of cores 410a to 410n. In one embodiment, each such core can be configured as core 300 described above with regard to FIG. 3. The various cores can be coupled via an interconnect 415 to an uncore 420 that includes various components. As seen, uncore 420 can include a shared cache 430, which can be a last-level cache. In addition, the uncore can include an integrated memory controller 440, various interfaces 450, and a power management unit 455. In various embodiments, at least some of the functionality associated with executing a processor wait instruction can be implemented in power management unit 455. For example, based on information received with the instruction, such as the deadline timer value, power management unit 455 can determine an appropriate low power state in which to place a given core that is executing the wait instruction. In one embodiment, power management unit 455 can include a table that associates timer values with low power states. Unit 455 can look up this table based on the deadline value determined for an instruction and select the corresponding wait state. In turn, power management unit 455 can generate a plurality of control signals to cause various components, both a given core and other processor units, to enter a low power state. As seen, processor 400 can communicate with a system memory 460 via a memory bus. In addition, connections can be made through interfaces 450 to various off-chip components such as peripheral devices, mass storage, and so forth. Although the embodiment of FIG. 4 is shown with this particular implementation, it is to be understood that the scope of the present invention is not limited in this regard.
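The table lookup attributed to power management unit 455 might resemble the sketch below. The thresholds, the state names, and the function `select_wait_state` are all hypothetical; the text says only that a table associates timer values with low power states.

```python
from bisect import bisect_right

# Hypothetical table: the minimum wait, in timer ticks, for which each state is chosen.
# Deeper states usually take longer to enter and leave, so they are reserved for longer waits.
THRESHOLDS = [0, 1_000, 100_000]
STATES = ["shallow", "medium", "deep"]


def select_wait_state(deadline_ticks: int) -> str:
    """Pick the deepest state whose threshold does not exceed the requested deadline."""
    if deadline_ticks < 0:
        raise ValueError("deadline must be non-negative")
    return STATES[bisect_right(THRESHOLDS, deadline_ticks) - 1]
```

Keeping the thresholds sorted lets the lookup use a binary search, although for a table this small a linear scan would serve equally well.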
In other embodiments, a processor architecture can include emulation features so that the processor can execute instructions of a first ISA, referred to as the source ISA, while the architecture is based on a second ISA, referred to as the target ISA. In general, software, including both the OS and application programs, is compiled to the source ISA, and hardware implements the target ISA, which is designed specifically for a given hardware implementation with particular performance and/or energy efficiency features.
Referring now to FIG. 5, a block diagram shows a processor in accordance with another embodiment of the present invention. As shown in FIG. 5, system 500 includes a processor 510 and a memory 520. Memory 520 includes conventional memory 522, which holds both system and application software, and concealed memory 524, which holds software instrumented for the target ISA. As seen, processor 510 includes an emulation engine 530 that converts source code into target code. Emulation can be performed by either interpretation or binary translation. Interpretation is typically used for code when it is first encountered. Then, as frequently executed code regions (e.g., hotspots) are discovered through dynamic profiling, they are translated to the target ISA and stored in a code cache in concealed memory 524. Optimization is performed as part of the translation process, and code that is used very heavily can later be optimized further. The translated blocks of code are held in code cache 524 so that they can be reused repeatedly.
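The policy of interpreting first and translating hot regions can be illustrated with a toy model. It is only a sketch: `EmulationEngine`, `HOT_THRESHOLD`, and the use of a plain Python callable as a stand-in for both interpretation and binary translation are assumptions, and no real translator is involved.

```python
from collections import Counter
from typing import Callable, Dict, List

HOT_THRESHOLD = 3  # hypothetical number of executions before a region counts as a hotspot


class EmulationEngine:
    def __init__(self) -> None:
        self.exec_counts: Counter = Counter()                    # dynamic profile per region
        self.code_cache: Dict[str, Callable[[int], int]] = {}    # translated regions
        self.log: List[str] = []                                 # how each run was served

    def run(self, region: str, source_op: Callable[[int], int], arg: int) -> int:
        if region in self.code_cache:
            # Reuse the translated block held in the code cache.
            self.log.append("translated")
            return self.code_cache[region](arg)
        # Not yet translated: interpret and update the profile.
        self.exec_counts[region] += 1
        self.log.append("interpreted")
        result = source_op(arg)                   # stand-in for interpreting source ISA code
        if self.exec_counts[region] >= HOT_THRESHOLD:
            self.code_cache[region] = source_op   # stand-in for translation to the target ISA
        return result
```

With the threshold set to three, a region run five times is interpreted three times and then served from the code cache twice.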
Still referring to FIG. 5, processor 510, which can be one core of a multicore processor, includes a program counter 540 that provides instruction pointer addresses to an instruction cache (I-cache) 550. As seen, I-cache 550 can also receive target ISA instructions directly from concealed memory portion 524 on a miss to a given instruction address. Accordingly, I-cache 550 can store target ISA instructions, which can be provided to a decoder 560, a decoder of the target ISA, that receives incoming instructions at the macro-instruction level and converts them into micro-instructions for execution within a processor pipeline 570. Although the scope of the present invention is not limited in this regard, pipeline 570 can be an out-of-order pipeline including various stages to execute and retire instructions. Various execution units, timers, counters, storage locations, and monitors such as those described above can be present within pipeline 570 to execute a processor wait instruction in accordance with an embodiment of the present invention. In other words, even in an implementation in which processor 510 is of a different micro-architecture than the micro-architecture for which a user-level processor wait instruction is provided, the instruction can be executed on the underlying hardware.
Referring now to FIG. 6, a flow diagram shows the interaction between cooperative threads in accordance with an embodiment of the present invention. As shown in FIG. 6, method 600 can be used to execute multiple threads, e.g., in a multi-threaded processor. In the context of FIG. 6, two threads, thread 1 and thread 2, belong to a single application and can be interdependent, such that data to be used by one thread must first be updated by the second thread. Thus, as seen, thread 1 can receive a processor wait instruction during its execution (block 610). During execution of this wait instruction, it can be determined whether a test condition has been met (decision block 620). If not, the thread can enter a low power state (block 630). Although not shown in FIG. 6, it is to be understood that this state can be exited when one of various conditions occurs. If instead it is determined that the test condition has been met, control passes to block 640, where code execution can continue in the first thread. Note that the test condition can refer to a monitored location that indicates when the second thread has successfully completed an update. Thus, until the code shown with regard to thread 2 has executed, the test condition is not met, and the processor enters a low power state.
Still referring to FIG. 6, thread 2 can execute code that is interdependent with the first thread (block 650). For example, the second thread can execute code that updates one or more values used during execution of the first thread. To ensure that the first thread executes using the updated values, the application can be written so that the first thread enters the low power state until the data has been updated by the second thread. Accordingly, during execution of the second thread, it can be determined whether it has finished executing the interdependent code (decision block 660). If not, execution of the interdependent code continues. If instead this interdependent code section has completed, control passes to block 670, where a predetermined value can be written to the monitored location. For example, this predetermined value can correspond to a test value associated with the processor wait instruction. In other embodiments, the predetermined value can be a value such that, when masked or used as a mask (with a value in the monitored location), the result is not zero, indicating that the test condition has been met and the first thread can continue execution. Still with reference to thread 2, after writing this predetermined value, code execution of the second thread can continue (block 680). Although the embodiment of FIG. 6 is shown with this particular implementation, it is to be understood that the scope of the present invention is not limited in this regard.
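The handoff between the two threads can be imitated in software with a condition variable standing in for the hardware monitor. This is an analogy under stated assumptions and not the instruction itself: `MonitoredLocation`, `MASK`, `DONE_VALUE`, and `run_cooperating_threads` are hypothetical names, and the blocking here is done by the OS thread library, which is precisely the involvement the described instruction is meant to avoid.

```python
import threading

MASK = 0x1        # hypothetical mask supplied with the wait
DONE_VALUE = 0x1  # hypothetical predetermined value; non-zero under MASK


class MonitoredLocation:
    def __init__(self) -> None:
        self._value = 0
        self._cond = threading.Condition()

    def write(self, value: int) -> None:
        with self._cond:
            self._value = value
            self._cond.notify_all()  # stand-in for the monitor waking the waiter

    def wait_nonzero(self, mask: int, timeout: float) -> bool:
        with self._cond:
            # Blocks instead of spinning; returns False if the timeout expires first.
            return self._cond.wait_for(lambda: (self._value & mask) != 0, timeout)


def run_cooperating_threads() -> list:
    location = MonitoredLocation()
    shared = {"data": None}
    trace = []

    def thread1() -> None:
        if location.wait_nonzero(MASK, timeout=5.0):        # blocks 610 to 630
            trace.append(("thread1 read", shared["data"]))  # block 640

    def thread2() -> None:
        shared["data"] = 42         # block 650: interdependent code updates the data
        location.write(DONE_VALUE)  # block 670: write the predetermined value

    t1 = threading.Thread(target=thread1)
    t2 = threading.Thread(target=thread2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return trace
```

Because thread 2 publishes the data before writing the predetermined value, thread 1 sees the updated data whichever thread happens to run first.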
Thus embodiments of the present invention can enable a lightweight stalling mechanism that allows a processor to stall while waiting for one or more predetermined conditions to occur, without the need for OS involvement. In this way, an application need not poll a signal/value in a loop until it becomes true, using test, pause, and jump operations that cause the processor to consume power and, in a hyper-threaded machine, prevent other threads from using those cycles. OS monitoring can therefore be avoided, both in terms of redundant work and in terms of scheduling constraints (the waiting application may not be the next thread scheduled). Accordingly, lightweight communication can occur between multiple cooperative threads, and furthermore a processor can flexibly select a sleep state based on a time parameter that the user has indicated.
Embodiments of the present invention can be implemented in many different system types. Referring now to FIG. 7, a block diagram shows a system in accordance with an embodiment of the present invention. As shown in FIG. 7, multiprocessor system 700 is a point-to-point interconnect system and includes a first processor 770 and a second processor 780 coupled via a point-to-point interconnect 750. As shown in FIG. 7, each of processors 770 and 780 can be a multicore processor, including first and second processor cores (i.e., processor cores 774a and 774b and processor cores 784a and 784b), although potentially many more cores can be present in the processors. The processor cores can execute various instructions, including a user-level processor wait instruction.
Still referring to FIG. 7, first processor 770 further includes a memory controller hub (MCH) 772 and point-to-point (P-P) interfaces 776 and 778. Similarly, second processor 780 includes an MCH 782 and P-P interfaces 786 and 788. As shown in FIG. 7, MCHs 772 and 782 couple the processors to respective memories, namely a memory 732 and a memory 734, which can be portions of main memory (e.g., a dynamic random access memory (DRAM)) locally attached to the respective processors. First processor 770 and second processor 780 can be coupled to a chipset 790 via P-P interconnects 752 and 754, respectively. As shown in FIG. 7, chipset 790 includes P-P interfaces 794 and 798.
Furthermore, chipset 790 includes an interface 792 to couple chipset 790 with a high performance graphics engine 738 via a P-P interconnect 739. In turn, chipset 790 can be coupled to a first bus 716 via an interface 796. As shown in FIG. 7, various input/output (I/O) devices 714 can be coupled to first bus 716, along with a bus bridge 718 that couples first bus 716 to a second bus 720. Various devices can be coupled to second bus 720 including, for example, a keyboard/mouse 722, communication devices 726, and a data storage unit 728 such as a disk drive or other mass storage device, which in one embodiment can include code 730. Further, an audio I/O 724 can be coupled to second bus 720.
Embodiments of the present invention can be implemented in code and can be stored on a storage medium having stored thereon instructions that can be used to program a system to perform the instructions. The storage medium can include, but is not limited to: any type of disk, including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks; semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs) and static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, and electrically erasable programmable read-only memories (EEPROMs); magnetic or optical cards; or any other type of media suitable for storing electronic instructions.
While the present invention has been disclosed with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the following claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.
100, 200, 600 ... methods
110~195, 210~270, 610~680 ... step blocks
300, 774a-b, 784a-b ... processor cores
301 ... fetch unit
303 ... instruction cache
305 ... instruction decoder
310 ... front end unit
315 ... out-of-order (OOO) engine
320 ... execution units
322 ... arithmetic logic unit (ALU)
324 ... wakeup logic
326 ... timers
330 ... register file
335 ... extended register file
340 ... reorder buffer (ROB)
350 ... cache
352 ... monitor engine
400, 510 ... processors
410a-n ... cores
415 ... interconnect
420 ... uncore
430 ... shared cache
440 ... integrated memory controller
450a-n, 792, 796 ... interfaces
455 ... power management unit
460 ... system memory
500 ... system
520, 732~734 ... memories
522 ... conventional memory
524 ... concealed memory
530 ... emulation engine
540 ... program counter
550 ... instruction cache (I-cache)
560 ... decoder
570 ... processor pipeline
700 ... multiprocessor system
714 ... input/output (I/O) devices
716 ... first bus
718 ... bus bridge
720 ... second bus
722 ... keyboard/mouse
724 ... audio I/O
726 ... communication devices
728 ... data storage unit
730 ... code
738 ... high performance graphics engine
739, 752~754 ... point-to-point (P-P) interconnects
750 ... point-to-point interconnect
770 ... first processor
772, 782 ... memory controller hubs (MCH)
776~778, 786~788, 794, 798 ... point-to-point (P-P) interfaces
780 ... second processor
790 ... chipset
FIG. 1 is a flow diagram of a method in accordance with an embodiment of the present invention.
FIG. 2 is a flow diagram of a test for a target value that can be performed in accordance with an embodiment of the present invention.
FIG. 3 is a block diagram of a processor core in accordance with an embodiment of the present invention.
FIG. 4 is a block diagram of a processor in accordance with an embodiment of the present invention.
FIG. 5 is a block diagram of a processor in accordance with another embodiment of the present invention.
FIG. 6 is a flow diagram of the interaction between multiple cooperative threads in accordance with an embodiment of the present invention.
FIG. 7 is a block diagram of a system in accordance with an embodiment of the present invention.
100 ... method
110~195 ... step blocks
Claims (24)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/641,534 US8464035B2 (en) | 2009-12-18 | 2009-12-18 | Instruction for enabling a processor wait state |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201131349A TW201131349A (en) | 2011-09-16 |
TWI512448B true TWI512448B (en) | 2015-12-11 |
Family
ID=44152840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW099136477A TWI512448B (en) | 2009-12-18 | 2010-10-26 | Instruction for enabling a processor wait state |
Country Status (8)
Country | Link |
---|---|
US (3) | US8464035B2 (en) |
JP (2) | JP5571784B2 (en) |
KR (1) | KR101410634B1 (en) |
CN (1) | CN102103484B (en) |
DE (1) | DE102010052680A1 (en) |
GB (1) | GB2483012B (en) |
TW (1) | TWI512448B (en) |
WO (1) | WO2011075246A2 (en) |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9672019B2 (en) | 2008-11-24 | 2017-06-06 | Intel Corporation | Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program to multiple parallel threads |
US10621092B2 (en) | 2008-11-24 | 2020-04-14 | Intel Corporation | Merging level cache and data cache units having indicator bits related to speculative execution |
US8464035B2 (en) * | 2009-12-18 | 2013-06-11 | Intel Corporation | Instruction for enabling a processor wait state |
US8775153B2 (en) * | 2009-12-23 | 2014-07-08 | Intel Corporation | Transitioning from source instruction set architecture (ISA) code to translated code in a partial emulation environment |
US8977878B2 (en) * | 2011-05-19 | 2015-03-10 | Texas Instruments Incorporated | Reducing current leakage in L1 program memory |
US9207730B2 (en) * | 2011-06-02 | 2015-12-08 | Apple Inc. | Multi-level thermal management in an electronic device |
WO2013048468A1 (en) | 2011-09-30 | 2013-04-04 | Intel Corporation | Instruction and logic to perform dynamic binary translation |
US9063760B2 (en) * | 2011-10-13 | 2015-06-23 | International Business Machines Corporation | Employing native routines instead of emulated routines in an application being emulated |
WO2013089685A1 (en) | 2011-12-13 | 2013-06-20 | Intel Corporation | Enhanced system sleep state support in servers using non-volatile random access memory |
CN107025093B (en) | 2011-12-23 | 2019-07-09 | 英特尔公司 | For instructing the device of processing, for the method and machine readable media of process instruction |
WO2013101165A1 (en) * | 2011-12-30 | 2013-07-04 | Intel Corporation | Register error protection through binary translation |
JP5900606B2 (en) * | 2012-03-30 | 2016-04-06 | 富士通株式会社 | Data processing device |
US20140075163A1 (en) * | 2012-09-07 | 2014-03-13 | Paul N. Loewenstein | Load-monitor mwait |
JP5715107B2 (en) * | 2012-10-29 | 2015-05-07 | 富士通テン株式会社 | Control system |
CN104813277B (en) * | 2012-12-19 | 2019-06-28 | 英特尔公司 | The vector mask of power efficiency for processor drives Clock gating |
US9081577B2 (en) | 2012-12-28 | 2015-07-14 | Intel Corporation | Independent control of processor core retention states |
US9164565B2 (en) | 2012-12-28 | 2015-10-20 | Intel Corporation | Apparatus and method to manage energy usage of a processor |
US9405551B2 (en) | 2013-03-12 | 2016-08-02 | Intel Corporation | Creating an isolated execution environment in a co-designed processor |
JP6175980B2 (en) * | 2013-08-23 | 2017-08-09 | 富士通株式会社 | CPU control method, control program, and information processing apparatus |
US9513687B2 (en) * | 2013-08-28 | 2016-12-06 | Via Technologies, Inc. | Core synchronization mechanism in a multi-die multi-core microprocessor |
US9891936B2 (en) | 2013-09-27 | 2018-02-13 | Intel Corporation | Method and apparatus for page-level monitoring |
WO2015057819A1 (en) * | 2013-10-15 | 2015-04-23 | Mill Computing, Inc. | Computer processor with deferred operations |
CN105094747B (en) * | 2014-05-07 | 2018-12-04 | 阿里巴巴集团控股有限公司 | The device of central processing unit based on SMT and the data dependence for detection instruction |
US10467011B2 (en) * | 2014-07-21 | 2019-11-05 | Intel Corporation | Thread pause processors, methods, systems, and instructions |
KR20160054850A (en) * | 2014-11-07 | 2016-05-17 | 삼성전자주식회사 | Apparatus and method for operating processors |
US20160306416A1 (en) * | 2015-04-16 | 2016-10-20 | Intel Corporation | Apparatus and Method for Adjusting Processor Power Usage Based On Network Load |
KR102476357B1 (en) | 2015-08-06 | 2022-12-09 | 삼성전자주식회사 | Clock management unit, integrated circuit and system on chip adopting the same, and clock managing method |
US20170177336A1 (en) * | 2015-12-22 | 2017-06-22 | Intel Corporation | Hardware cancellation monitor for floating point operations |
US11023233B2 (en) | 2016-02-09 | 2021-06-01 | Intel Corporation | Methods, apparatus, and instructions for user level thread suspension |
US10185564B2 (en) | 2016-04-28 | 2019-01-22 | Oracle International Corporation | Method for managing software threads dependent on condition variables |
US11016893B2 (en) | 2016-09-30 | 2021-05-25 | Intel Corporation | Method and apparatus for smart store operations with conditional ownership requests |
US11061730B2 (en) * | 2016-11-18 | 2021-07-13 | Red Hat Israel, Ltd. | Efficient scheduling for hyper-threaded CPUs using memory monitoring |
US10289516B2 (en) | 2016-12-29 | 2019-05-14 | Intel Corporation | NMONITOR instruction for monitoring a plurality of addresses |
US10627888B2 (en) | 2017-01-30 | 2020-04-21 | International Business Machines Corporation | Processor power-saving during wait events |
US11086672B2 (en) * | 2019-05-07 | 2021-08-10 | International Business Machines Corporation | Low latency management of processor core wait state |
CN113867518A (en) * | 2021-09-15 | 2021-12-31 | 珠海亿智电子科技有限公司 | Low-power blocking delay method and device for a processor, and readable medium |
CN113986663A (en) * | 2021-10-22 | 2022-01-28 | 上海兆芯集成电路有限公司 | Electronic device and power consumption control method thereof |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1282030A1 (en) * | 2000-05-08 | 2003-02-05 | Mitsubishi Denki Kabushiki Kaisha | Computer system and computer-readable recording medium |
US20060005197A1 (en) * | 2004-06-30 | 2006-01-05 | Bratin Saha | Compare and exchange operation using sleep-wakeup mechanism |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7363474B2 (en) | 2001-12-31 | 2008-04-22 | Intel Corporation | Method and apparatus for suspending execution of a thread until a specified memory access occurs |
US7127561B2 (en) | 2001-12-31 | 2006-10-24 | Intel Corporation | Coherency techniques for suspending execution of a thread until a specified memory access occurs |
US7213093B2 (en) | 2003-06-27 | 2007-05-01 | Intel Corporation | Queued locks using monitor-memory wait |
JP4376692B2 (en) | 2004-04-30 | 2009-12-02 | 富士通株式会社 | Information processing device, processor, processor control method, information processing device control method, cache memory |
GB2414573B (en) | 2004-05-26 | 2007-08-08 | Advanced Risc Mach Ltd | Control of access to a shared resource in a data processing apparatus |
US7810083B2 (en) | 2004-12-30 | 2010-10-05 | Intel Corporation | Mechanism to emulate user-level multithreading on an OS-sequestered sequencer |
US8607235B2 (en) | 2004-12-30 | 2013-12-10 | Intel Corporation | Mechanism to schedule threads on OS-sequestered sequencers without operating system intervention |
US8719819B2 (en) | 2005-06-30 | 2014-05-06 | Intel Corporation | Mechanism for instruction set based thread execution on a plurality of instruction sequencers |
US8516483B2 (en) | 2005-05-13 | 2013-08-20 | Intel Corporation | Transparent support for operating system services for a sequestered sequencer |
US8010969B2 (en) | 2005-06-13 | 2011-08-30 | Intel Corporation | Mechanism for monitoring instruction set based thread execution on a plurality of instruction sequencers |
US7882339B2 (en) * | 2005-06-23 | 2011-02-01 | Intel Corporation | Primitives to enhance thread-level speculation |
US8028295B2 (en) | 2005-09-30 | 2011-09-27 | Intel Corporation | Apparatus, system, and method for persistent user-level thread |
GB0519981D0 (en) * | 2005-09-30 | 2005-11-09 | Ignios Ltd | Scheduling in a multicore architecture |
US7941681B2 (en) * | 2007-08-17 | 2011-05-10 | International Business Machines Corporation | Proactive power management in a parallel computer |
US20090150696A1 (en) * | 2007-12-10 | 2009-06-11 | Justin Song | Transitioning a processor package to a low power state |
US9081687B2 (en) | 2007-12-28 | 2015-07-14 | Intel Corporation | Method and apparatus for MONITOR and MWAIT in a distributed cache architecture |
US8156362B2 (en) | 2008-03-11 | 2012-04-10 | Globalfoundries Inc. | Hardware monitoring and decision making for transitioning in and out of low-power state |
DE102009001142A1 (en) * | 2009-02-25 | 2010-08-26 | Robert Bosch Gmbh | Electromechanical brake booster |
US8156275B2 (en) * | 2009-05-13 | 2012-04-10 | Apple Inc. | Power managed lock optimization |
US8464035B2 (en) * | 2009-12-18 | 2013-06-11 | Intel Corporation | Instruction for enabling a processor wait state |
2009
- 2009-12-18: US application US12/641,534, patent US8464035B2 (active)

2010
- 2010-10-26: TW application TW099136477A, patent TWI512448B (not active, IP right cessation)
- 2010-11-11: JP application JP2012517935A, patent JP5571784B2 (active)
- 2010-11-11: KR application KR1020127018822A, patent KR101410634B1 (active, IP right grant)
- 2010-11-11: WO application PCT/US2010/056320, publication WO2011075246A2 (active, application filing)
- 2010-11-11: GB application GB1119728.2A, patent GB2483012B (not active, expired, fee related)
- 2010-11-26: DE application DE102010052680A, publication DE102010052680A1 (not active, withdrawn)
- 2010-12-17: CN application CN201010615167.0A, patent CN102103484B (not active, expired, fee related)

2013
- 2013-03-06: US application US13/786,939, patent US9032232B2 (active)
- 2013-05-10: US application US13/891,747, patent US8990597B2 (active)

2014
- 2014-06-26: JP application JP2014131157A, patent JP5795820B2 (active)
Also Published As
Publication number | Publication date |
---|---|
JP2012531681A (en) | 2012-12-10 |
CN102103484A (en) | 2011-06-22 |
GB2483012A (en) | 2012-02-22 |
TW201131349A (en) | 2011-09-16 |
US20130185580A1 (en) | 2013-07-18 |
US8464035B2 (en) | 2013-06-11 |
US8990597B2 (en) | 2015-03-24 |
DE102010052680A1 (en) | 2011-07-07 |
KR101410634B1 (en) | 2014-06-20 |
KR20120110120A (en) | 2012-10-09 |
US9032232B2 (en) | 2015-05-12 |
CN102103484B (en) | 2015-08-19 |
GB201119728D0 (en) | 2011-12-28 |
WO2011075246A2 (en) | 2011-06-23 |
WO2011075246A3 (en) | 2011-08-18 |
JP2014222520A (en) | 2014-11-27 |
US20130246824A1 (en) | 2013-09-19 |
GB2483012B (en) | 2017-10-18 |
JP5571784B2 (en) | 2014-08-13 |
JP5795820B2 (en) | 2015-10-14 |
US20110154079A1 (en) | 2011-06-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI512448B (en) | | Instruction for enabling a processor wait state |
TWI742032B (en) | | Methods, apparatus, and instructions for user-level thread suspension |
TWI590153B (en) | | Methods for multi-threaded processing |
JP5801372B2 (en) | | Providing state memory in the processor for system management mode |
US8539485B2 (en) | | Polling using reservation mechanism |
TWI266987B (en) | | Method for monitoring locks, processor, system for monitoring locks, and machine-readable medium |
US9128781B2 (en) | | Processor with memory race recorder to record thread interleavings in multi-threaded software |
US7127561B2 (en) | | Coherency techniques for suspending execution of a thread until a specified memory access occurs |
TW201508635A (en) | | Dynamic reconfiguration of multi-core processor |
TW201508643A (en) | | Propagation of microcode patches to multiple cores in multicore microprocessor |
US8447960B2 (en) | | Pausing and activating thread state upon pin assertion by external logic monitoring polling loop exit time condition |
US9886396B2 (en) | | Scalable event handling in multi-threaded processor cores |
US20110173420A1 (en) | | Processor resume unit |
JP5474926B2 (en) | | Electric power retirement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |