JP5710712B2

JP5710712B2 - Centralized interrupt controller

Info

Publication number: JP5710712B2
Application number: JP2013172486A
Authority: JP
Inventors: ボートライト，ブライアン，デイヴィッド; クリアリー，ジェイムズ，マイケル
Original assignee: Intel Corporation
Priority date: 2013-08-22
Filing date: 2013-08-22
Publication date: 2015-04-30
Anticipated expiration: 2026-11-27
Also published as: JP2013232249A

Description

The present invention relates to electronic circuits for controlling interrupts. More particularly, the present invention relates to an Advanced Programmable Interrupt Controller (APIC) that is centralized for multiple processing units.

The processing unit that underlies a computer system's performance carries out several kinds of work, including control of the various intermittent "services" requested by peripheral devices connected to the computer system. Input/output (I/O) peripheral devices, including such computer items as printers, scanners, and display devices, require intermittent servicing by the host processor to ensure proper functioning. Services may include, for example, data transmission, data capture, and/or control signals.

Each peripheral device typically has a different servicing schedule that depends not only on the device type but also on its programmed usage. The host processor multiplexes its servicing activity among these devices according to their individual needs while running one or more background programs. At least two methods have been used for advising the host of a service need: polling and interrupts. In the polling method, each peripheral device is periodically checked to see whether a flag indicating a service request has been set. In the interrupt method, the device's service request is routed to an interrupt controller that can interrupt the host, forcing a branch from its current program to a special interrupt service routine. The interrupt method is advantageous because the host does not have to devote unnecessary clock cycles to polling. It is this latter method that the present disclosure addresses.

With the advent of multiprocessor computer systems, interrupt management systems that dynamically distribute interrupts among the processors have been implemented. The Advanced Programmable Interrupt Controller (APIC) is an example of such a multiprocessor interrupt management system. Used in many multiprocessor computer systems, the APIC interrupt delivery mechanism detects interrupt requests from other processing units or peripheral devices and notifies one or more processing units that a particular service corresponding to the interrupt request needs to be performed. Further details of the APIC interrupt delivery system are described in U.S. Pat. No. 5,283,904 to Carson et al., entitled "Multiprocessor Programmable Interrupt Controller System".

Many conventional APICs are hardware-intensive designs that require a large number of gates (i.e., a high gate count). In many multiprocessor systems, each core has its own dedicated APIC that is fully self-contained within the core. In other multiprocessor systems, each core is a simultaneous multithreading core with multiple logical processors. In such systems, each logical processor is associated with an APIC, so each multithreaded core contains multiple APIC interrupt delivery mechanisms, each of which maintains its own architectural state and implements its own control logic, generally identical to the control logic of every other APIC. For either type of multiprocessor system, the die area and leakage power costs of the multiple APICs can be undesirably large. In addition, the dynamic power cost of operating multiple APICs to deliver interrupts in a multiprocessor system can also be undesirably large.

An object of the present invention is to provide an Advanced Programmable Interrupt Controller that is centralized for a plurality of processing units.

To solve the above problem, one aspect of the present invention relates to an apparatus comprising: a single logic block, shared by a plurality of processing units, that performs prioritization and control functions for delivering interrupt messages to the plurality of processing units; an interrupt sequencer block, coupled to the logic block, that schedules interrupt events of the plurality of processing units for processing by the logic block; a storage area that maintains architectural interrupt state information for each of the plurality of processing units; one or more input message queues that receive incoming interrupt messages and place information from the messages into the storage area; and one or more output message queues that transmit outgoing interrupt messages.
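As a rough illustration of how the claimed elements relate, the following Python sketch models the storage area, the input and output message queues, and the single shared logic block. It is a software analogy under stated assumptions, not the patented hardware: the class name, the method names, the `pending` list, and the rule that a numerically larger vector has higher priority are all chosen here for illustration, and the interrupt sequencer block is reduced to the caller choosing which unit to service.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CentralizedInterruptController:
    """Hypothetical model of the apparatus; every name here is illustrative."""

    num_units: int
    # Storage area: one interrupt-state entry per processing unit.
    state: list = field(init=False)
    # One input and one output message queue (the source allows "one or more").
    input_queue: deque = field(default_factory=deque)
    output_queue: deque = field(default_factory=deque)

    def __post_init__(self):
        self.state = [{"pending": []} for _ in range(self.num_units)]

    def receive(self, dest, vector):
        """Queue an incoming interrupt message addressed to unit `dest`."""
        self.input_queue.append((dest, vector))

    def drain_input(self):
        """Move information from queued messages into the storage area."""
        while self.input_queue:
            dest, vector = self.input_queue.popleft()
            self.state[dest]["pending"].append(vector)

    def service(self, unit):
        """Shared logic block: deliver the highest-priority pending vector.

        Assumption: a numerically larger vector has higher priority.
        Returns the delivered vector, or None if nothing is pending.
        """
        pending = self.state[unit]["pending"]
        if not pending:
            return None
        vector = max(pending)
        pending.remove(vector)
        self.output_queue.append((unit, vector))
        return vector
```

In this sketch a single `service` method stands in for the one shared logic block, so adding processing units grows only the `state` list, not the logic.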

According to the present invention, an Advanced Programmable Interrupt Controller that is centralized for a plurality of processing units can be provided.

FIG. 1 is a block diagram illustrating at least one embodiment of a centralized interrupt controller that provides interrupt control for a plurality of processing units.
FIG. 2 is a block diagram illustrating further details of at least one embodiment of the centralized interrupt controller.
FIG. 3 is a block diagram illustrating various embodiments of a multi-sequencer system.
FIG. 4 is a block diagram illustrating at least one embodiment of a central repository of interrupt state for multiple cores.
FIG. 5 is a state transition diagram illustrating at least one embodiment of the operation of the interrupt sequencer block of the centralized interrupt controller.
FIG. 6 is a block diagram illustrating at least one embodiment of a computing system capable of performing the disclosed techniques.

Selected embodiments of methods, systems, and articles of manufacture for a centralized APIC for multiple processing units are described below. The mechanisms described herein may be used with single-core or multi-core multithreading systems. In the following description, numerous specific details, such as processor types, multithreading environments, system configurations, and the number and type of sequencers in a multi-sequencer system, are set forth to provide a more thorough understanding of the present invention. One skilled in the art will appreciate, however, that the invention may be practiced without such specific details. In addition, well-known structures, circuits, and the like are not described in detail so as not to unnecessarily obscure the present invention.

FIG. 1 is a block diagram illustrating at least one embodiment of a system 100 that includes a centralized interrupt controller 110. The system 100 includes a plurality of cores 104(0) through 104(n). The dashed lines and ellipses in FIG. 1 indicate that the system 100 may include any number n of cores, where n ≥ 2. One skilled in the art will recognize that an alternative embodiment of the system may include a single simultaneous multithreading (SMT) core (such that n = 1), as described below.

FIG. 1 shows that the single centralized interrupt controller 110 is physically separate from the cores 104(0) through 104(n). FIG. 1 also shows that each core 104(0) through 104(n) of the system 100 is coupled to the centralized interrupt controller 110 via a local interconnect 102. The centralized interrupt controller 110 thus interfaces with each processing core over the local interconnect 102. The high-level purpose of the centralized interrupt controller 110 is to emulate the operation of multiple APICs serially, in a manner that makes it appear to the system 100 that the APICs are operating in parallel, as they would in a conventional per-core APIC system.

A single core 104 of the system 100 may implement any of various multithreading schemes, including simultaneous multithreading (SMT), switch-on-event multithreading (SoeMT), and/or time-multiplexed multithreading (TMUX). When instructions from more than one hardware thread context (logical processor) run on the processor 304 concurrently at any given time, this is referred to as SMT. Otherwise, a single-core multithreading system may implement SoeMT, in which the processor pipeline is multiplexed among multiple hardware thread contexts but only instructions from one hardware thread context execute in the pipeline at any given time. For SoeMT, if the thread-switch event is time-based, the scheme is TMUX. Although single cores that support SoeMT and TMUX can support multithreading, they are referred to herein as "single-threaded" cores because only instructions from one hardware thread context execute at any given time.
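The three scheme names defined above can be summarized as a small decision rule. The sketch below is only a restatement of those definitions in Python; the function names and the two boolean parameters are hypothetical and do not come from the source.

```python
def classify_multithreading(simultaneous: bool, time_based_switch: bool = False) -> str:
    """Name the multithreading scheme using the definitions in the text.

    simultaneous: instructions from several hardware thread contexts
        can execute at the same time.
    time_based_switch: the thread-switch event is time-based (only
        meaningful when `simultaneous` is False).
    """
    if simultaneous:
        return "SMT"
    # Only one hardware thread context executes at a time: SoeMT,
    # which the text calls TMUX when the switch event is time-based.
    return "TMUX" if time_based_switch else "SoeMT"


def is_single_threaded_core(scheme: str) -> bool:
    """SoeMT and TMUX cores are called "single-threaded" in the text."""
    return scheme in {"SoeMT", "TMUX"}
```

Under these definitions TMUX is a special case of SoeMT, which is why both fall on the "single-threaded" side of the distinction.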

Each core 104 may be a single processing unit capable of executing a single thread. Alternatively, one or more of the cores 104 may be a multithreading core that performs SoeMT or TMUX multithreading, such that the core executes instructions for only one thread at a time. In such embodiments, the core 104 is referred to as a "processing unit".

In at least one other embodiment, each core 104 is a multithreaded core, such as an SMT core. For an SMT core 104, each logical processor of the core 104 is referred to as a "processing unit". As used herein, a "processing unit" is any physical or logical unit capable of executing a thread. Each processing unit includes next-instruction-pointer logic to determine the next instruction to be executed for a given thread. The term "processing unit" is used herein interchangeably with "sequencer".

For either embodiment (single-threaded cores or multithreaded cores), each processing unit is associated with its own interrupt controller functionality, even though the logic for that functionality is not self-contained within each processing unit but is instead provided by the centralized interrupt controller 110. If any of the cores 104 is an SMT core, each logical processor of each such core 104 is coupled to the centralized interrupt controller 110 via the local interconnect 102.

Turning briefly to FIG. 3: as stated above, a processing unit (or "sequencer") may be a logical processor or a physical processor. FIG. 3 illustrates this distinction between logical and physical processing units. FIG. 3 is a block diagram illustrating selected hardware features of embodiments 310 and 350 of a multi-sequencer system capable of performing the disclosed techniques.

FIG. 3 illustrates selected hardware features of a single-core multi-sequencer multithreading environment 310. FIG. 3 also illustrates selected hardware features of a multi-core multithreading environment 350, in which each sequencer is a separate physical processor core.

In the single-core multithreading environment 310, a single physical processor 304 is made to appear to the operating system and user programs as multiple logical processors (not shown), referred to herein as LP_1 through LP_n. Each logical processor LP_1 through LP_n maintains a complete set of architectural state AS_1 through AS_n, respectively. For at least one embodiment, the architectural state includes data registers, segment registers, control registers, debug registers, and most of the model-specific registers. The logical processors LP_1 through LP_n share most other resources of the physical processor 304, such as caches, execution units, branch predictors, control logic, and buses. Each logical processor LP_1 through LP_n is, however, associated with its own APIC.

Although many hardware features are shared, each thread context in the multithreading environment 310 can independently generate the next instruction address (and perform, for instance, a fetch from an instruction cache, an execution instruction cache, or a trace cache). Thus, the processor 304 includes logically independent next-instruction-pointer and fetch logic 320 to fetch instructions for each thread context, even though the multiple logical sequencers may be implemented in a single physical fetch/decode unit 322. For a single-core multithreading embodiment, the term "sequencer" encompasses at least the next-instruction-pointer and fetch logic 320 for a thread context, along with at least some of the associated architectural state 312 for that thread context. Note that the sequencers of a single-core multithreading system 310 need not be symmetric. For example, two single-core multithreading sequencers of the same physical core may differ in the amount of architectural state information that each maintains.

Thus, for at least one embodiment, the multi-sequencer system 310 is a single-core processor 304 that supports simultaneous multithreading. For such an embodiment, each sequencer is a logical processor having its own next-instruction-pointer and fetch logic and its own architectural state information, although the same physical processor core 304 executes all thread instructions. For such an embodiment, each logical processor maintains its own version of the architectural state, although the execution resources of the single processor core may be shared among concurrently executing threads.

FIG. 3 also illustrates at least one embodiment of a multi-core multithreading environment 350. Such an environment 350 includes two or more separate physical processors 304a through 304n, each capable of executing a different thread/shred, so that execution of at least portions of the different threads/shreds can proceed at the same time. Each processor 304a through 304n includes a physically independent fetch unit 322 to fetch instruction information for its respective thread or shred. In an embodiment where each processor 304a through 304n executes a single thread/shred, the fetch/decode unit 322 implements a single next-instruction-pointer and fetch logic 320. In an embodiment where each processor 304a through 304n supports multiple thread contexts, however, the fetch/decode unit 322 implements distinct next-instruction-pointer and fetch logic 320 for each supported thread context. The optional nature of the additional next-instruction-pointer and fetch logic 320 in the multiprocessor environment 350 is denoted by dashed lines in FIG. 3.

For at least one embodiment of the multi-core system 350 illustrated in FIG. 3, each of the sequencers may be a processor core 304, with the multiple cores 304a through 304n residing in a single chip package 360. Each core 304a through 304n may be either a single-threaded or a multithreaded processor core. The chip package 360 is drawn with a dashed line in FIG. 3 to indicate that the illustrated single-chip embodiment of the multi-core system 350 is only an example. In other embodiments, the processor cores of a multi-core system may reside on separate chips; that is, the multi-core system may be a multi-socket symmetric multiprocessing system.

For ease of discussion, the following description focuses on embodiments of the multi-core system 350. This focus should not be taken as limiting, however, because the mechanisms described below may be performed in either a multi-core or a single-core multi-sequencer environment.

Returning to FIG. 1, it can be seen that the cores 104(0) through 104(n) of the system 100 may be coupled to each other via the local interconnect 102. The local interconnect 102 provides all communication functions required among the cores (such as cache snoops). Each core 104(0) through 104(n) includes a relatively small interface block for sending and receiving interrupt-related messages over the local interconnect 102. Such an interface is generally quite simple, because it does not maintain architectural state related to interrupt messages, does not prioritize interrupts, and does not perform the other related functions that are instead performed by the centralized interrupt controller 110 described herein.

The cores 104(0) through 104(n) may reside on a single die 150(0). For at least one embodiment, the system 100 illustrated in FIG. 1 may further include optional additional dies. The optional nature of the one or more additional dies (through 150(n)) is denoted in FIG. 1 by dashed lines and ellipses. FIG. 1 shows that interrupt messages from processing units on another die (150(n)) are communicated to the first die (150(0)) via a system interconnect 106. The centralized interrupt controller 110 is coupled via the system interconnect 106 to any other dies (through 150(n)) and to peripheral I/O devices 114.

One skilled in the art will recognize that the die 150 configuration shown in FIG. 1 is only an example and should not be taken as limiting. In other embodiments, for example, the elements of both 150(0) and 150(n) may reside on the same piece of silicon and be coupled to the same local interconnect 102. Conversely, the cores 104 need not all reside on the same chip: the individual cores 104(0) through 104(n) and/or the local interconnect 102 need not reside on the same die 150.

Each core 104(0) through 104(n) of the system 100 may further be coupled via the local interconnect 102 to other system interface logic 112. Such logic 112 may include, for example, cache coherence logic or other interface logic that allows the sequencers to interface with other system elements over the system interconnect 106. The other system interface logic 112 may, in turn, be coupled to other system elements 116 (such as memory) via the system interconnect 106.

FIG. 2 is a block diagram illustrating further details of at least one embodiment of the centralized interrupt controller 110. Generally, FIG. 2 shows that, although the centralized interrupt controller 110 is physically separate from the cores of the system (such as the cores 104(0) through 104(n) of FIG. 1), it maintains the complete architectural state of each APIC instance associated with each sequencer. The centralized interrupt controller 110 manages all of the interrupt queuing and prioritization functions that are normally handled by dedicated per-core APICs in conventional systems. As described in further detail below, the centralized interrupt controller 110 may also act as a firewall between the sequencers and the rest of the system coupled to the system interconnect 106.

FIG. 2 shows that the centralized interrupt controller 110 includes a centralized APIC state 202. The APIC state 202 includes the architectural state normally associated with typical APIC operation. That is, APIC operation is an architecturally visible feature for application programmers, and that interface is not intended to be changed by the present disclosure. For at least one embodiment, whether a system includes conventional APIC hardware (i.e., one self-contained APIC for each processing unit) or the centralized interrupt controller described herein, the hardware design choice is expected to be transparent to the application programmer. In this manner, the area, dynamic power, and leakage power costs can be reduced by using a single centralized interrupt controller 110 for the system, while maintaining the same architectural interface that operating system vendors and application programmers expect.

Accordingly, the architectural state maintained as a central repository of APIC state information in block 202 is generally the state that a conventional system would maintain for each of its APICs. For example, if the system has eight sequencers, the centralized APIC state 202 may include an array of eight entries, each reflecting the architectural APIC state that a conventional system would maintain for one sequencer. (The discussion of FIG. 4 below indicates that each entry may also include certain microarchitectural state.)

For at least one embodiment, the centralized APIC state 202 is implemented as a single memory storage area, such as a register file or array. A register file organization allows better area efficiency than prior approaches that implemented the per-core APIC state as random logic.
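The eight-entry example can be pictured as a plain array indexed by sequencer. The following sketch is one possible software model, assuming each entry holds a few bitmap registers; the field names `irr`, `isr`, and `tpr` are borrowed from common APIC terminology as an illustration and are not taken from this passage, which does not enumerate the contents of an entry.

```python
from dataclasses import dataclass


@dataclass
class ApicEntry:
    """One per-sequencer slot of the central state (fields are assumed)."""

    apic_id: int
    irr: int = 0  # bitmap of requested (pending) vectors, assumed field
    isr: int = 0  # bitmap of in-service vectors, assumed field
    tpr: int = 0  # task priority value, assumed field


def make_central_state(num_sequencers: int) -> list:
    """Build the central repository: one entry per sequencer, in one array."""
    return [ApicEntry(apic_id=i) for i in range(num_sequencers)]


# Eight sequencers give an array of eight independent entries.
central_state = make_central_state(8)
central_state[3].irr |= 1 << 0x31  # mark vector 0x31 pending for sequencer 3
```

Keeping all entries in one indexed structure mirrors the register-file organization described above: the per-sequencer state is data in a single storage area rather than separate logic per core.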

Generally, the centralized interrupt controller 110 monitors for interrupt messages arriving over the local interconnect 102 and/or the system interconnect 106 and stores each message in the appropriate entry of the register file 202. For at least one embodiment, this is accomplished by examining the destination address of the incoming message and storing the message in the APIC instance entry associated with that destination address. This function is performed by the incoming message queues 204 and 206, as explained in further detail below.
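The destination-based routing just described might look like the following in software. This is a hedged sketch: the message layout (a dict with `dest` and `vector` keys), the use of plain lists as entries, and the choice to collect messages with an unknown destination instead of storing them are assumptions made for the example.

```python
from collections import deque


def drain_input_queue(queue: deque, entries: list) -> list:
    """Store each queued message in the entry selected by its destination.

    queue: incoming messages, each a dict with "dest" and "vector" keys
        (hypothetical layout).
    entries: per-sequencer lists standing in for APIC instance entries.
    Returns the messages whose destination matched no entry (assumed policy).
    """
    unmatched = []
    while queue:
        msg = queue.popleft()
        dest = msg["dest"]
        if 0 <= dest < len(entries):
            entries[dest].append(msg["vector"])
        else:
            unmatched.append(msg)
    return unmatched
```

For example, draining a queue holding messages for destinations 2, 0, and 9 into four entries stores the first two and returns the third as unmatched.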

Similarly, the centralized interrupt controller 110 monitors the generation of outgoing interrupt messages and stores each message in the appropriate entry of the register file 202 until the message has been serviced and delivered. For at least one embodiment, this is accomplished by examining the source address of the outgoing message and storing the message in the APIC instance entry associated with that source address. This function is performed by the outgoing message queues 208 and 210, as explained in further detail below.

Generally, the interrupt sequencer block 214 of the centralized interrupt controller 110 then schedules the pending interrupt messages, as reflected in the centralized APIC state 202, for servicing. As explained in further detail below, this is done according to a fairness scheme so that no sequencer's pending interrupt activity is repeatedly ignored. The interrupt sequencer block 214 invokes the APIC interrupt delivery logic 212 to perform the servicing.

FIG. 2 shows that the centralized interrupt controller 110 includes APIC interrupt delivery logic 212. Rather than replicating APIC logic for each sequencer in the system (e.g., for each single-threaded core or for each logical processor of an SMT core), the centralized interrupt controller 110 provides a single, non-redundant copy of the APIC logic 212 to service interrupts for all sequencers in the system.

For example, if a system (such as the system 100 of FIG. 1) includes four cores, each supporting eight concurrent SMT threads, a conventional design would require 32 copies of the APIC logic 212. In contrast, the centralized interrupt controller 110 shown in FIG. 2 uses a single copy of the APIC logic 212 to provide interrupt controller services for all 32 threads that may be active at a given time.

システムの複数のシーケンサが同時に保留中のインタラプト動作を有する可能性があるため、ＡＰＩＣロジック２１２は、複数のシーケンサから競合を受ける。このため、中央化されたインタラプトコントローラ１１０は、インタラプトシーケンサブロック２１４を有する。インタラプトシーケンサブロック２１４は、ＡＰＩＣロジック２１２の各シーケンサに公平なアクセスを提供するように、システムのすべてのインタラプトのサービスをシーケンス処理する。中央化されたインタラプトコントローラ１１０のインタラプトシーケンサブロック２１４は、単一のＡＰＩＣロジックブロック２１２へのアクセスを制御する。 APIC logic 212 receives contention from multiple sequencers because multiple sequencers in the system may have pending interrupt operations at the same time. For this reason, the centralized interrupt controller 110 has an interrupt sequencer block 214. The interrupt sequencer block 214 sequences the services of all interrupts in the system so as to provide fair access to each sequencer of the APIC logic 212. The interrupt sequencer block 214 of the centralized interrupt controller 110 controls access to a single APIC logic block 212.

このため、インタラプトシーケンサブロック２１４は、共有されるＡＰＩＣロジック２１２へのシーケンサのアクセスを制御する。この機能は、各シーケンサがＡＰＩＣロジックへの直接的なアドホックアクセスを有するように、各シーケンサの専用のＡＰＩＣロジックブロックを提供する従来のＡＰＩＣシステムと対照的なものである。シングルＡＰＩＣロジックブロック２１２は、システムの各処理ユニットなどのインタラプト優先化に関するＡＰＩＣの完全なアーキテクチャ要求を提供する。 Thus, interrupt sequencer block 214 controls sequencer access to the shared APIC logic 212. This is in contrast to conventional APIC systems, which provide a dedicated APIC logic block for each sequencer so that each sequencer has direct, ad hoc access to its APIC logic. The single APIC logic block 212 satisfies the full architectural requirements of an APIC, such as interrupt prioritization, for each processing unit in the system.

システムの何れかの処理ユニットについて、ＡＰＩＣを通過するインタラプトのソース／デスティネーションは、他の処理ユニット又は周辺装置とすることが可能である。イントラダイ処理ユニットのインタラプトは、ローカルインターコネクト１０２を介し中央化されたインタラプトコントローラ１１０により提供される。他のダイ上の周辺装置又は処理ユニットに対するインタラプトは、システムインターコネクト１０６を介し提供される。 For any processing unit in the system, the source / destination of the interrupt passing through the APIC can be another processing unit or a peripheral device. Intra-die processing unit interrupts are provided by a centralized interrupt controller 110 via the local interconnect 102. Interrupts to peripheral devices or processing units on other dies are provided via the system interconnect 106.

図２は、ローカルインターコネクト１０２及びシステムインターコネクト１０６を介し入出力インタラプトメッセージを処理するため、入力システムメッセージキュー２０４、入力ローカルメッセージキュー２０６、出力ローカルメッセージキュー２０８及び出力システムメッセージキュー２１０の４つのメッセージキューを有することを示す。入力ローカルメッセージキュー２０６と出力ローカルメッセージキュー２０８は、ローカルインターコネクト１０２に接続され、入力システムメッセージキュー２０４と出力システムメッセージキュー２１０は、システムインターコネクト１０６に接続される。各キュー２０４、２０６、２０８、２１０は、制御ロジックと共にデータストレージを有するミニコントローラキューである。 FIG. 2 shows that the centralized interrupt controller 110 has four message queues for processing input and output interrupt messages over the local interconnect 102 and the system interconnect 106: input system message queue 204, input local message queue 206, output local message queue 208, and output system message queue 210. Input local message queue 206 and output local message queue 208 are connected to the local interconnect 102, and input system message queue 204 and output system message queue 210 are connected to the system interconnect 106. Each queue 204, 206, 208, 210 is a mini-controller queue that has data storage together with control logic.

キュー２０４、２０６、２０８、２１０の動作のさらなる説明が、図１、２及び４を参照してなされる。図４は、中央化されたＡＰＩＣ状態２０２の少なくとも１つの実施例の詳細な図を提供する。図４は、中央化されたＡＰＩＣ状態２０２が、アーキテクチャ状態３０２と共にマイクロアーキテクチャ状態３０１、３０３の両方を有することを示す。上述されるように、各シーケンサ１０４（０）〜１０４（ｎ）について維持されるアーキテクチャ状態３０２は、シーケンサに関連付けされるＡＰＩＣ状態を反映する。アーキテクチャＡＰＩＣ状態３０２の各エントリ４１０は、ここでは“ＡＰＩＣインスタンス”と呼ばれる。例えば、ＡＰＩＣインスタンスの入力インタラプトメッセージは、当該インスタンスに係るアーキテクチャＡＰＩＣ状態３０２のエントリ４１０に格納される。少なくとも１つの実施例について、２４０までの入力インタラプトメッセージが、ＡＰＩＣインスタンスのエントリ４１０に維持される。 The operation of the queues 204, 206, 208, 210 is described further with reference to FIGS. 1, 2 and 4. FIG. 4 provides a detailed view of at least one embodiment of the centralized APIC state 202. FIG. 4 shows that the centralized APIC state 202 has both architecture state 302 and microarchitecture state 301, 303. As described above, the architecture state 302 maintained for each sequencer 104(0)-104(n) reflects the APIC state associated with that sequencer. Each entry 410 of the architecture APIC state 302 is referred to herein as an “APIC instance”. For example, input interrupt messages for an APIC instance are stored in the entry 410 of the architecture APIC state 302 associated with that instance. For at least one embodiment, up to 240 input interrupt messages are maintained in the entry 410 of an APIC instance.

アーキテクチャ状態３０２に加えて、中央化されたＡＰＩＣ状態２０２は、一般的なマイクロアーキテクチャ状態３０３と共に各ＡＰＩＣインスタンス４１０に係るマイクロアーキテクチャ状態３０１を有する。一般的なマイクロアーキテクチャ状態３０３は、インタラプトシーケンサブロック２１４（図２を参照されたい）がＡＰＩＣロジック２１２（図２を参照されたい）にアクセスする必要があるのは何れのシーケンサであるか決定するのに役立つためのスコアボード３０５を有する。少なくとも１つの実施例では、スコアボード３０５は、システムの各シーケンサのためのビットを維持する。シーケンサのビットの値は、ＡＰＩＣロジック２１２が要求される何れかの保留中の動作をシーケンサが有しているか示す。少なくとも１つの実施例では、インタラプトシーケンサブロック２１４（図２）がＡＰＩＣロジック２１２へのアクセスを何れのシーケンサが必要とするか容易かつ迅速に確認することが可能となるように、スコアボード３０５がアトミックに読み込まれる。 In addition to the architecture state 302, the centralized APIC state 202 has a microarchitecture state 301 associated with each APIC instance 410, along with a general microarchitecture state 303. The general microarchitecture state 303 includes a scoreboard 305 that helps the interrupt sequencer block 214 (see FIG. 2) determine which sequencers need access to the APIC logic 212 (see FIG. 2). In at least one embodiment, scoreboard 305 maintains one bit for each sequencer in the system. The value of a sequencer's bit indicates whether that sequencer has any pending operation that requires the APIC logic 212. In at least one embodiment, the scoreboard 305 is read atomically so that the interrupt sequencer block 214 (FIG. 2) can easily and quickly determine which sequencers need access to the APIC logic 212.
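
The scoreboard described above can be pictured as a bit vector with one bit per sequencer. The following is a minimal software sketch, not the patented hardware: the class name `Scoreboard` and its methods are hypothetical, and reading a single Python integer merely stands in for the atomic read mentioned in the text.

```python
from typing import List


class Scoreboard:
    """Hypothetical model of scoreboard 305: one pending-work bit per sequencer."""

    def __init__(self, num_sequencers: int) -> None:
        self.num_sequencers = num_sequencers
        self.bits = 0  # bit i set => sequencer i has pending APIC work

    def _check(self, seq: int) -> None:
        if not 0 <= seq < self.num_sequencers:
            raise IndexError(f"sequencer {seq} out of range")

    def set(self, seq: int) -> None:
        """Mark sequencer `seq` as having pending work."""
        self._check(seq)
        self.bits |= 1 << seq

    def clear(self, seq: int) -> None:
        """Mark sequencer `seq` as having no pending work."""
        self._check(seq)
        self.bits &= ~(1 << seq)

    def snapshot(self) -> int:
        """Return all bits at once (stands in for the atomic read)."""
        return self.bits

    def pending(self) -> List[int]:
        """List the sequencers whose bit is set in one snapshot."""
        snap = self.snapshot()
        return [i for i in range(self.num_sequencers) if (snap >> i) & 1]
```

With 32 sequencers, setting bits 3 and 17 and then calling `pending()` yields those two indices in ascending order.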

インタラプトシーケンサブロック２１４の１つの特徴は、ＡＰＩＣロジック２１２へのアクセスを公平に可能にすることであるが、スコアボード３０５は、インタラプトシーケンサブロック２１４がＡＰＩＣロジック２１２の処理を同時には必要としないシーケンサの処理リソースを浪費することを要求することなく、公平性スキームが利用されることを可能にする。このため、スコアボードは、入力メッセージと上記発行されたリクエストの処理の現在の状態とに基づき、何れのＡＰＩＣインスタンスが実行するべき作業を有しているか追跡する。インタラプトシーケンサブロック２１４は、アクティブなＡＰＩＣインスタンスについて中央化されたＡＰＩＣ状態２０２から現在の状態を読み込み、（当該ＡＰＩＣインスタンス４１０のアーキテクチャ状態３０２とマイクロアーキテクチャ状態３０１の両方に記録されるような）現在状態に適したアクションをとり、その後に（スコアボード３０５のビットにより示されるような）保留中の作業による次のＡＰＩＣインスタンスについて当該処理を繰り返す。 One feature of the interrupt sequencer block 214 is that it allows fair access to the APIC logic 212; the scoreboard 305 allows this fairness scheme to be applied without requiring the interrupt sequencer block 214 to waste processing resources on sequencers that do not currently need the APIC logic 212. To this end, the scoreboard tracks which APIC instances have work to perform, based on input messages and the current state of processing of previously issued requests. The interrupt sequencer block 214 reads the current state of the active APIC instance from the centralized APIC state 202, takes the action appropriate to that current state (as recorded in both the architecture state 302 and the microarchitecture state 301 of the APIC instance 410), and then repeats the process for the next APIC instance with pending work (as indicated by the bits of the scoreboard 305).

入力インタラプトメッセージがローカルインターコネクト１０２を介し同一ダイ上の他のシーケンサに向けて投入されると、入力ローカルメッセージキュー２０６がこのメッセージを受信し、それの送信先を決定する。インタラプトメッセージは、シーケンサの１つ、多数又はすべてを対象とすることが可能であり、又はその何れも対象としないこともある。キュー２０６は、インタラプトをキューアップするため、対象とされる各シーケンサのアーキテクチャ状態エントリ（図４の４１０など）に書き込む。このようなケースでは、インタラプト動作が保留され、シングルＡＰＩＣロジックブロック２１２のサービスが対象となるシーケンサに必要とされていることを示すため、スコアボードエントリがすでに設定されていない場合、キュー２０６はまた対象とされるシーケンサのスコアボードエントリを設定する。 When an input interrupt message is submitted to another sequencer on the same die via the local interconnect 102, the input local message queue 206 receives this message and determines its destination. The interrupt message can be targeted to one, many or all of the sequencers, or none of them. The queue 206 writes to each sequencer's architecture state entry (such as 410 in FIG. 4) to queue up the interrupt. In such a case, if the scoreboard entry has not already been set to indicate that the interrupt action is pending and the service of the single APIC logic block 212 is required for the target sequencer, the queue 206 will also Set the scoreboard entry for the target sequencer.
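
The queuing step described above (write the interrupt into each targeted sequencer's architecture state entry, then set that sequencer's scoreboard bit) can be sketched as follows. This is an illustrative model under stated assumptions: the names `CentralizedApicState` and `enqueue_incoming` are hypothetical, and representing an entry as a set of pending interrupt numbers is a simplification chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class CentralizedApicState:
    """Hypothetical stand-in for centralized APIC state 202."""
    num_sequencers: int
    pending: List[Set[int]] = field(default_factory=list)  # one entry 410 per sequencer
    scoreboard: int = 0                                    # one bit per sequencer

    def __post_init__(self) -> None:
        self.pending = [set() for _ in range(self.num_sequencers)]


def enqueue_incoming(state: CentralizedApicState,
                     targets: Iterable[int],
                     interrupt: int) -> None:
    """Queue `interrupt` for every targeted sequencer and flag pending work."""
    for seq in targets:
        state.pending[seq].add(interrupt)  # write the architecture state entry
        state.scoreboard |= 1 << seq       # no-op if the bit is already set
```

A message that targets no sequencer passes an empty `targets` iterable and leaves both the entries and the scoreboard unchanged, matching the "none of them" case in the text.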

しかしながら、図４は、いくつかのインタラプトが中央化されたＡＰＩＣ状態２０２においてキューアップされることなく、入力ローカルメッセージキュー２０６から出力キュー２０８、２１０に直接的にバイパスされることを示す。これは、例えば、特定のプロセッサに具体的にはアドレス指定されていないブロードキャストメッセージについて起こるかもしれない。図４は、同様のバイパス処理が入力システムメッセージキュー２０４（後述される）からもまた行われることを示す。 However, FIG. 4 shows that some interrupts are bypassed directly from the input local message queue 206 to the output queues 208, 210 without being queued up in the centralized APIC state 202. This may occur, for example, for broadcast messages that are not specifically addressed to a particular processor. FIG. 4 shows that a similar bypass process is also performed from the input system message queue 204 (described below).

キュー２０６について上述されたものと同様の処理がまた、入力インタラプトメッセージがシステムインターコネクト１０６を介し（他のダイ上のＩ／Ｏ装置又はシーケンサから）シーケンサ１０４（０）〜１０４（ｎ）の１つを対象として投入されるときに行われる。入力システムメッセージキュー２０４は、メッセージを受信し、それの送信先を決定する。キュー２０６は、インタラプトをキューアップし、何れか対象とされたシーケンサのスコアボードエントリ４１２を更新するため、対象とされる各シーケンサのアーキテクチャ状態エントリ４１０に書き込む。もちろん、入力メッセージは上述されるようにバイパスされる。 Processing similar to that described above for queue 206 also occurs when an input interrupt message arrives over the system interconnect 106 (from an I/O device or a sequencer on another die) targeting one of the sequencers 104(0)-104(n). The input system message queue 204 receives the message and determines its destination. The queue 206 writes to the architecture state entry 410 of each targeted sequencer to queue up the interrupt, and updates the scoreboard entry 412 of any targeted sequencer accordingly. Of course, the input message may instead be bypassed as described above.

メッセージキュー２０４、２０６、２０８、２１０の１以上が、入出力メッセージのファイアウォール機能を実装してもよい。このファイアウォール機能に関して、図２が図１と共に説明される。 One or more of the message queues 204, 206, 208, 210 may implement an input / output message firewall function. With respect to this firewall function, FIG. 2 is described together with FIG.

入力メッセージに関して、入力システムメッセージキュー２０４は、中央化されたインタラプトコントローラ１１０に係るダイ１５０上のシーケンサを対象としないメッセージの不要な処理を回避するため、インタラプトファイアウォールとして機能するかもしれない。図１に示されるように、システム１００は複数のマルチシーケンサダイ１５０（０）〜１５０（ｎ）を有する。特定のダイのシーケンサにより生成されるインタラプトは、システムインターコネクト１０６を介し他のダイに送信される。同様に、周辺装置１１４により生成されるインタラプトは、システムインターコネクト１０６を介しダイに送信される。 For input messages, the input system message queue 204 may function as an interrupt firewall to avoid unnecessary processing of messages that are not targeted to the sequencer on the die 150 associated with the centralized interrupt controller 110. As shown in FIG. 1, the system 100 has a plurality of multi-sequencer dies 150 (0) -150 (n). Interrupts generated by a particular die's sequencer are sent to other dies via the system interconnect 106. Similarly, interrupts generated by peripheral device 114 are transmitted to the die via system interconnect 106.

ダイ１５０の中央化されたインタラプトコントローラ１１０（及び入力システムメッセージキュー２０４）は、このようなメッセージの送信先アドレスがダイ１５０上の何れかのシーケンサ（コア又は論理プロセッサなど）を有しているか判断する。メッセージが当該ダイに係るローカルインターコネクト１０２上の何れかのコア又は論理プロセッサを対象としていない場合、入力システムメッセージキュー２０４は、ローカルインターコネクト１０２上のシーケンサの何れかにメッセージを転送することを拒否する。このように、入力システムメッセージキューは、単に何れのアクションも不要であると決定するため、上記コア／スレッドの“ウェイキング（ｗａｋｉｎｇ）”を回避する。これは、電力を節約し、ローカルインターコネクト１０２の帯域幅を節約する。なぜなら、それは複数の各シーケンサがメッセージがそれらを対象としたものでないことを決定するためだけに、パワーセービング状態からウェイクアップすることを不要にするためである。 The centralized interrupt controller 110 of a die 150 (specifically, its input system message queue 204) determines whether the destination address of such a message includes any sequencer (such as a core or logical processor) on the die 150. If the message does not target any core or logical processor on the local interconnect 102 of that die, the input system message queue 204 declines to forward the message to any of the sequencers on the local interconnect 102. In this way, the input system message queue avoids “waking” cores or threads merely so that they can determine that no action is required. This saves power and conserves bandwidth on the local interconnect 102, because the sequencers need not wake from a power-saving state only to determine that the message is not intended for them.
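
The firewall decision described above amounts to intersecting a message's destinations with the set of sequencers on the local die. The sketch below assumes that destinations and local sequencers can both be expressed as sets of integer identifiers; that encoding, and the function names `local_targets` and `should_forward`, are hypothetical and not taken from the patent.

```python
from typing import FrozenSet, Iterable, Set


def local_targets(destinations: Iterable[int],
                  local_ids: FrozenSet[int]) -> Set[int]:
    """Return the destinations that are sequencers on this die."""
    return set(destinations) & local_ids


def should_forward(destinations: Iterable[int],
                   local_ids: FrozenSet[int]) -> bool:
    """Forward onto the local interconnect only if a local sequencer is targeted."""
    return bool(local_targets(destinations, local_ids))
```

When `should_forward` is false the message is dropped at the queue, so no local sequencer is woken or interrupted just to discover that it was not a target.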

１以上の論理プロセッサがパワーセービング状態にない場合でさえ、入力システムメッセージキュー２０４は、入力インタラプトメッセージがそれらの側でアクションを必要としないことを単に決定するため、論理プロセッサが現在実行している作業から論理プロセッサをインタラプトしないように、ファイアウォール機能を実行する。 Even if one or more logical processors are not in a power saving state, the input system message queue 204 is currently executing by the logical processor to simply determine that the input interrupt message requires no action on their part. The firewall function is executed so that the logical processor is not interrupted from the work.

少なくとも１つの実施例について、ファイアウォールはまた出力メッセージに対して実装される。これは、出力システムメッセージと共に、少なくとも１つの実施例については、出力ローカルメッセージについても成り立つ。少なくとも１つの実施例について、ローカルメッセージのファイアウォール構成は、ローカルインターコネクト１０２上の各メッセージがすべてのシーケンサに配信されることを要求するのでなく、対象とされるインタラプトメッセージが特定のシーケンサに提供されることを可能にする構成をサポートするローカルインターコネクト１０２を有するシステムについてのみ実現される。このようなケースでは、出力ローカルメッセージキュー２０８は、ローカルインターコネクト１０２を介し各インタラプトメッセージをユニキャスト又はマルチキャストメッセージとして、当該メッセージにより対象とされるシーケンサのみに送信する。このように、対象とされていないシーケンサは、それらのアクションが当該インタラプトメッセージに必要とされていないことを決定するため、それらの処理をインタラプトする必要はない。出力システムメッセージは、それらが対象とされていないエンティティに不必要に送信されないように、同様に対象とされる。 For at least one embodiment, a firewall is also implemented for outgoing messages. This is true for output local messages as well as output system messages for at least one embodiment. For at least one embodiment, the local message firewall configuration does not require that each message on the local interconnect 102 be delivered to all sequencers, but the targeted interrupt message is provided to a particular sequencer. It is implemented only for systems having a local interconnect 102 that supports a configuration that enables this. In such a case, the output local message queue 208 transmits each interrupt message as a unicast or multicast message via the local interconnect 102 only to the sequencer targeted by the message. Thus, sequencers that are not targeted do not need to interrupt their processing to determine that their actions are not required in the interrupt message. Output system messages are similarly targeted so that they are not unnecessarily sent to entities that are not targeted.

図２は、入力インタラプトメッセージが入力メッセージキュー２０４、２０６により中央化されたＡＰＩＣ状態２０２に配置された後、インタラプトシーケンサブロック２１４が、システムのＡＰＩＣ処理を実行するため、ＡＰＩＣロジック２１２（図２を参照されたい）の１つのコピーへのシステムの各シーケンサ間への公平なアクセスを提供することを示す。インタラプトシーケンサブロック２１４は、実際には、ＡＰＩＣ状態２０２をシーケンシャルにトラバースし、それを必要とする次のシーケンサのＡＰＩＣロジック２１２へのアクセスを提供することによって、この公平性スキームを実現する。インタラプトシーケンサブロック２１４により実現される公平性スキームは、各シーケンサがインタラプト提供ブロックへの等しいアクセスを有することを許可する。 FIG. 2 shows that, after input interrupt messages have been placed in the centralized APIC state 202 by the input message queues 204, 206, the interrupt sequencer block 214 gives each sequencer in the system fair access to the single copy of the APIC logic 212 (see FIG. 2) in order to perform APIC processing for the system. In effect, the interrupt sequencer block 214 implements this fairness scheme by traversing the APIC state 202 sequentially and granting access to the APIC logic 212 to the next sequencer that needs it. The fairness scheme implemented by the interrupt sequencer block 214 gives each sequencer equal access to the interrupt providing block.

少なくとも１つの実施例について、ＡＰＩＣ状態２０２のエントリを介した上記概念的なシーケンシャルなステッピングは、スコアボード（図４の３０５を参照されたい）を利用することによってより効率的なものとなり、スコアボードは、何れのアクティブなシーケンサが次にＡＰＩＣサービスを必要とするか決定するため、アトミックにクエリされる。少なくとも１つの実施例について、図５に関連してより詳細に以下で説明される方法に従って、シーケンシャルなアクセスが制御可能である。 For at least one embodiment, the conceptual sequential stepping via the APIC state 202 entry is made more efficient by utilizing a scoreboard (see 305 in FIG. 4). Are queried atomically to determine which active sequencer next needs the APIC service. For at least one embodiment, sequential access can be controlled according to the method described in more detail below in connection with FIG.

図５は、システムのＡＰＩＣ処理を実行するために、ＡＰＩＣロジック２１２（図２を参照されたい）の１つのコピーへのシステムの各シーケンサへの公平なアクセスを提供するため、インタラプトシーケンサブロック２１４（図２を参照されたい）の少なくとも１つの実施例により利用される方法５００を示す状態図である。図５の以下の説明は、図２及び４を参照してなされる。 FIG. 5 is a state diagram illustrating a method 500 used by at least one embodiment of the interrupt sequencer block 214 (see FIG. 2) to give each sequencer in the system fair access to the single copy of the APIC logic 212 (see FIG. 2) in order to perform APIC processing for the system. The following description of FIG. 5 is made with reference to FIGS. 2 and 4.

全体として、図５は、インタラプトシーケンサブロック２１４がアクティブなＡＰＩＣインスタンスの中央化されたＡＰＩＣ状態２０２から現在状態を読み取り、現在状態に適したアクションをとり、保留中の作業を有する次のＡＰＩＣインスタンスについて当該処理を繰り返すことを示す。 Overall, FIG. 5 shows that the interrupt sequencer block 214 reads the current state of the active APIC instance from the centralized APIC state 202, takes the action appropriate to that current state, and repeats the process for the next APIC instance that has pending work.

図５は、方法５００が状態５０２でスタートすることを示す。状態５０２において、インタラプトシーケンサブロック２１４は、何れのＡＰＩＣインスタンスが実行すべき作業を有しているか決定するため、スコアボード３０５を照会する。上述されるように、各ＡＰＩＣインスタンスについてスコアボード３０５に１つのエントリ４１２が存在する。エントリ４１２は、少なくとも１つの実施例では、１ビットエントリである。ビット４１２は、入力メッセージが当該ＡＰＩＣインスタンスの中央化されたＡＰＩＣ状態２０２に書き込まれるときに設定される。 FIG. 5 shows that the method 500 starts at state 502. In state 502, interrupt sequencer block 214 queries scoreboard 305 to determine which APIC instances have work to perform. As described above, there is one entry 412 in the scoreboard 305 for each APIC instance. The entry 412 is a 1-bit entry in at least one embodiment. Bit 412 is set when an incoming message is written to the centralized APIC state 202 of the APIC instance.

もちろん、当業者は、スコアボード３０５がすべての実施例に必ずしも存在する必要がないパフォーマンスエンハンスメントであることを認識するであろう。少なくとも他の１つの実施例について、例えば、インタラプトシーケンサブロック２１４は、何れのアクティブなＡＰＩＣインスタンスがサービスを必要としているか決定するため、順番に（シーケンシャルなど）中央化されたＡＰＩＣ状態２０２の各エントリをトラバースする。 Of course, those skilled in the art will recognize that the scoreboard 305 is a performance enhancement that need not be present in all embodiments. For at least one other embodiment, for example, the interrupt sequencer block 214 may query each entry in the centralized APIC state 202 in order (eg, sequential) to determine which active APIC instance requires service. Traverse.

スコアボード３０５の何れのビットも設定されていない場合、何れのシーケンサも保留中のＡＰＩＣイベントを有していない。このようなケースでは、方法５００は状態５０２から５０８に移行する。状態５０８において、方法５００は、ロジック２１２が必要としない間、ＡＰＩＣロジックブロック２１２の少なくとも一部を省電力にパワーダウンする。パワーダウンが完了すると、方法５００は、新たなＡＰＩＣ動作が検出されるか判断するため、状態５０２に移行する。 If none of the scoreboard 305 bits are set, no sequencer has a pending APIC event. In such a case, method 500 transitions from state 502 to 508. In state 508, method 500 powers down at least a portion of APIC logic block 212 to power savings while logic 212 is not required. Once power down is complete, method 500 transitions to state 502 to determine if a new APIC operation is detected.

状態５０２において、新たな動作が検出されず（すなわち、スコアボード３０５の何れのエントリも設定されていない）、ＡＰＩＣロジック２１２がすでにパワーダウンされていた場合、方法５００は、新たなＡＰＩＣ動作を待機するため、状態５０２から５０６に移行する。 In state 502, if no new action is detected (ie, no entry in scoreboard 305 is set) and APIC logic 212 has already been powered down, method 500 waits for a new APIC action. Therefore, the state 502 is shifted to 506.

待機状態５０６の期間中、方法５００は、何れかのＡＰＩＣインスタンスが保留中のＡＰＩＣ作業を取得したか決定するため、スコアボード３０５のコンテンツを定期的に評価する。スコアボード３０５のコンテンツに反映されるような入力ＡＰＩＣメッセージは、状態５０６から５０２への移行を引き起こす。入力ローカルメッセージキュー２０６及び入力システムメッセージキュー２０４の上記説明は、アーキテクチャＡＰＩＣ状態３０２と、少なくとも一部の実施例では、スコアボード３０５のエントリが、保留中のＡＰＩＣ作業をＡＰＩＣインスタンスが取得したことを反映するようどのように更新されるか説明する。 During the wait state 506, the method 500 periodically evaluates the contents of the scoreboard 305 to determine whether any APIC instance has acquired pending APIC work. An input APIC message, as reflected in the contents of the scoreboard 305, causes a transition from state 506 to 502. The above description of the input local message queue 206 and the input system message queue 204 explains how the architecture APIC state 302 and, in at least some embodiments, the entries of the scoreboard 305 are updated to reflect that an APIC instance has acquired pending APIC work.

方法５００は、状態５０２において、スコアボード３０５の何れかのエントリ４１２が設定されている場合、少なくとも１つのＡＰＩＣインスタンスが実行すべき保留中のＡＰＩＣ作業を有していると判断する。このようなエントリが複数設定されている場合、インタラプトシーケンサブロック２１４は、何れのＡＰＩＣインスタンスがＡＰＩＣロジック２１２によりサービスを次に受け付けるか決定する。少なくとも１つの実施例では、インタラプトシーケンサブロック２１４は、設定されている次のスコアボードエントリを選択することによって、上記決定を実行する。このように、インタラプトシーケンサブロック２１４は、ＡＰＩＣロジック２１２へのアクセスのため、次にアクティブなＡＰＩＣインスタンスをシーケンシャルに選択することによって、公平性スキームを課す。 Method 500 determines that at least one APIC instance has pending APIC work to perform if any entry 412 of scoreboard 305 is set in state 502. If a plurality of such entries are set, the interrupt sequencer block 214 determines which APIC instance will receive the service next by the APIC logic 212. In at least one embodiment, interrupt sequencer block 214 performs the determination by selecting the next set scoreboard entry. In this manner, interrupt sequencer block 214 imposes a fairness scheme by sequentially selecting the next active APIC instance for access to APIC logic 212.
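
Selecting "the next scoreboard entry that is set" is a round-robin scan that starts just past the instance served last and wraps around. The following is a minimal sketch of that selection; the function name `next_pending` and the `last_served` parameter are hypothetical, since the text does not say how the sequencer block remembers its position.

```python
from typing import Optional


def next_pending(scoreboard: int, num_sequencers: int,
                 last_served: int) -> Optional[int]:
    """Return the next APIC instance after `last_served` whose bit is set.

    The scan wraps around, so the instance served last is considered only
    after every other instance has had its turn. Returns None when no bit
    is set.
    """
    for offset in range(1, num_sequencers + 1):
        candidate = (last_served + offset) % num_sequencers
        if (scoreboard >> candidate) & 1:
            return candidate
    return None
```

Because the scan always begins after the previously served instance, an instance that keeps receiving new work cannot starve the others.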

状態５０２においてＡＰＩＣインスタンスを選択すると、方法５００はブロック５０２から５０４に移行する。ブロック５０４において、インタラプトシーケンサブロック２１４は、中央化されたＡＰＩＣ状態３０２から選択されたバーチャルＡＰＩＣのエントリ４１０を読み込む。このように、インタラプトシーケンサブロック２１４は、選択されたＡＰＩＣインスタンスについて何れのＡＰＩＣイベントが保留中であるか決定する。複数のＡＰＩＣイベントが保留中であるかもしれず、ＡＰＩＣエントリ４１０に反映される。状態５０４の各繰り返し中、ＡＰＩＣインスタンスに１つの保留中のイベントしか処理されない。このため、ラウンドロビンタイプのシーケンシャルな公平性スキームが維持される。 Upon selecting an APIC instance in state 502, method 500 transitions from block 502 to 504. At block 504, the interrupt sequencer block 214 reads the selected virtual APIC entry 410 from the centralized APIC state 302. Thus, interrupt sequencer block 214 determines which APIC events are pending for the selected APIC instance. Multiple APIC events may be pending and are reflected in APIC entry 410. During each iteration of state 504, only one pending event is processed for the APIC instance. This maintains a round robin type sequential fairness scheme.

同一のアクティブなＡＰＩＣインスタンスについて複数の保留中のインタラプトイベントから選択するため、インタラプトシーケンサブロック２１４は、状態５０４において優先化処理を実行する。このような優先化処理は、従来システムの専用ＡＰＩＣにより実行される優先化スキームをエミュレートする。例えば、ＡＰＩＣインタラプトは、重要性の各クラスに入るよう規定される。各ＡＰＩＣインスタンスのアーキテクチャ状態エントリ４１０（図４）は、少なくとも１つの実施例では、論理プロセッサ毎に２４０までの保留中のインタラプトを保持する。これらは、１６の重要性のクラスに属することが可能であり、１６の優先化されたグループに分類される。クラス１６〜３１のインタラプトは、クラス３２〜４７のものより高い優先度を有する。インタラプトクラス番号が低いほど、インタラプト優先度は高くなる。このため、インタラプトシーケンサブロック２１４は、ＡＰＩＣインスタンスの２４０ビットを観察し、複数設定されている場合、状態５０４において１つのイベントのみを抽出する（ＡＰＩＣの既存のアーキテクチャ優先化ルールに基づき）。少なくとも１つの実施例では、インタラプトシーケンサブロック２１４は、この優先化を実行するためＡＰＩＣロジック２１２を呼び出す。 To select among multiple pending interrupt events for the same active APIC instance, the interrupt sequencer block 214 performs prioritization in state 504. This prioritization emulates the prioritization scheme performed by the dedicated APICs of conventional systems. For example, APIC interrupts are defined to fall into classes of importance. The architecture state entry 410 (FIG. 4) of each APIC instance holds, in at least one embodiment, up to 240 pending interrupts per logical processor. These can belong to 16 classes of importance and are thus sorted into 16 prioritized groups. Interrupts in class 16-31 have higher priority than those in class 32-47. The lower the interrupt class number, the higher the interrupt priority. Accordingly, the interrupt sequencer block 214 examines the 240 bits of the APIC instance and, if more than one is set, selects only one event in state 504 (based on the existing architectural prioritization rules of the APIC). In at least one embodiment, interrupt sequencer block 214 invokes the APIC logic 212 to perform this prioritization.
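
The per-instance prioritization can be illustrated with a small selection function that follows the ordering stated in this description (a lower class number means higher priority). Several details here are assumptions made for the example only: pending interrupts are numbered 16 to 255 (240 values), a class is a block of 16 consecutive numbers, and ties within a class go to the lowest number. The name `select_event` is hypothetical.

```python
from typing import Iterable, Optional

FIRST_INTERRUPT = 16   # assumed lowest interrupt number held in entry 410
LAST_INTERRUPT = 255   # assumed highest; 16..255 gives 240 values
CLASS_SIZE = 16        # assumed width of one priority class


def priority_class(interrupt: int) -> int:
    """Map an interrupt number to its class (16-31 -> 1, 32-47 -> 2, ...)."""
    return interrupt // CLASS_SIZE


def select_event(pending: Iterable[int]) -> Optional[int]:
    """Pick exactly one pending interrupt, lowest class number first."""
    valid = [i for i in pending if FIRST_INTERRUPT <= i <= LAST_INTERRUPT]
    if not valid:
        return None
    # Lower class wins; within a class the lowest number wins (assumed tie-break).
    return min(valid, key=lambda i: (priority_class(i), i))
```

Only one event is returned per call, which matches the rule that each pass through state 504 handles a single pending event for the selected APIC instance.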

その後、方法５００は、状態５０４において、選択されたイベントに適したアクションをスケジューリング又は実行する。例えば、このイベントは、出力メッセージキューの１つから以前に送信されたインタラプトメッセージに対するアクノリッジメントを待機していることである。あるいは、イベントは、出力インタラプトメッセージが送信される必要があることである。または、シーケンサの１つについて、入力インタラプトメッセージ又はアクノリッジメントがサービスされる必要がある。インタラプトシーケンサブロック２１４は、状態５０４において、イベントのサービスのためＡＰＩＣロジック２１２を起動する。 Thereafter, the method 500 schedules or performs actions appropriate for the selected event in state 504. For example, this event is waiting for an acknowledgment for an interrupt message previously sent from one of the output message queues. Alternatively, the event is that an output interrupt message needs to be sent. Or, for one of the sequencers, an input interrupt message or acknowledgment needs to be serviced. Interrupt sequencer block 214 activates APIC logic 212 to service the event in state 504.

アクノリッジメントが待機されている場合、インタラプトシーケンサブロック２１４は、このようなアクノリッジメントが待機されていることを判断するため、マイクロアーキテクチャ状態３０３に照会する。そうである場合、インタラプトシーケンサブロック２１４は、状態５０４においてアクノリッジメントが受信されたか判断するため、ＡＰＩＣ状態２０２の適切なエントリを照会する。そうでない場合、次のシーケンサのイベントを処理するように、状態５０４から抜け出す。 If an acknowledgment is waiting, interrupt sequencer block 214 queries microarchitecture state 303 to determine that such an acknowledgment is waiting. If so, interrupt sequencer block 214 queries the appropriate entry in APIC state 202 to determine if an acknowledgment was received in state 504. Otherwise, exit from state 504 to process the next sequencer event.

アクノリッジメントが受信された場合、アクノリッジメントがもはや待機されていないことを反映するため、マイクロアーキテクチャ状態３０３が更新される。インタラプトシーケンサブロック２１４はまた、状態５０２に戻る前にＡＰＩＣインスタンスのスコアボード３０５のエントリをクリアする。少なくとも１つの実施例では、スコアボード３０５のエントリは、現在サービスされているイベントがＡＰＩＣインスタンスの保留中のイベントのみである場合に限ってクリアされる。 If an acknowledgment has been received, the microarchitecture state 303 is updated to reflect that the acknowledgment is no longer awaited. The interrupt sequencer block 214 also clears the scoreboard 305 entry for the APIC instance before returning to state 502. In at least one embodiment, the scoreboard 305 entry is cleared only if the event currently being serviced is the only pending event for the APIC instance.

他の例では、状態５０４においてサービスされるイベントがインタラプトメッセージの送信である場合（ローカルインターコネクト１０２又はシステムインターコネクト１０６を介した）、そのようなイベントは状態５０４において以下のようにサービスされる。インタラプトシーケンサブロック２１４は、現在サービスされている論理プロセッサのＡＰＩＣインスタンスから、上述した優先化処理が与えられた場合に何れの出力メッセージが提供される必要があるか判断する。その後、出力メッセージは、所望される送信先アドレスと共に、適切な出力メッセージキュー（出力ローカルメッセージキュー２０８又は出力システムメッセージキュー２１０）への送信のためスケジューリングされる。 In another example, if the event serviced in state 504 is a transmission of an interrupt message (via the local interconnect 102 or system interconnect 106), such event is serviced in state 504 as follows. The interrupt sequencer block 214 determines which output messages need to be provided from the APIC instance of the currently serviced logical processor given the above priority processing. The output message is then scheduled for transmission to the appropriate output message queue (output local message queue 208 or output system message queue 210) along with the desired destination address.

出力メッセージが、アクノリッジメントの受信などの追加的なサービスをイベントが完全にサービスされる前に必要とする場合、中央化されたコントローラ１１０は、さらなるサービスが当該イベントに必要とされることを示すため、マイクロアーキテクチャ状態３０３を更新する。（ローカルインターコネクト１０２又はシステムインターコネクト１０６を介した入力アクノリッジメントは、入力メッセージキュー２０４、２０６においてキューアップされ、最終的には、関連するＡＰＩＣインスタンスの状態５０４の次の繰り返しで処理可能となるように、中央化されたＡＰＩＣ状態２０２に更新される。）その後、本方法は状態５０４から５０２に移行する。 If the output message requires additional service, such as receipt of an acknowledgment, before the event is fully serviced, the centralized controller 110 updates the microarchitecture state 303 to indicate that further service is required for the event. (Input acknowledgments arriving over the local interconnect 102 or the system interconnect 106 are queued up in the input message queues 204, 206 and are eventually written to the centralized APIC state 202 so that they can be processed in the next iteration of state 504 for the associated APIC instance.) The method then transitions from state 504 to 502.
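
The transitions among states 502, 504, 506 and 508 described for FIG. 5 can be summarized as a small transition function. This is a sketch of the state diagram as described in the text, not of the hardware; the function name `next_state` and the `powered_down` flag are hypothetical, and when the APIC logic is powered back up is not modeled because the text does not specify it.

```python
SCAN = 502        # query the scoreboard
SERVICE = 504     # service one event of the selected APIC instance
WAIT = 506        # wait for new APIC work
POWER_DOWN = 508  # power down at least part of the APIC logic


def next_state(state: int, scoreboard: int, powered_down: bool) -> int:
    """Return the state that follows `state` in the FIG. 5 diagram."""
    if state == SCAN:
        if scoreboard:
            return SERVICE                      # some instance has pending work
        return WAIT if powered_down else POWER_DOWN
    if state == SERVICE:
        return SCAN                             # one event handled, rescan
    if state == WAIT:
        return SCAN if scoreboard else WAIT     # leave only when work arrives
    if state == POWER_DOWN:
        return SCAN                             # power-down complete
    raise ValueError(f"unknown state {state}")
```

An embodiment that omits state 508 would simply go from 502 to 506 whenever the scoreboard is empty.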

図６は、開示された技術を実行可能なマルチスレッド計算システム９００の少なくとも１つの実施例を示す。計算システム９００は、少なくとも１つのプロセッサコア９０４（０）とメモリシステム９４０とを有する。システム９００は、破線及び楕円により示されるような追加的なコア（〜９０４（ｎ））を有する。 FIG. 6 illustrates at least one embodiment of a multi-threaded computing system 900 capable of performing the disclosed techniques. The computing system 900 includes at least one processor core 904 (0) and a memory system 940. System 900 has an additional core (˜904 (n)) as indicated by dashed lines and ellipses.

メモリシステム９４０は、命令キャッシュ９４４及び／又はデータキャッシュ９４２などの１以上の小型で相対的に高速なキャッシュと共に、大型で相対的に低速なメモリストレージ９０２を有する。メモリストレージ９０２は、プロセッサ９０４の動作を制御するための命令９１０とデータ９１２とを格納する。 The memory system 940 includes a large, relatively slow memory storage 902 with one or more small, relatively fast caches such as an instruction cache 944 and / or a data cache 942. The memory storage 902 stores instructions 910 and data 912 for controlling the operation of the processor 904.

メモリシステム９４０は、メモリの一般化された表現として意図され、ハードドライブ、ＣＤ−ＲＯＭ、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＤＲＡＭ（ＤｙｎａｍｉｃＲＡＭ）、ＳＲＡＭ（ＳｔａｔｉｃＲＡＭ）、フラッシュメモリ及び関連する回路などの各種形態のメモリを有する。メモリシステム９４０は、プロセッサ９０４により実行可能なデータ信号により表される命令９１０及び／又はデータ９１２を格納する。命令９１０及び／又はデータ９１２は、ここに記載された技術の何れか又はすべてを実行するためのコード及び／又はデータを有する。 The memory system 940 is intended as a generalized representation of memory, such as hard drives, CD-ROMs, RAM (Random Access Memory), DRAM (Dynamic RAM), SRAM (Static RAM), flash memory and related circuitry It has various types of memory. Memory system 940 stores instructions 910 and / or data 912 represented by data signals that can be executed by processor 904. Instructions 910 and / or data 912 include code and / or data for performing any or all of the techniques described herein.

図６は、各プロセッサ９０４が中央化されたインタラプトコントローラ１１０に接続されていることを示す。各プロセッサ９０４は、命令情報を実行コア９３０に供給するフロントエンド９２０を有する。フェッチされた命令情報は、実行コア９３０による実行を待機するため、キャッシュ２２５にバッファされる。フロントエンド９２０は、命令情報を実行コア９３０にプログラム順に供給する。少なくとも１つの実施例では、フロントエンド９２０は、実行される次の命令を決定するフェッチ／デコードユニット３２２を有する。システム９００の少なくとも１つの実施例では、フェッチ／デコードユニット３２２は、単一の次命令ポインタフェッチロジック３２０を有する。しかしながら、各プロセッサ９０４が複数のスレッドコンテクストをサポートする実施例では、フェッチ／デコードユニット３２２は、サポートされる各スレッドコンテクストについて異なる次命令ポインタフェッチロジック３２０を実装する。マルチプロセッサ環境において追加的な次命令ポインタフェッチロジック３２０の任意的な性質は、図６の破線により示される。 FIG. 6 shows that each processor 904 is connected to a centralized interrupt controller 110. Each processor 904 has a front end 920 that provides instruction information to the execution core 930. The fetched instruction information is buffered in the cache 225 for waiting for execution by the execution core 930. The front end 920 supplies instruction information to the execution core 930 in program order. In at least one embodiment, front end 920 includes a fetch / decode unit 322 that determines the next instruction to be executed. In at least one embodiment of system 900, fetch / decode unit 322 has a single next instruction pointer fetch logic 320. However, in embodiments where each processor 904 supports multiple thread contexts, the fetch / decode unit 322 implements different next instruction pointer fetch logic 320 for each supported thread context. The optional nature of additional next instruction pointer fetch logic 320 in a multiprocessor environment is illustrated by the dashed lines in FIG.

ここに記載された方法の各実施例は、ハードウェア、ハードウェアエミュレーションソフトウェア若しくは他のソフトウェア、ファームウェア又はこのような実現アプローチの組み合わせにより実現可能である。本発明の各実施例は、少なくとも１つのプロセッサ、データストレージシステム（揮発性及び不揮発性メモリ及び／又はストレージ要素を含む）、少なくとも１つの入力装置及び少なくとも１つの出力装置を有するプログラマブルシステムについて実現可能である。このアプリケーションのため、処理システムは、デジタル信号プロセッサ（ＤＳＰ）、マイクロコントローラ、特定用途向け集積回路（ＡＳＩＣ）、マイクロプロセッサなどのプロセッサを有する任意のシステムを含む。 Each embodiment of the method described herein may be implemented by hardware, hardware emulation software or other software, firmware, or a combination of such implementation approaches. Embodiments of the present invention can be implemented for a programmable system having at least one processor, a data storage system (including volatile and non-volatile memory and / or storage elements), at least one input device and at least one output device. It is. For this application, a processing system includes any system having a processor such as a digital signal processor (DSP), a microcontroller, an application specific integrated circuit (ASIC), or a microprocessor.

プログラムは、汎用又は特定用途のプログラム可能な処理システムにより可読なストレージメディア又は装置（ハードディスクドライブ、フロッピー（登録商標）ディスクドライブ、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＣＤ−ＲＯＭ装置、フラッシュメモリ装置、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）、又は他のストレージ装置など）に格納されてもよい。処理システムのプロセッサにアクセス可能な命令は、ストレージメディア又は装置がここに記載された手順を実行するため処理システムにより読み込まれると、処理システムを設定及び実行する。本発明の各実施例はまた、処理システムにより使用するため構成されたマシーン可読ストレージ媒体として実現されると考えられ、そのように構成されたストレージメディアは、処理システムにここに記載された機能を実行するため特定かつ所定の方法により実行させる。 The program may be a storage medium or device readable by a general purpose or special purpose programmable processing system (hard disk drive, floppy disk drive, ROM (Read Only Memory), CD-ROM device, flash memory device, DVD ( (Digital Versatile Disk) or other storage device). The instructions accessible to the processor of the processing system configure and execute the processing system when the storage medium or device is read by the processing system to perform the procedures described herein. Each embodiment of the present invention is also considered to be implemented as a machine-readable storage medium configured for use by a processing system, and the storage medium configured as such provides the functionality described herein to the processing system. It is executed by a specific and predetermined method for execution.

The exemplary system 900 represents a processing system based on the Pentium(R), Pentium(R) Pro, Pentium(R) II, Pentium(R) III, Pentium(R) 4, Itanium(R), and Itanium(R) 2 microprocessors and the Mobile Intel(R) Pentium(R) III Processor-M and Mobile Intel(R) Pentium(R) 4 Processor-M, all available from Intel Corporation, although other systems (including other microprocessors, engineering workstations, personal digital assistants, other portable devices, set-top boxes, and the like) may also be used. In one embodiment, the exemplary system runs a version of the Windows(R) operating system available from Microsoft Corporation, although other operating systems and graphical user interfaces, for example, may also be used.

While particular embodiments of the present invention have been illustrated and described, it will be apparent to those skilled in the art that various changes and modifications can be made without departing from the scope of the appended claims. For example, at least one embodiment of the centralized APIC state 202 includes only a single read port and a single write port. In such an embodiment, the input system message queue 204, the input local message queue 206, and the interrupt sequencer block 214 use arbitration logic (not shown) to gain access to the centralized APIC state 202.
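The single-port arrangement described above can be illustrated with a minimal sketch. The text does not show the arbitration logic, so the rotating-priority policy below is purely an assumption, and the class name `PortArbiter` and the requester strings are hypothetical labels chosen to mirror reference numerals 204, 206, and 214.

```python
from collections import deque
from typing import Iterable, Optional, Set

# Names mirror the reference numerals in the text; the rotating-priority
# policy is an assumption, since the arbitration logic is not shown.
REQUESTERS = ("system_queue_204", "local_queue_206", "sequencer_214")


class PortArbiter:
    """Grants a single port to at most one requester per cycle."""

    def __init__(self, requesters: Iterable[str] = REQUESTERS) -> None:
        self._order = deque(requesters)  # front = highest priority

    def grant(self, requests: Set[str]) -> Optional[str]:
        """Return the winning requester, or None if nobody asked."""
        for name in list(self._order):
            if name in requests:
                # Demote the winner so the others are favoured next time.
                self._order.remove(name)
                self._order.append(name)
                return name
        return None


# One arbiter per port: the read port and the write port are granted
# independently of each other.
read_port = PortArbiter()
write_port = PortArbiter()
```

In this sketch, two blocks that request the same port in the same cycle are served on successive cycles, which is one simple way to keep any single requester from monopolizing the centralized APIC state 202.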

Also, for example, at least one embodiment of the method 500 shown in FIG. 5 excludes state 508. Those skilled in the art will recognize that state 508 merely provides a performance enhancement (power saving) and is not required for embodiments of the invention recited in the appended claims.

It was also noted above that, for example, at least one embodiment of the centralized interrupt controller 110 described above may exclude the scoreboard 305. In such an embodiment, the interrupt sequencer 214 sequentially traverses the entries 410 of the architectural APIC state 302 to determine the next APIC instance to receive service from the APIC logic 212.
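The scoreboard-free traversal can be sketched as follows. This is an illustrative model only: the `ApicEntry` class, its `pending` counter, and the helper names are hypothetical, and decrementing a counter merely stands in for one pass through the APIC logic 212.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ApicEntry:
    """Hypothetical stand-in for one entry 410 of architectural APIC state 302."""
    core_id: int
    pending: int = 0  # assumed count of interrupt events awaiting service


def next_instance(entries: List[ApicEntry], last: int) -> Optional[int]:
    """Scan sequentially from the entry after `last`, wrapping around once,
    and return the index of the first entry with pending work."""
    n = len(entries)
    for step in range(1, n + 1):
        idx = (last + step) % n
        if entries[idx].pending > 0:
            return idx
    return None  # nothing pending anywhere


def service_all(entries: List[ApicEntry]) -> List[int]:
    """Service pending events one at a time and return the order of
    core ids that received service."""
    order = []
    last = len(entries) - 1  # so the first scan starts at entry 0
    while (idx := next_instance(entries, last)) is not None:
        entries[idx].pending -= 1  # stands in for one pass through APIC logic 212
        order.append(entries[idx].core_id)
        last = idx
    return order
```

Because each scan resumes just past the entry serviced last, an instance with many pending events cannot starve the others; without a scoreboard, the cost is that entries with nothing pending are still visited on every scan.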

Thus, those skilled in the art will recognize that various changes and modifications can be made without departing from the present invention in its broader aspects. The appended claims are intended to encompass within their scope all such changes and modifications that fall within the true scope of the present invention.

100 system
104 core
110 interrupt controller
359 multi-core threading environment

Claims

1. An apparatus comprising:
a single logic block that performs prioritization and control functions for providing interrupt messages to a plurality of processing units;
a storage area having an array including a plurality of entries to maintain architectural interrupt state information of the plurality of processing units;
an interrupt sequencer block that is connected to the storage area, queries the storage area to determine the architectural interrupt state information of the plurality of processing units, is connected to the logic block, and schedules interrupt events for the plurality of processing units for processing by the logic block, wherein each interrupt event has one of a plurality of classes of importance and is obtained from an input interrupt message;
one or more input message queues that receive input interrupt messages via one of a system interconnect connected to other system resources and a local interconnect connected to the plurality of processing units, and place information from the input interrupt messages in the storage area; and
one or more output message queues that transmit output interrupt messages scheduled by the interrupt sequencer block via one of the system interconnect and the local interconnect.

2. The apparatus of claim 1, wherein the single logic block comprises non-redundant circuitry that provides an interrupt for each processing unit.

3. The apparatus of claim 1, wherein the interrupt sequencer block schedules interrupt events for the plurality of processing units for provision according to a fairness scheme.

4. The apparatus of claim 3, wherein the interrupt sequencer block schedules interrupt events of the plurality of processing units according to a sequential traversal of the storage area.

5. The apparatus of claim 1, further comprising a scoreboard for maintaining data regarding which of the processing units have pending interrupt events.

6. The apparatus of claim 1, wherein the storage area further stores microarchitecture state information including APIC microarchitecture state information and per-instance microarchitecture APIC state information.

7. The apparatus of claim 1, wherein the plurality of processing units communicate via a local interconnect.

8. The apparatus of claim 7, wherein the one or more input message queues comprise a message queue for receiving input interrupt messages via the local interconnect, and the one or more output message queues comprise a message queue for sending output interrupt messages over the local interconnect.

9. The apparatus of claim 7, wherein the one or more input message queues comprise a message queue for receiving input interrupt messages via the system interconnect, and the one or more output message queues comprise a message queue for transmitting output interrupt messages over the system interconnect.

10. The apparatus of claim 1, wherein the one or more output message queues further comprise firewall logic for prohibiting one or more transmissions of the output interrupt message.

11. The apparatus of claim 1, wherein the one or more input message queues further comprise firewall logic for prohibiting one or more transmissions of the input interrupt message to one or more of the processing units.

12. The apparatus of claim 1, wherein the one or more input message queues further comprise firewall logic that prohibits the transmission of one or more of the input interrupt messages to one or more of the processing units.

13. A method comprising:
querying a storage array including a plurality of entries to determine an architectural interrupt state of a plurality of processing units; and
scheduling interrupt events for the plurality of processing units for processing by a non-redundant interrupt providing block, wherein the processing by the non-redundant interrupt providing block performs prioritization and control functions for providing interrupt messages to the plurality of processing units,
wherein the scheduling is performed according to a fairness scheme that allows each processing unit equal access to the interrupt providing block.

14. The method of claim 13, wherein the interrupt providing block comprises advanced programmable interrupt controller logic.

15. The method of claim 13, wherein the fairness scheme is a sequential round robin scheme for the processing units having one or more pending interrupt events.

16. A system comprising:
a plurality of processing units that execute one or more threads;
a memory connected to the processing units; and
a shared interrupt controller that provides an interrupt providing service for the plurality of processing units,
wherein the shared interrupt controller has:
a single logic block that performs prioritization and control functions for providing interrupt messages to the plurality of processing units, the single logic block being shared among the plurality of processing units;
a storage area including an array having a plurality of entries to maintain architectural interrupt state information of the plurality of processing units;
an interrupt sequencer block that is connected to the storage area, queries the storage area to determine the architectural interrupt state information of the plurality of processing units, is connected to the logic block, and schedules interrupt events for the plurality of processing units for processing by the logic block;
one or more input message queues that receive input interrupt messages and place information from the input interrupt messages in the storage area; and
one or more output message queues that send output interrupt messages.

17. The system of claim 16, wherein the shared interrupt controller further provides an APIC interrupt providing service for the plurality of processing units.

18. The system of claim 16, wherein the processing unit does not have self-contained APIC interrupt providing logic.

19. The system of claim 16, wherein the shared interrupt controller further comprises firewall logic.

20. The system of claim 16, further comprising a local interconnect connected to the plurality of processing units.

21. The system of claim 20, wherein the shared interrupt controller further comprises firewall logic that prohibits transmission of one or more interrupt messages over the local interconnect.

22. The system of claim 16, further comprising a system interconnect connected to the shared interrupt controller.

23. The system of claim 22, wherein the shared interrupt controller further comprises firewall logic that prohibits transmission of one or more interrupt messages over the system interconnect.

24. The system of claim 16, wherein the shared interrupt controller further schedules interrupt service serially among the plurality of processing units.