JP2018085129A

JP2018085129A - Multichip package link

Info

Publication number: JP2018085129A
Application number: JP2018001692A
Authority: JP
Inventors: Zuoguo J. Wu; Mahesh Wagh; Debendra Das Sharma; Gerald S. Pasdast; Ananthan Ayyasamy; Xiaobei Li; Robert G. Blankenship; Robert J. Safranek
Original assignee: Intel Corp
Current assignee: Intel Corp
Priority date: 2018-01-10
Filing date: 2018-01-10
Publication date: 2018-05-31
Anticipated expiration: 2033-12-26
Also published as: JP6745289B2

Abstract

PROBLEM TO BE SOLVED: To provide an interconnect architecture that enables high-speed communication in a computing system. SOLUTION: Physical layer logic receives data on one or more data lanes of a physical link; receives, on another lane of the physical link, a valid signal identifying that valid data is to follow assertion of the valid signal on the one or more data lanes; and receives, on another lane of the physical link, a stream signal identifying the type of the data on the one or more data lanes. SELECTED DRAWING: Figure 7

Description

The present disclosure relates to computing systems and, in particular (but not exclusively), to point-to-point interconnects.

Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a corollary, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores, multiple hardware threads, and multiple logical processors present on individual integrated circuits, as well as other interfaces integrated within such processors. A processor or integrated circuit typically comprises a single physical processor die, where the processor die may include any number of cores, hardware threads, logical processors, interfaces, memories, controller hubs, and the like.

As a result of the greater ability to fit more processing power in smaller packages, smaller computing devices have increased in popularity. Smartphones, tablets, ultrathin notebooks, and other user equipment have grown rapidly. These smaller devices, however, rely on servers both for data storage and for complex processing that exceeds what their form factor allows. Consequently, demand in the high-performance computing market (i.e., server space) has also increased. For instance, modern servers typically contain not only a single processor with multiple cores but also multiple physical processors (also referred to as multiple sockets) to increase computing power. But as processing power grows along with the number of devices in a computing system, communication between sockets and other devices becomes more critical.

In fact, interconnects have grown from more traditional multi-drop buses that primarily handled electrical communications into full-blown interconnect architectures that facilitate fast communication. Unfortunately, as future processors are expected to consume data at even higher rates, corresponding demand is placed on the capabilities of existing interconnect architectures.

The drawings show, in order:
An embodiment of a computing system including an interconnect architecture.
An embodiment of an interconnect architecture including a layered stack.
An embodiment of a request or packet to be generated or received within an interconnect architecture.
An embodiment of a transmitter and receiver pair for an interconnect architecture.
An embodiment of a multichip package.
A simplified block diagram of a multichip package link (MCPL).
A representation of example signaling on an example MCPL.
A simplified block diagram illustrating a data lane in an example MCPL.
A simplified block diagram illustrating example crosstalk cancellation techniques in an embodiment of an MCPL.
A simplified circuit diagram illustrating example crosstalk cancellation components in an embodiment of an MCPL.
A simplified block diagram of an MCPL.
A simplified block diagram of an MCPL interfacing with upper layer logic of multiple protocols using a logical PHY interface (LPIF).
A representation of example signaling on an example MCPL in connection with a recovery of a link.
An example bit mapping of data on lanes of an example MCPL.
An example bit mapping of data on lanes of an example MCPL.
An example bit mapping of data on lanes of an example MCPL.
A representation of a portion of an example link state machine.
A representation of a flow associated with an example centering of a link.
A representation of an example link state machine.
A representation of signaling to enter a low power state.
An embodiment of a block diagram for a computing system including a multicore processor.
Another embodiment of a block diagram for a computing system including a multicore processor.
An embodiment of a block diagram for a processor.
Another embodiment of a block diagram for a computing system including a processor.
An embodiment of a block diagram for a computing system including multiple processors.
An example system implemented as a system on chip (SoC).

Like reference numbers and designations in the various drawings indicate like elements.

In the following description, numerous specific details are set forth, such as examples of specific types of processors and system configurations, specific hardware structures, specific architectural and microarchitectural details, specific register configurations, specific instruction types, specific system components, specific measurements/heights, specific processor pipeline stages, and specific operations, in order to provide a thorough understanding of the present invention. It will be apparent to one skilled in the art, however, that these specific details need not be employed to practice the present invention. In other instances, well-known components or methods, such as specific and alternative processor architectures, specific logic circuits/code for described algorithms, specific firmware code, specific interconnect operation, specific logic configurations, specific manufacturing techniques and materials, specific compiler implementations, specific expression of algorithms in code, specific power-down and power-gating techniques/logic, and other specific operational details of computing systems, have not been described in detail in order to avoid unnecessarily obscuring the present invention.

Although the following embodiments may be described with reference to energy conservation and energy efficiency in specific integrated circuits, such as computing platforms or microprocessors, other embodiments are applicable to other types of integrated circuits and logic devices. Similar techniques and teachings of the embodiments described herein may be applied to other types of circuits or semiconductor devices that may also benefit from better energy efficiency and energy conservation. For example, the disclosed embodiments are not limited to desktop computer systems or Ultrabooks™. They may also be used in other devices, such as handheld devices, tablets, other thin notebooks, system on a chip (SOC) devices, and embedded applications. Some examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, personal digital assistants (PDAs), and handheld PCs. Embedded applications typically include a microcontroller, a digital signal processor (DSP), a system on a chip, network computers (NetPC), set-top boxes, network hubs, wide area network (WAN) switches, or any other system that can perform the functions and operations taught below. Moreover, the apparatus, methods, and systems described herein are not limited to physical computing devices, but may also relate to software optimizations for energy conservation and efficiency. As will become readily apparent in the description below, the embodiments of methods, apparatus, and systems described herein (whether in reference to hardware, firmware, software, or a combination thereof) are vital to a "green technology" future balanced with performance considerations.

As computing systems advance, the components within them are becoming more complex. As a result, the interconnect architecture used to couple and communicate between the components is also increasing in complexity to ensure that bandwidth requirements are met for optimal component operation. Furthermore, different market segments demand that different aspects of interconnect architectures suit the market's needs. For example, servers require higher performance, while the mobile ecosystem is sometimes able to sacrifice overall performance for power savings. Yet the singular purpose of most fabrics is to provide the highest possible performance with maximum power saving. Below, a number of interconnects are discussed that would potentially benefit from aspects of the invention described herein.

One interconnect fabric architecture is the Peripheral Component Interconnect (PCI) Express (PCIe) architecture. A primary goal of PCIe is to enable components and devices from different vendors to interoperate in an open architecture spanning multiple market segments: clients (desktop and mobile), servers (standard and enterprise), and embedded and communication devices. PCI Express is a high-performance, general-purpose I/O interconnect defined for a wide variety of future computing and communication platforms. Some PCI attributes, such as its usage model, load-store architecture, and software interfaces, have been maintained through its revisions, whereas previous parallel bus implementations have been replaced by a highly scalable, fully serial interface. The more recent versions of PCI Express take advantage of advances in point-to-point interconnects, switch-based technology, and packetized protocols to deliver new levels of performance and features. Power management, quality of service (QoS), hot-plug/hot-swap support, data integrity, and error handling are among the advanced features supported by PCI Express.

Referring to FIG. 1, an embodiment of a fabric composed of point-to-point links that interconnect a set of components is illustrated. System 100 includes processor 105 and system memory 110 coupled to controller hub 115. Processor 105 includes any processing element, such as a microprocessor, a host processor, an embedded processor, a coprocessor, or another processor. Processor 105 is coupled to controller hub 115 through front-side bus (FSB) 106. In one embodiment, FSB 106 is a serial point-to-point interconnect as described below. In another embodiment, link 106 includes a serial, differential interconnect architecture that is compliant with a different interconnect standard.

System memory 110 includes any memory device, such as random access memory (RAM), non-volatile (NV) memory, or other memory accessible by devices in system 100. System memory 110 is coupled to controller hub 115 through memory interface 116. Examples of a memory interface include a double data rate (DDR) memory interface, a dual-channel DDR memory interface, and a dynamic RAM (DRAM) memory interface.

In one embodiment, controller hub 115 is a root hub, root complex, or root controller in a Peripheral Component Interconnect Express (PCIe or PCIE) interconnection hierarchy. Examples of controller hub 115 include a chipset, a memory controller hub (MCH), a northbridge, an interconnect controller hub (ICH), a southbridge, and a root controller/hub. Often the term chipset refers to two physically separate controller hubs, i.e., a memory controller hub (MCH) coupled to an interconnect controller hub (ICH). Note that current systems often include the MCH integrated with processor 105, while controller 115 communicates with I/O devices in a manner similar to that described below. In some embodiments, peer-to-peer routing is optionally supported through root complex 115.

Here, controller hub 115 is coupled to switch/bridge 120 through serial link 119. Input/output modules 117 and 121, which may also be referred to as interfaces/ports 117 and 121, include/implement a layered protocol stack to provide communication between controller hub 115 and switch 120. In one embodiment, multiple devices are capable of being coupled to switch 120.

Switch/bridge 120 routes packets/messages from device 125 upstream, i.e., up the hierarchy toward the root complex, to controller hub 115, and downstream, i.e., down the hierarchy away from the root controller, from processor 105 or system memory 110 to device 125. Switch 120, in one embodiment, is referred to as a logical assembly of multiple virtual PCI-to-PCI bridge devices. Device 125 includes any internal or external device or component to be coupled to an electronic system, such as an I/O device, a network interface controller (NIC), an add-in card, an audio processor, a network processor, a hard drive, a storage device, a CD/DVD ROM, a monitor, a printer, a mouse, a keyboard, a router, a portable storage device, a FireWire device, a Universal Serial Bus (USB) device, a scanner, and other input/output devices. In PCIe vernacular, such a device is often referred to as an endpoint. Although not specifically shown, device 125 may include a PCIe-to-PCI/PCI-X bridge to support legacy or other-version PCI devices. Endpoint devices in PCIe are often classified as legacy, PCIe, or root complex integrated endpoints.

Graphics accelerator 130 is also coupled to controller hub 115 through serial link 132. In one embodiment, graphics accelerator 130 is coupled to an MCH, which is coupled to an ICH. Switch 120, and accordingly I/O device 125, is then coupled to the ICH. I/O modules 131 and 118 also implement a layered protocol stack to communicate between graphics accelerator 130 and controller hub 115. Similar to the MCH discussion above, a graphics controller or the graphics accelerator 130 itself may be integrated in processor 105.

Turning to FIG. 2, an embodiment of a layered protocol stack is illustrated. Layered protocol stack 200 includes any form of layered communication stack, such as a Quick Path Interconnect (QPI) stack, a PCIe stack, a next-generation high-performance computing interconnect stack, or another layered stack. Although the discussion immediately below in reference to FIGS. 1-4 relates to a PCIe stack, the same concepts may be applied to other interconnect stacks. In one embodiment, protocol stack 200 is a PCIe protocol stack including transaction layer 205, link layer 210, and physical layer 220. An interface, such as interfaces 117, 118, 121, 122, 126, and 131 in FIG. 1, may be represented as communication protocol stack 200. Representation as a communication protocol stack may also be referred to as a module or interface implementing/including a protocol stack.

PCI Express uses packets to communicate information between components. Packets are formed in transaction layer 205 and data link layer 210 to carry the information from the transmitting component to the receiving component. As the transmitted packets flow through the other layers, they are extended with additional information necessary to handle packets at those layers. At the receiving side the reverse process occurs: packets are transformed from their physical layer 220 representation to the data link layer 210 representation and finally (for transaction layer packets) to the form that can be processed by transaction layer 205 of the receiving device.

Transaction Layer
In one embodiment, transaction layer 205 provides an interface between a device's processing core and the interconnect architecture, such as data link layer 210 and physical layer 220. In this regard, a primary responsibility of transaction layer 205 is the assembly and disassembly of packets (i.e., transaction layer packets, or TLPs). Transaction layer 205 typically manages credit-based flow control for TLPs. PCIe implements split transactions, i.e., transactions with request and response separated by time, allowing a link to carry other traffic while the target device gathers data for the response.

In addition, PCIe utilizes credit-based flow control. In this scheme, a device advertises an initial amount of credit for each of the receive buffers in transaction layer 205. An external device at the opposite end of the link, such as controller hub 115 in FIG. 1, counts the number of credits consumed by each TLP. A transaction may be transmitted if it does not exceed the credit limit. Upon receiving a response, an amount of credit is restored. An advantage of the credit scheme is that the latency of credit return does not affect performance, provided that the credit limit is not encountered.
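
The credit accounting described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class name `CreditGate`, its methods, and the idea of a per-TLP integer "cost" are hypothetical and are not taken from the disclosure or from the PCIe specification.

```python
class CreditGate:
    """Sender-side view of the credits for one receive buffer (hypothetical model)."""

    def __init__(self, advertised_credits: int) -> None:
        # Initial credit amount advertised by the receiving device.
        self.limit = advertised_credits
        # Credits consumed by TLPs whose credit has not yet been returned.
        self.consumed = 0

    def can_send(self, cost: int) -> bool:
        # A transaction may be sent only if it does not exceed the credit limit.
        return self.consumed + cost <= self.limit

    def send(self, cost: int) -> bool:
        # Count the credits consumed by this TLP; refuse if the limit would be exceeded.
        if not self.can_send(cost):
            return False
        self.consumed += cost
        return True

    def credit_returned(self, cost: int) -> None:
        # Restore credit when the response (credit return) arrives.
        self.consumed = max(0, self.consumed - cost)
```

As long as `can_send` keeps returning true, the sender never waits, which mirrors the point that credit-return latency is invisible until the limit is reached.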

In one embodiment, four transaction address spaces include a configuration address space, a memory address space, an input/output address space, and a message address space. Memory space transactions include one or more of read requests and write requests to transfer data to or from a memory-mapped location. In one embodiment, memory space transactions are capable of using two different address formats, e.g., a short address format, such as a 32-bit address, or a long address format, such as a 64-bit address. Configuration space transactions are used to access the configuration space of PCIe devices. Transactions to the configuration space include read requests and write requests. Message space transactions (or simply messages) are defined to support in-band communication between PCIe agents.
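
As a small illustration of the address spaces and the two memory address formats, the sketch below enumerates the four spaces and picks a format by address magnitude. The names `AddressSpace` and `address_format_bits`, and the rule of choosing the short format whenever the address fits in 32 bits, are assumptions made for this example; the text above only states that both formats exist.

```python
from enum import Enum


class AddressSpace(Enum):
    """The four transaction address spaces named in the text."""
    CONFIGURATION = "configuration"
    MEMORY = "memory"
    IO = "input/output"
    MESSAGE = "message"


def address_format_bits(address: int) -> int:
    """Return 32 for the short address format or 64 for the long one.

    Assumed selection rule: use the short format when the address fits in 32 bits.
    """
    if not 0 <= address < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    return 32 if address < (1 << 32) else 64
```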

Therefore, in one embodiment, transaction layer 205 assembles packet header/payload 206. The format for current packet headers/payloads may be found in the PCIe specification at the PCIe specification website.

Referring briefly to FIG. 3, an embodiment of a PCIe transaction descriptor is illustrated. In one embodiment, transaction descriptor 300 is a mechanism for carrying transaction information. In this regard, transaction descriptor 300 supports identification of transactions in a system. Other potential uses include tracking modifications of default transaction ordering and the association of transactions with channels.

Transaction descriptor 300 includes global identifier field 302, attributes field 304, and channel identifier field 306. In the illustrated example, global identifier field 302 is depicted as comprising local transaction identifier field 308 and source identifier field 310. In one embodiment, global transaction identifier 302 is unique for all outstanding requests.

According to one implementation, local transaction identifier field 308 is a field generated by a requesting agent, and it is unique for all outstanding requests that require a completion for that requesting agent. Furthermore, in this example, source identifier 310 uniquely identifies the requesting agent within a PCIe hierarchy. Accordingly, together with source ID 310, the local transaction identifier field 308 provides global identification of a transaction within a hierarchy domain.

Attributes field 304 specifies characteristics and relationships of the transaction. In this regard, attributes field 304 is potentially used to provide additional information that allows modification of the default handling of transactions. In one embodiment, attributes field 304 includes priority field 312, reserved field 314, ordering field 316, and no-snoop field 318. Here, priority subfield 312 may be modified by an initiator to assign a priority to the transaction. Reserved attribute field 314 is left reserved for future or vendor-defined usage. Possible usage models using priority or security attributes may be implemented using the reserved attribute field.

In this example, ordering attribute field 316 is used to supply optional information conveying the type of ordering that may modify default ordering rules. According to one example implementation, an ordering attribute of "0" denotes that default ordering rules apply, whereas an ordering attribute of "1" denotes relaxed ordering, in which writes can pass writes in the same direction and read completions can pass writes in the same direction. Snoop attribute field 318 is utilized to determine whether transactions are snooped. As shown, channel ID field 306 identifies the channel that a transaction is associated with.
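
The descriptor layout described in the preceding paragraphs can be modelled as nested records. The sketch below is illustrative only: the class and function names are hypothetical, no bit widths are assumed because the text gives none, and `may_pass_write` covers only the two relaxed-ordering cases that the text names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalId:
    local_txn_id: int  # local transaction identifier field 308
    source_id: int     # source identifier field 310


@dataclass(frozen=True)
class Attributes:
    priority: int = 0  # priority field 312
    reserved: int = 0  # reserved field 314
    ordering: int = 0  # ordering field 316: 0 = default rules, 1 = relaxed
    no_snoop: int = 0  # no-snoop field 318


@dataclass(frozen=True)
class TransactionDescriptor:
    global_id: GlobalId     # global identifier field 302
    attributes: Attributes  # attributes field 304
    channel_id: int         # channel identifier field 306


def may_pass_write(desc: TransactionDescriptor, kind: str) -> bool:
    """True if a 'write' or 'read_completion' may pass an earlier write in the
    same direction; in this sketch that happens only under relaxed ordering."""
    return desc.attributes.ordering == 1 and kind in ("write", "read_completion")


def ids_unique(outstanding) -> bool:
    """Check that no two outstanding requests share a global identifier."""
    ids = [d.global_id for d in outstanding]
    return len(ids) == len(set(ids))
```

Keeping the local transaction ID and the source ID together in one hashable record reflects the statement that the pair, not either field alone, identifies a transaction within the hierarchy domain.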

Link Layer
Link layer 210, also referred to as data link layer 210, acts as an intermediate stage between transaction layer 205 and physical layer 220. In one embodiment, a responsibility of data link layer 210 is to provide a reliable mechanism for exchanging transaction layer packets (TLPs) between the two components on a link. One side of data link layer 210 accepts TLPs assembled by transaction layer 205, applies packet sequence identifier 211, i.e., an identification number or packet number, calculates and applies an error detection code, i.e., CRC 212, and submits the modified TLPs to physical layer 220 for transmission across a physical medium to an external device.
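
The wrap-and-check step performed by the link layer can be sketched as follows. This is a minimal illustration, not the PCIe framing: the 12-bit sequence width, the 2-byte header layout, and the use of `zlib.crc32` as the error detection code are all assumptions chosen for the example, and the function names `link_wrap` and `link_unwrap` are hypothetical.

```python
import struct
import zlib

SEQ_MODULUS = 1 << 12  # assumed width of the packet sequence identifier


def link_wrap(tlp: bytes, seq: int) -> bytes:
    """Prepend a sequence number and append a CRC over header plus TLP."""
    header = struct.pack(">H", seq % SEQ_MODULUS)
    crc = zlib.crc32(header + tlp)  # stand-in error detection code
    return header + tlp + struct.pack(">I", crc)


def link_unwrap(frame: bytes):
    """Verify the CRC and return (sequence number, TLP bytes)."""
    if len(frame) < 6:
        raise ValueError("frame too short")
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    (seq,) = struct.unpack(">H", body[:2])
    return seq, body[2:]
```

A receiver that gets `ValueError` from `link_unwrap` would, in a real link layer, trigger some form of retry; that mechanism is outside what the paragraph above describes and is not modelled here.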

Physical Layer
In one embodiment, physical layer 220 includes logical sub-block 221 and electrical sub-block 222 to physically transmit a packet to an external device. Here, logical sub-block 221 is responsible for the "digital" functions of physical layer 220. In this regard, the logical sub-block includes a transmit section to prepare outgoing information for transmission by electrical sub-block 222, and a receiver section to identify and prepare received information before passing it to link layer 210.

Physical block 222 includes a transmitter and a receiver. The transmitter is supplied with symbols by logical sub-block 221, which the transmitter serializes and transmits to an external device. The receiver is supplied with serialized symbols from an external device and transforms the received signals into a bitstream. The bitstream is deserialized and supplied to logical sub-block 221. In one embodiment, an 8b/10b transmission code is employed, in which 10-bit symbols are transmitted and received. Here, special symbols are used to frame a packet with frames 223. In addition, in one example, the receiver also provides a symbol clock recovered from the incoming serial stream.
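
The serialize and deserialize steps for 10-bit symbols can be illustrated with a short sketch. It does not implement the 8b/10b code itself; it only shows symbols being flattened into a bitstream and rebuilt. The function names and the least-significant-bit-first order are assumptions made for the example.

```python
SYMBOL_BITS = 10  # symbol width when an 8b/10b transmission code is used


def serialize(symbols):
    """Flatten 10-bit symbols into a list of bits (assumed LSB-first order)."""
    bits = []
    for symbol in symbols:
        if not 0 <= symbol < (1 << SYMBOL_BITS):
            raise ValueError("symbol does not fit in 10 bits")
        for i in range(SYMBOL_BITS):
            bits.append((symbol >> i) & 1)
    return bits


def deserialize(bits):
    """Rebuild 10-bit symbols from a bitstream produced by serialize()."""
    if len(bits) % SYMBOL_BITS:
        raise ValueError("bitstream is not a whole number of symbols")
    symbols = []
    for start in range(0, len(bits), SYMBOL_BITS):
        value = 0
        for i, bit in enumerate(bits[start:start + SYMBOL_BITS]):
            value |= bit << i
        symbols.append(value)
    return symbols
```

In hardware the receiver must also find the symbol boundaries and recover the clock from the stream; the sketch sidesteps both by assuming the bitstream starts on a symbol boundary.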

As stated above, although the transaction layer 205, link layer 210, and physical layer 220 are discussed with reference to a specific embodiment of a PCIe protocol stack, a layered protocol stack is not so limited. In fact, any layered protocol may be included or implemented. As an example, a port or interface that is represented as a layered protocol includes: (1) a first layer to assemble packets, i.e., a transaction layer; (2) a second layer to sequence packets, i.e., a link layer; and (3) a third layer to transmit the packets, i.e., a physical layer. As a specific example, a common standard interface (CSI) layered protocol is utilized.

Referring next to FIG. 4, one embodiment of a PCIe serial point-to-point fabric is illustrated. Although one embodiment of a PCIe serial point-to-point link is illustrated, a serial point-to-point link is not so limited, as it includes any transmission path for transmitting serial data. In the embodiment shown, a basic PCIe link includes two low-voltage, differentially driven signal pairs: a transmit pair 406/411 and a receive pair 412/407. Accordingly, device 405 includes transmission logic 406 to transmit data to device 410 and receiving logic 407 to receive data from device 410. In other words, two transmitting paths, i.e., paths 416 and 417, and two receiving paths, i.e., paths 418 and 419, are included in a PCIe link.

A transmission path refers to any path for transmitting data, such as a transmission line, a copper line, an optical line, a wireless communication channel, an infrared communication link, or another communication path. A connection between two devices, such as device 405 and device 410, is referred to as a link, such as link 415. A link may support one lane, with each lane representing a set of differential signal pairs (one pair for transmission, one pair for reception). To scale bandwidth, a link may aggregate multiple lanes, denoted by xN, where N is any supported link width, such as 1, 2, 4, 8, 12, 16, 32, 64, or wider.

A differential pair refers to two transmission paths, such as lines 416 and 417, used to transmit differential signals. As an example, when line 416 toggles from a low voltage level to a high voltage level, i.e., a rising edge, line 417 drives from a high logic level to a low logic level, i.e., a falling edge. Differential signals potentially demonstrate better electrical characteristics, such as better signal integrity (i.e., cross-coupling, voltage overshoot/undershoot, ringing, etc.). This allows for a better timing window, which enables faster transmission frequencies.
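One reason a differential pair tolerates noise is that the receiver looks only at the difference between the two lines, so a disturbance that couples equally onto both lines cancels. The following minimal sketch illustrates that idea. The voltage values and the function names `drive` and `receive` are hypothetical and are not taken from the patent.

```python
from typing import Tuple


def drive(bit: int, common: float = 0.6, swing: float = 0.4) -> Tuple[float, float]:
    """Return (line_p, line_n) voltages for one bit; the lines move in opposite directions."""
    half = swing / 2
    return (common + half, common - half) if bit else (common - half, common + half)


def receive(line_p: float, line_n: float) -> int:
    """Recover the bit from the sign of the voltage difference."""
    return 1 if line_p - line_n > 0 else 0


def send_through_noise(bit: int, common_mode_noise: float) -> int:
    """Add the same noise voltage to both lines, then decode."""
    p, n = drive(bit)
    return receive(p + common_mode_noise, n + common_mode_noise)
```

Noise that differs between the two lines does not cancel, so the sketch covers only the common-mode case.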

FIG. 5 is a simplified block diagram 500 illustrating an example multi-chip package 505 that includes two or more chips, or dies (e.g., 510, 515), communicatively connected using an example multi-chip package link (MCPL) 520. While FIG. 5 illustrates an example of two (or more) dies interconnected using an example MCPL 520, it should be appreciated that the principles and features described herein regarding implementations of an MCPL can be applied to any interconnect or link connecting a die (e.g., 510) and other components. This includes connecting two or more dies (e.g., 510, 515), connecting a die (or chip) to another component that is not on the die, connecting a die to another device or die that is not on the package (e.g., 505), connecting a die to a BGA package, and implementations of a Patch on Interposer (POINT), among other potential examples.

Generally, a multi-chip package (e.g., 505) can be an electronic package in which multiple integrated circuits (ICs), semiconductor dies, or other discrete components (e.g., 510, 515) are packaged onto a unifying substrate (e.g., silicon or another semiconductor substrate), facilitating the use of the combined components as a single component (e.g., as though it were a larger IC). In some cases, the larger components (e.g., dies 510, 515) can themselves be IC systems, such as a system on chip (SoC), a multiprocessor chip, or another component that includes multiple components (e.g., 525-530 and 540-545) on the device, for instance on a single die (e.g., 510, 515). The multi-chip package 505 can provide flexibility for building complex and varied systems from potentially multiple discrete components and systems. For instance, each of the dies 510, 515 may be manufactured or otherwise provided by two different entities, with the silicon substrate of the package 505 provided by yet a third entity, among many other examples. Further, dies and other components within the multi-chip package 505 can themselves include interconnects or other communication fabrics (e.g., 535, 550) that provide the infrastructure for communication between components (e.g., 525-530 and 540-545) within the device (e.g., 510, 515, respectively). The various components and interconnects (e.g., 535, 550) may potentially support or use multiple different protocols. Further, communication between dies (e.g., 510, 515) can potentially include transactions between the various components on the dies over multiple different protocols. Designing mechanisms to provide communication between chips (or dies) on a multi-chip package can be challenging, with traditional solutions employing highly specialized, expensive, and package-specific approaches based on the specific combination of components (and desired transactions) sought to be interconnected.

The examples, systems, algorithms, apparatus, logic, and features described within this specification can address at least some of the issues identified above, including potentially many others not explicitly mentioned herein. For instance, in some implementations, a high bandwidth, low power, low latency interface can be provided to connect a host device (e.g., a CPU) or other device to a companion chip that sits in the same package as the host. Such a multi-chip package link (MCPL) can support multiple package options, multiple I/O protocols, and reliability, availability, and serviceability (RAS) features. Further, the physical layer (PHY) can include an electrical layer and a logic layer and can support longer channel lengths, including channel lengths up to, and in some cases exceeding, approximately 45 mm. In some implementations, an example MCPL can operate at high data rates, including data rates exceeding 8-10 Gb/s.

In one example implementation of an MCPL, the PHY electrical layer can improve upon traditional multi-channel interconnect solutions (e.g., multi-channel DRAM I/O), extending the data rate and channel configuration through a number of features, including, for example, regulated mid-rail termination, low power active crosstalk cancellation, circuit redundancy, per-bit duty cycle correction and deskew, line coding, and transmitter equalization, among other potential examples.

In one example implementation of an MCPL, a PHY logical layer can be implemented that can further assist (e.g., alongside the electrical layer features) in extending the data rate and channel configuration, while also enabling the interconnect to route multiple protocols across the electrical layer. Such implementations can provide and define a modular common physical layer that is protocol agnostic and architected to work with potentially any existing or future interconnect protocol.

Turning to FIG. 6, a simplified block diagram 600 is shown representing at least a portion of a system that includes an example implementation of a multi-chip package link (MCPL). An MCPL can be implemented using physical electrical connections (e.g., wires implemented as lanes) connecting a first device 605 (e.g., a first die including one or more sub-components) with a second device 610 (e.g., a second die including one or more other sub-components). In the particular example shown in the high-level representation of diagram 600, all signals (in channels 615, 620) can be unidirectional, and lanes can be provided so that the data signals have both an upstream and a downstream data transfer. The block diagram 600 of FIG. 6 refers to the first component 605 as the upstream component and the second component 610 as the downstream component, to the physical lanes of the MCPL used in sending data as a downstream channel 615, and to the lanes used for receiving data (from component 610) as an upstream channel 620. It should nonetheless be appreciated that the MCPL between devices 605, 610 can be used by each device to both send and receive data between the devices.

In one example implementation, an MCPL can provide a physical layer (PHY) that includes the electrical MCPL PHY 625a, 625b (collectively 625) and executable logic implementing the MCPL logical PHY 630a, 630b (collectively 630). The electrical, or physical, PHY 625 can provide the physical connection over which data is communicated between devices 605, 610. Signal conditioning components and logic can be implemented in connection with the physical PHY 625 to establish the high data rate and channel configuration capabilities of the link, which in some applications can involve tightly clustered physical connections with lengths of approximately 45 mm or more. The logical PHY 630 can include logic for facilitating clocking, link state management (e.g., for link layers 635a, 635b), and protocol multiplexing between the potentially multiple, different protocols used for communication over the MCPL.

In one example implementation, the physical PHY 625 can include, for each channel (e.g., 615, 620), a set of data lanes over which in-band data can be sent. In this particular example, 50 data lanes are provided in each of the downstream channel 615 and the upstream channel 620, although any other number of lanes can be used as permitted by layout and power constraints, desired applications, device constraints, and so on. Each channel can further include one or more dedicated lanes for a strobe, or clock, signal for the channel, one or more dedicated lanes for a valid signal for the channel, one or more dedicated lanes for a stream signal, and one or more dedicated lanes for link state machine management or sideband signals. The physical PHY can further include a sideband link 640, which, in some examples, can be a bidirectional, lower-frequency control signal link used to coordinate state transitions and other attributes of the MCPL connecting devices 605, 610, among other examples.
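To make the per-channel lane inventory concrete, the sketch below records one possible channel layout as a small data structure. The text only requires "one or more" lanes of each dedicated type, so the default counts here (two strobe lanes, two valid lanes, one stream lane, one LSM_SB lane) are assumptions loosely modeled on the FIG. 7 example, and the class name `ChannelLayout` and the generated lane names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChannelLayout:
    """Hypothetical lane inventory for one unidirectional MCPL channel."""
    data_lanes: int = 50     # from the example in the text
    strobe_lanes: int = 2    # assumption: one per 25-lane cluster
    valid_lanes: int = 2     # assumption: one per 25-lane cluster
    stream_lanes: int = 1    # assumption
    lsm_sb_lanes: int = 1    # assumption

    def lane_names(self) -> List[str]:
        """List every lane in the channel by an illustrative name."""
        groups = (
            ("DATA", self.data_lanes),
            ("STRB", self.strobe_lanes),
            ("VALID", self.valid_lanes),
            ("STREAM", self.stream_lanes),
            ("LSM_SB", self.lsm_sb_lanes),
        )
        return [f"{prefix}[{i}]" for prefix, count in groups for i in range(count)]

    def total_lanes(self) -> int:
        return len(self.lane_names())
```

With these assumed counts, 6 of the 56 lanes in a channel carry control or clock information rather than payload data.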

As noted above, multiple protocols can be supported using an implementation of MCPL. Indeed, multiple, independent transaction layers 650a, 650b can be provided at each device 605, 610. For instance, each device 605, 610 may support and utilize two or more protocols, such as PCI, PCIe, QPI, and Intel In-Die Interconnect (IDI), among others. IDI is a coherent protocol used on-die to communicate between cores, last level caches (LLCs), memory, graphics, and I/O controllers. Other protocols can also be supported, including the Ethernet protocol, InfiniBand protocols, and other PCIe fabric based protocols. The combination of the logical PHY and physical PHY can also be used as a die-to-die interconnect to connect a SerDes PHY (PCIe, Ethernet, InfiniBand, or other high speed SerDes) on one die to its upper layers implemented on the other die, among other examples.

The logical PHY 630 can also support multiplexing between these multiple protocols on an MCPL. For instance, a dedicated stream lane can be used to assert an encoded stream signal that identifies which protocol is to apply to data sent substantially concurrently on the data lanes of the channel. Further, the logical PHY 630 can be used to negotiate the various types of link state transitions that the various protocols may support or request. In some cases, LSM_SB signals sent over the channel's dedicated LSM_SB lane can be used, together with the sideband link 640, to communicate and negotiate link state transitions between the devices 605, 610. Further, link training, error detection, skew detection, de-skewing, and other functionality of traditional interconnects can be replaced or governed, in part, using the logical PHY 630. For instance, valid signals sent over one or more dedicated valid signal lanes in each channel can be used to signal link activity, detect skew and link errors, and realize other features, among other examples. In the particular example of FIG. 6, multiple valid lanes are provided per channel. For instance, data lanes within a channel can be bundled or clustered (physically and/or logically), and a valid lane can be provided for each cluster. Further, in some cases, multiple strobe lanes can also be provided, so that a dedicated strobe signal is supplied for each of the plurality of data lane clusters in a channel, among other examples.

As noted above, the logical PHY 630 can be used to negotiate and manage link control signals sent between devices connected by the MCPL. In some implementations, the logical PHY 630 can include link layer packet (LLP) generation logic 660 that can be used to send link layer control messages over the MCPL (i.e., in band). Such messages can be sent over the data lanes of the channel, with the stream lane identifying that the data is link-layer-to-link-layer messaging, such as link layer control data, among other examples. Link layer messages enabled using the LLP module 660 can assist in the negotiation and performance of link layer state transitioning, power management, loopback, disable, re-centering, and scrambling, among other link layer features, between the link layer 635a of device 605 and the link layer 635b of device 610.

Turning to FIG. 7, a diagram 700 is shown representing example signaling using a set of lanes (e.g., 615, 620) in a particular channel of an example MCPL. In the example of FIG. 7, two clusters of 25 data lanes are provided, for a total of 50 data lanes in the channel. A portion of the lanes is shown, while others (e.g., DATA[4-46] and a second strobe signal lane (STRB)) are omitted (e.g., as redundant signals) for convenience in illustrating the particular example. When the physical layer is in an active state (e.g., not powered off or in a low power mode (e.g., an L1 state)), the strobe lanes (STRB) can be provided with a synchronous clock signal. In some implementations, data can be sent on both the rising and falling edges of the strobe. Each edge (or half clock cycle) can demarcate a unit interval (UI). Accordingly, in this example, a bit (e.g., 705) can be sent on each lane in each UI, allowing a byte to be sent on each lane every 8 UI. A byte time period 710 can be defined as 8 UI, or the time taken to send a byte on a single one of the data lanes (e.g., DATA[0-49]).
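The relationship between strobe frequency, unit interval, and byte period can be worked through numerically. The sketch below assumes a hypothetical 4 GHz strobe, chosen only because double-edge signaling then gives 8 Gb/s per lane, in line with the data rates mentioned earlier. The function names are illustrative and do not come from the patent.

```python
def unit_interval_ps(strobe_hz: float) -> float:
    """One UI is half a strobe cycle, because data is sent on both edges."""
    return 1e12 / (2 * strobe_hz)


def byte_period_ps(strobe_hz: float) -> float:
    """A byte period is 8 UI: the time to send one byte on a single lane."""
    return 8 * unit_interval_ps(strobe_hz)


def lane_bits_per_second(strobe_hz: float) -> float:
    """Two bits per strobe cycle on each lane."""
    return 2 * strobe_hz


def channel_bytes_per_second(strobe_hz: float, data_lanes: int = 50) -> float:
    """Raw payload capacity of one channel, ignoring idle windows."""
    return data_lanes * lane_bits_per_second(strobe_hz) / 8


STROBE_HZ = 4e9  # hypothetical strobe frequency
```

Under that assumption a UI is 125 ps, a byte period is 1 ns, and the 50 data lanes move 50 bytes per byte period, or 50 GB/s of raw capacity per channel.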

In some implementations, a valid signal, sent on one or more dedicated valid signal lanes (e.g., VALID0, VALID1), can serve as a leading indicator for the receiving device. When asserted (high), it identifies to the receiving device, or sink, that data is being sent from the sending device, or source, on the data lanes (e.g., DATA[0-49]) during the following time period, such as a byte time period 710. Alternatively, when the valid signal is low, the source indicates to the sink that the source will not be sending data on the data lanes during the following time period. Accordingly, when the sink logical PHY detects that the valid signal is not asserted (e.g., on lanes VALID0 and VALID1), the sink can disregard any data detected on the data lanes (e.g., DATA[0-49]) during the following time period. For instance, crosstalk noise or other bits may appear on one or more of the data lanes when the source is, in fact, not sending any data. By virtue of a low, or non-asserted, valid signal during the previous time period (e.g., the previous byte time period), the sink can determine that the data lanes are to be disregarded during the following time period.

Data sent on each of the lanes of the MCPL can be strictly aligned to the strobe signal. A time period, such as a byte time period, can be defined based on the strobe, and each of these periods can correspond to a defined window in which signals are to be sent on the data lanes (e.g., DATA[0-49]), the valid lanes (e.g., VALID1, VALID2), and the stream lane (e.g., STREAM). Accordingly, the alignment of these signals makes it possible to identify that a valid signal in a previous time period window applies to data in the following time period window, and that a stream signal applies to data in the same time period window. The stream signal can be an encoded signal (e.g., 1 byte of data per byte time period window) that is encoded to identify the protocol that applies to the data sent during the same time period window.

To illustrate, in the particular example of FIG. 7, a byte time period window is defined. A valid signal is asserted at time period window n (715) before any data is injected on the data lanes DATA[0-49]. At the following time period window n+1 (720), data is sent on at least some of the data lanes. In this case, data is sent on all fifty data lanes during n+1 (720). Because the valid signal was asserted for the duration of the preceding time period window n (715), the sink device can validate the data received on data lanes DATA[0-49] during time period window n+1 (720). Additionally, the leading nature of the valid signal during time period window n (715) allows the receiving device to prepare for the incoming data. Continuing with the example of FIG. 7, the valid signal remains asserted (on VALID1 and VALID2) for the duration of time period window n+1 (720), causing the sink device to expect data to be sent over data lanes DATA[0-49] during time period window n+2 (725). If the valid signal were to remain asserted during time period window n+2 (725), the sink device could further expect to receive (and process) additional data sent during the immediately following time period window n+3 (730). In the example of FIG. 7, however, the valid signal is de-asserted during the duration of time period window n+2 (725), indicating to the sink device that no data will be sent during time period window n+3 (730) and that any bits detected on data lanes DATA[0-49] should be disregarded during time period window n+3 (730).
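The leading-indicator rule walked through above (a valid signal in one window qualifies the data in the next window) can be sketched as a small sink-side filter. The function name `accepted_windows` and the list-based representation of windows are hypothetical, and the sketch assumes the valid signal was de-asserted before the first window in the list.

```python
from typing import List, Optional, Sequence


def accepted_windows(valid: Sequence[bool],
                     lanes: Sequence[Optional[str]]) -> List[Optional[str]]:
    """Return, per window, the data the sink accepts (None means disregarded).

    valid[k] is True if the valid signal is asserted during window k.
    lanes[k] is whatever the sink observes on the data lanes in window k.
    Data in window k is accepted only if valid was asserted in window k-1.
    """
    accepted: List[Optional[str]] = []
    previous_valid = False  # assumption: link was quiet before the first window
    for is_valid, observed in zip(valid, lanes):
        accepted.append(observed if previous_valid else None)
        previous_valid = is_valid
    return accepted


# Windows n, n+1, n+2, n+3 of the FIG. 7 walk-through.
FIG7_VALID = [True, True, False, False]
FIG7_LANES = [None, "data-A", "data-B", "crosstalk"]
```

Applied to the FIG. 7 sequence, the filter keeps the data in windows n+1 and n+2 and drops the stray bits observed in window n+3.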

As noted above, multiple valid lanes and strobe lanes can be maintained per channel. Among other advantages, this can help maintain circuit simplicity and synchronization among the clusters of relatively lengthy physical lanes connecting the two devices. In some implementations, a set of data lanes can be divided into clusters of data lanes. For instance, in the example of FIG. 7, data lanes DATA[0-49] can be divided into two clusters of twenty-five lanes, and each cluster can have a dedicated valid lane and strobe lane. For instance, valid lane VALID1 can be associated with data lanes DATA[0-24], and valid lane VALID2 can be associated with data lanes DATA[25-49]. The signals on each "copy" of the valid and strobe lanes for each cluster can be identical.
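The lane-to-cluster association in the FIG. 7 example reduces to integer division. A minimal sketch follows; the function names are hypothetical, and only the VALID1/VALID2 naming and the 25-lane cluster size come from the text.

```python
LANES_PER_CLUSTER = 25
TOTAL_DATA_LANES = 50


def cluster_of(data_lane: int) -> int:
    """Return the zero-based cluster index of a data lane."""
    if not 0 <= data_lane < TOTAL_DATA_LANES:
        raise ValueError(f"no such data lane: {data_lane}")
    return data_lane // LANES_PER_CLUSTER


def valid_lane_for(data_lane: int) -> str:
    """Name the valid lane that qualifies a given data lane."""
    return f"VALID{cluster_of(data_lane) + 1}"
```

Because the valid signal is identical on every copy, a sink could use whichever valid lane is physically closest to a cluster without changing the protocol behavior.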

As introduced above, data on the stream lane STREAM can be used to indicate to the receiving logical PHY which protocol is to apply to the corresponding data being sent on data lanes DATA[0-49]. In the example of FIG. 7, a stream signal is sent on STREAM during the same time period window as the data on data lanes DATA[0-49] to indicate the protocol of the data on the data lanes. In alternative implementations, the stream signal can be sent during a preceding time period window, such as with the corresponding valid signal, among other potential modifications. Continuing with the example of FIG. 7, however, a stream signal 735 is sent during time period window n+1 (720) that is encoded to indicate the protocol (e.g., PCIe, PCI, IDI, QPI, etc.) that is to apply to the bits sent over data lanes DATA[0-49] during time period window n+1 (720). Similarly, another stream signal 740 can be sent during the subsequent time period window n+2 (725) to indicate the protocol that applies to the bits sent over data lanes DATA[0-49] during time period window n+2 (725), and so on. In some cases, such as the example of FIG. 7 (where both stream signals 735, 740 have the same encoding, binary FF), data in sequential time period windows (e.g., n+1 (720) and n+2 (725)) can belong to the same protocol. In other cases, however, data in sequential time period windows (e.g., n+1 (720) and n+2 (725)) can be from different transactions to which different protocols apply, and the stream signals (e.g., 735, 740) can be encoded accordingly to identify the different protocols applying to sequential bytes of data on the data lanes (e.g., DATA[0-49]), among other examples.
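On the receive side, the stream signal lets the logical PHY steer each window's data to the matching protocol stack. The sketch below shows that demultiplexing step. The code-to-protocol table is entirely hypothetical: the text shows an encoding of FF in FIG. 7 but does not say which protocol it denotes, so the mapping, the name `STREAM_CODES`, and the function `demux` are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical encodings; the patent text does not define this table.
STREAM_CODES: Dict[int, str] = {
    0xFF: "PCIe",
    0xF0: "IDI",
    0x0F: "LLP",  # link-layer-to-link-layer messaging
}


def demux(windows: Iterable[Tuple[int, bytes]]) -> Dict[str, List[bytes]]:
    """Group per-window payloads by the protocol named on the stream lane.

    Each item is (stream_code, payload) for one time period window, where
    the stream code applies to data sent in that same window.
    """
    by_protocol: Dict[str, List[bytes]] = {}
    for code, payload in windows:
        protocol = STREAM_CODES.get(code)
        if protocol is None:
            raise ValueError(f"unknown stream code: {code:#04x}")
        by_protocol.setdefault(protocol, []).append(payload)
    return by_protocol
```

Consecutive windows carrying the same code end up in the same queue in arrival order, which mirrors the case in FIG. 7 where signals 735 and 740 share one encoding.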

In some implementations, a low power, or idle, state can be defined for the MCPL. For instance, when neither device on the MCPL is sending data, the physical layer (electrical and logical) of the MCPL can go to an idle or low power state. For instance, in the example of FIG. 7, at time period window n-2 (745) the MCPL is in a quiet, or idle, state and the strobe is disabled to save power. The MCPL can transition out of the low power or idle mode, waking the strobe at time period window n-1 (e.g., 705). The strobe can complete a transmission preamble (e.g., to assist in waking and synchronizing each of the lanes of the channel, as well as the sink device), beginning the strobe signal before any other signaling on the other, non-strobe lanes. Following this time period window n-1 (705), the valid signal can be asserted at time period window n (715) to notify the sink that data is forthcoming in the following time period window n+1 (720), as discussed above.

The MCPL may re-enter a low power or idle state (e.g., an L1 state) following the detection of idle conditions on the valid lane, data lanes, and/or other lanes of the MCPL channel. For instance, no signaling may be detected beginning at time period window n+3 (730) and going forward. Logic on either the source or sink device can initiate a transition back into a low power state, leading again (e.g., at time period window n+5 (755)) to the strobe going idle in a power savings mode, among other examples and principles (including those discussed later herein).

Electrical characteristics of the physical PHY can include one or more of single-ended signaling, half-rate forwarded clocking, matching of the interconnect channel with the on-chip transport delay of the transmitter (source) and receiver (sink), optimized electrostatic discharge (ESD) protection, and pad capacitance, among other features. Further, an MCPL can be implemented to achieve a higher data rate (e.g., approaching 16 Gb/s) and better energy efficiency characteristics than traditional package I/O solutions.

FIG. 8 shows a portion of a simplified block diagram 800 representing a portion of an example MCPL. The diagram 800 of FIG. 8 includes a representation of an example lane 805 (e.g., a data lane, valid lane, or stream lane) and clock generation logic 810. As shown in the example of FIG. 8, in some implementations, clock generation logic 810 can be implemented as a clock tree that distributes a generated clock signal to each block implementing each lane of the example MCPL, such as data lane 805. Further, a clock recovery circuit 815 can be provided. In some implementations, rather than providing a separate clock recovery circuit for each lane to which the clock signal is distributed, as is customary in at least some traditional interconnect I/O architectures, a single clock recovery circuit can be provided for a cluster of multiple lanes. Indeed, as applied to the example configurations of FIGS. 6 and 7, a separate strobe lane and accompanying clock recovery circuit can be provided for each cluster of twenty-five data lanes.

Continuing with the example of FIG. 8, in some implementations, at least the data lanes, stream lanes, and valid lanes can be mid-rail terminated to a regulated voltage greater than zero (ground). In some implementations, the mid-rail voltage can be regulated to Vcc/2. In some implementations, a single voltage regulator 825 can be provided per cluster of lanes. For instance, when applied to the examples of FIGS. 6 and 7, a first voltage regulator can be provided for a first cluster of twenty-five data lanes and a second voltage regulator can be provided for the remaining cluster of twenty-five data lanes, among other potential examples. In some instances, an example voltage regulator 825 can be implemented as a linear regulator or a switched capacitor circuit, among other examples. In some implementations, the linear regulator can be provided with an analog feedback loop or a digital feedback loop, among other examples.

In some implementations, crosstalk cancellation circuitry can also be provided for an example MCPL. In some instances, the compact nature of the long MCPL wires can introduce crosstalk interference between lanes. Crosstalk cancellation logic can be implemented to address these and other issues. For instance, in one example illustrated in FIGS. 9-10, crosstalk can be reduced significantly with an example low power active circuit, such as those illustrated in diagrams 900 and 1000. In the example of FIG. 9, a weighted, high-pass filtered "aggressor" signal can be added to the "victim" signal (i.e., the signal suffering crosstalk interference from the aggressor). Each signal can be considered a victim of crosstalk from every other signal on the link and can itself be an aggressor to the other signals insofar as it is a source of crosstalk interference. Owing to the derivative nature of crosstalk on the link, such a signal can be generated and can reduce crosstalk on the victim lane by more than 50%. In the example of FIG. 9, the high-pass filtered aggressor signal can be generated through a high-pass RC filter (e.g., implemented through C and R1), which produces the filtered signal that is added using a summing circuit 905 (e.g., an RX sense amplifier).
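
The idea of adding a weighted, high-pass filtered aggressor signal to the victim can be sketched in discrete time. The following is a toy numerical model only, not the circuit of FIG. 9: the coupling model (crosstalk proportional to the sample-to-sample change of the aggressor), the first-order filter recurrence, and the constants `K` and `ALPHA` are all assumptions chosen for illustration.

```python
def high_pass(x, alpha):
    """First-order discrete RC high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    y = [0.0] * len(x)
    for n in range(1, len(x)):
        y[n] = alpha * (y[n - 1] + x[n] - x[n - 1])
    return y

def coupled_crosstalk(aggressor, k):
    """Assumed coupling model: crosstalk follows the aggressor's derivative."""
    return [0.0] + [k * (aggressor[n] - aggressor[n - 1])
                    for n in range(1, len(aggressor))]

def cancel(victim_rx, aggressor, weight, alpha):
    """Add the weighted, high-pass filtered aggressor to the received victim."""
    filtered = high_pass(aggressor, alpha)
    return [v + weight * f for v, f in zip(victim_rx, filtered)]

K, ALPHA = 0.1, 0.2                      # hypothetical coupling and filter constants
bits = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1]
aggressor = [1.0 if b else -1.0 for b in bits]
victim = [0.0] * len(bits)               # quiet victim, so any signal is crosstalk
received = [v + c for v, c in zip(victim, coupled_crosstalk(aggressor, K))]
corrected = cancel(received, aggressor, weight=-K / ALPHA, alpha=ALPHA)

peak_before = max(abs(s) for s in received)
peak_after = max(abs(s) for s in corrected)
```

Because the high-pass filter output approximates the derivative of the aggressor, a suitably weighted copy of it tracks the assumed crosstalk, and the residual on the victim shrinks. How much it shrinks in this sketch depends entirely on the assumed model and constants.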

An implementation similar to that described in the example of FIG. 9 can be a particularly convenient solution for an application such as MCPL, because the circuit can be implemented with relatively low overhead, as illustrated in FIG. 10, which shows an example transistor-level schematic of the circuit shown and described in the example of FIG. 9. It should be appreciated that the representations in FIGS. 9 and 10 are simplified and that an actual implementation would include multiple copies of the circuits illustrated in FIGS. 9 and 10 to accommodate the network of crosstalk interference among the lanes of a link. As an example, in a three-lane link (e.g., lanes 0-2), circuitry similar to that described in the examples of FIGS. 9 and 10 could be provided from lane 0 to lane 1, from lane 0 to lane 2, from lane 1 to lane 0, from lane 1 to lane 2, from lane 2 to lane 0, from lane 2 to lane 1, and so on, based on the geometry and layout of the lanes, among other examples.

Additional features can be implemented at the physical PHY level of an example MCPL. For instance, receiver offset can in some instances introduce significant error and limit I/O voltage margin. Circuit redundancy can be used to improve receiver sensitivity. In some implementations, circuit redundancy can be optimized to address the standard deviation offset of the data samplers used in the MCPL. For instance, an example data sampler can be provided that is designed to a three (3) standard deviation offset specification. In the examples of FIGS. 6 and 7, if two (2) data samplers were used for each receiver (e.g., for each lane), one hundred (100) samplers would be used for a fifty (50) lane MCPL. In this example, the probability that one of the receiver (RX) lanes fails the three standard deviation offset specification is 24%. A chip reference voltage generator can be provided to set the offset upper bound and to move to the next data sampler on the receiver if another one of the data samplers is found to exceed the bound. If, however, four (4) data samplers are used per receiver (i.e., instead of two in this example), the receiver fails only if three of the four samplers fail. For a fifty-lane MCPL, as in the examples of FIGS. 6 and 7, this additional circuit redundancy can dramatically reduce the failure rate from 24% to less than 0.01%.
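
The 24% and 0.01% figures above can be reproduced approximately with a short calculation. This is a sketch under stated assumptions that the text does not spell out: sampler offsets are taken to be independent and Gaussian, a sampler "fails" when its offset falls outside plus or minus three standard deviations, and the two-sampler case fails if any one of the 100 samplers fails.

```python
import math

def two_sided_tail(sigmas: float) -> float:
    """Probability that a Gaussian variable lies outside +/- `sigmas` std devs."""
    return math.erfc(sigmas / math.sqrt(2.0))

def p_any_fail(p_unit: float, units: int) -> float:
    """Probability that at least one of `units` independent units fails."""
    return 1.0 - (1.0 - p_unit) ** units

def p_at_least(p: float, n: int, k: int) -> float:
    """Probability that at least k of n independent samplers fail."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))

P_SAMPLER = two_sided_tail(3.0)             # roughly 0.0027 per sampler

# Two samplers per lane, 50 lanes: any of the 100 samplers out of spec.
p_two_samplers = p_any_fail(P_SAMPLER, 100)

# Four samplers per lane: a lane fails only if at least 3 of its 4 fail.
p_lane_four = p_at_least(P_SAMPLER, 4, 3)
p_four_samplers = p_any_fail(p_lane_four, 50)
```

Under these assumptions `p_two_samplers` comes out near 0.24, and `p_four_samplers` is several orders of magnitude smaller, comfortably below 0.0001 (0.01%).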

In still other examples, at very high data rates, per-bit duty cycle correction (DCC) and deskew can be used to augment baseline per-cluster DCC and deskew to improve link margin. Instead of correcting in all cases, as in traditional solutions, some implementations can utilize a low power digital implementation that senses and corrects the outliers where an I/O lane would fail. For instance, a global tuning of the lanes can be performed to identify problem lanes within a cluster. These problem lanes can then be targeted for per-lane tuning to achieve the high data rates supported by the MCPL.

Additional features can also be optionally implemented in some examples of an MCPL to enhance the performance characteristics of the physical link. For instance, line coding can be provided. While mid-rail terminations, such as described above, can allow DC data bus inversion (DBI) to be omitted, AC DBI can still be used to reduce dynamic power. More complicated coding can also be used to eliminate the worst case difference of 1's and 0's, for instance to reduce the drive requirement of the mid-rail regulator as well as to limit I/O switching noise, among other example benefits. Further, transmitter equalization can optionally be implemented. For instance, at very high data rates, insertion loss can be significant for an in-package channel. A two-tap weight transmitter equalization (e.g., performed during an initial power-up sequence) can in some cases be sufficient to mitigate some of these issues, among others.

Turning to FIG. 11, a simplified block diagram 1100 is shown illustrating an example logical PHY of an example MCPL. A physical PHY 1105 can connect to a die that includes a logical PHY 1110 and additional logic supporting a link layer of the MCPL. The die, in this example, can further include logic to support multiple different protocols on the MCPL. For instance, in the example of FIG. 11, PCIe logic 1115 can be provided along with IDI logic 1120, such that the dies can communicate using either PCIe or IDI over the same MCPL connecting the two dies, among potentially many other examples, including examples where more than two protocols, or protocols other than PCIe and IDI, are supported over the MCPL. The various protocols supported between the dies can offer varying levels of service and features.

Logical PHY 1110 can include link state machine management logic 1125 for negotiating link state transitions in connection with requests from upper layer logic of the die (e.g., received over PCIe or IDI). Logical PHY 1110 can further include link testing and debug logic (e.g., 1130) in some implementations. As noted above, an example MCPL can support control signals that are sent between dies over the MCPL to facilitate the protocol-agnostic, high performance, and power-efficient features (among other example features) of the MCPL. For instance, logical PHY 1110 can support the generation and sending, as well as the receiving and processing, of valid signals, stream signals, and LSM sideband signals in connection with the sending and receiving of data over dedicated data lanes, as described in the examples above.

In some implementations, multiplexing (e.g., 1135) and demultiplexing (e.g., 1140) logic can be included in, or otherwise be accessible to, logical PHY 1110. For instance, multiplexing logic (e.g., 1135) can be used to identify data (e.g., embodied as packets, messages, etc.) that is to be sent out onto the MCPL. The multiplexing logic 1135 can identify the protocol governing the data and generate a stream signal that is encoded to identify the protocol. For instance, in one example implementation, the stream signal can be encoded as a byte of two hexadecimal symbols (e.g., IDI: FFh; PCIe: F0h; LLP: AAh; sideband: 55h; etc.) and can be sent during the same window (e.g., a byte time period window) as the data governed by the identified protocol. Similarly, demultiplexing logic 1140 can be employed to interpret incoming stream signals, decoding each stream signal to identify the protocol that applies to the data concurrently received with it on the data lanes. The demultiplexing logic 1140 can then apply (or ensure) protocol-specific link layer handling and cause the data to be handled by the corresponding protocol logic (e.g., PCIe logic 1115 or IDI logic 1120).
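
The multiplexing and demultiplexing behavior described above can be sketched as a pair of small functions. The stream byte values are the example encodings given in the text; the function names `mux` and `demux`, the `handlers` dictionary, and the use of an exception for an unrecognized stream byte are hypothetical choices made for this illustration.

```python
from typing import Callable, Dict, Tuple

# Example stream encodings from the description.
STREAM_CODES: Dict[str, int] = {"IDI": 0xFF, "PCIe": 0xF0, "LLP": 0xAA, "SIDEBAND": 0x55}
PROTOCOL_BY_CODE: Dict[int, str] = {code: name for name, code in STREAM_CODES.items()}

def mux(protocol: str, payload: bytes) -> Tuple[int, bytes]:
    """Transmit side: pair the payload with the stream byte naming its protocol."""
    return STREAM_CODES[protocol], payload

def demux(stream_byte: int, payload: bytes,
          handlers: Dict[str, Callable[[bytes], object]]) -> object:
    """Receive side: decode the stream byte and hand the payload to the
    matching protocol logic (hypothetical handler callables)."""
    protocol = PROTOCOL_BY_CODE.get(stream_byte)
    if protocol is None:
        raise ValueError(f"unrecognized stream encoding: {stream_byte:#04x}")
    return handlers[protocol](payload)

handlers = {
    "IDI": lambda p: ("idi-logic", p),
    "PCIe": lambda p: ("pcie-logic", p),
    "LLP": lambda p: ("llp-logic", p),
    "SIDEBAND": lambda p: ("sideband-logic", p),
}
stream_byte, payload = mux("PCIe", b"example TLP bytes")
routed = demux(stream_byte, payload, handlers)
```

Here `routed` is the result of the PCIe handler, since the stream byte F0h travels in the same window as the payload it describes.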

Logical PHY 1110 can further include link layer packet (LLP) logic 1150 that can be used to handle various link control functions, including power management tasks, loopback, disable, re-centering, scrambling, etc. LLP logic 1150 can facilitate link layer-to-link layer messages over the MCPL, among other functions. Data corresponding to LLP signaling can also be identified by a stream signal, sent on the dedicated stream signal lane, that is encoded to identify that the data lanes carry LLP data. Multiplexing and demultiplexing logic (e.g., 1135, 1140) can also be used to generate and interpret the stream signals corresponding to LLP traffic, as well as to cause such traffic to be handled by the appropriate die logic (e.g., LLP logic 1150). Likewise, some implementations of an MCPL can include a dedicated sideband (e.g., sideband 1155 and supporting logic), such as an asynchronous and/or lower frequency sideband channel, among other examples.

Logical PHY logic 1110 can further include link state machine management logic that can generate and receive (and use) link state management messaging over a dedicated LSM sideband lane. For instance, an LSM sideband lane can be used to perform handshaking to advance the link training state and to exit power management states (e.g., an L1 state), among other potential examples. The LSM sideband signal can be an asynchronous signal, in that it is not aligned with the data, valid, and stream signals of the link but instead corresponds to signaling state transitions and aligns the link state machines between the two dies or chips connected by the link, among other examples. Providing a dedicated LSM sideband lane can, in some examples, allow the traditional squelch and receive detect circuits of an analog front end (AFE) to be eliminated, among other example benefits.

Turning to FIG. 12, a simplified block diagram 1200 is shown illustrating another representation of logic used to implement an MCPL. For instance, logical PHY 1110 is provided with a defined logical PHY interface (LPIF) 1205 through which any one of a plurality of different protocols (e.g., PCIe, IDI, QPI, etc.) 1210, 1215, 1220, 1225 and signaling modes (e.g., sideband) can interface with the physical layer of an example MCPL. In some implementations, multiplexing and arbitration logic 1230 can also be provided as a layer separate from the logical PHY 1110. In one example, the LPIF 1205 can be provided as the interface on either side of this MuxArb layer 1230. The logical PHY 1110 can interface with the physical PHY (e.g., the analog front end (AFE) 1105 of the MCPL PHY) through another interface.

The LPIF can abstract the PHY (logical and electrical/analog) from the upper layers (e.g., 1210, 1215, 1220, 1225), such that a completely different PHY can be implemented under the LPIF in a manner transparent to the upper layers. This can assist in promoting modularity and design reuse, as the upper layers can stay intact when the underlying signaling technology PHY is updated, among other examples. Further, the LPIF can define a number of signals enabling multiplexing/demultiplexing, LSM management, error detection and handling, and other functionality of the logical PHY. For instance, Table 1 summarizes at least a portion of the signals that can be defined for an example LPIF.

As noted in Table 1, in some implementations, an alignment mechanism can be provided through an AlignReq/AlignAck handshake. For example, when the physical layer enters recovery, some protocols may lose packet framing. Alignment of the packets can be corrected, for instance, to guarantee correct framing identification by the link layer. Additionally, as shown in FIG. 13, the physical layer can assert a StallReq signal when it enters recovery, such that the link layer asserts a Stall signal when a new aligned packet is ready to be transferred. The physical layer logic can sample both Stall and Valid to determine whether the packet is aligned. For instance, the physical layer can continue to drive trdy to drain the link layer packets until Stall and Valid are sampled asserted, among other potential implementations, including alternative implementations that use Valid to assist in packet alignment.

Various fault tolerances can be defined for signals on the MCPL. For instance, fault tolerances can be defined for valid, stream, LSM sideband, low frequency sideband, link layer packet, and other types of signals. Fault tolerances for packets, messages, and other data sent over the dedicated data lanes of the MCPL can be based on the particular protocol governing the data. In some implementations, error detection and handling mechanisms can be provided, such as cyclic redundancy check (CRC) and retry buffers, among other potential examples. As examples, for PCIe packets sent over the MCPL, a 32-bit CRC can be utilized for PCIe transaction layer packets (TLPs), for which delivery is guaranteed (e.g., through a replay mechanism), and a 16-bit CRC can be utilized for PCIe link layer packets, which may be architected to be lossy (e.g., where replay is not applied). Further, for PCIe framing tokens, a particular Hamming distance (e.g., a Hamming distance of four (4)) can be defined for the token identifier, and parity and a 4-bit CRC can also be utilized, among other examples. For IDI packets, on the other hand, a 16-bit CRC can be utilized.

In some implementations, fault tolerances can be defined for link layer packets (LLPs), including requiring the valid signal to transition from low to high (i.e., 0 to 1) (e.g., to assist in assuring bit and symbol lock). Further, in one example, a particular number of consecutive, identical LLPs can be defined to be sent, and a response can be expected to each request, with the requester retrying after a response timeout, among other defined characteristics that can be used as the basis for determining faults in LLP data on the MCPL. In further examples, fault tolerance can be provided for the valid signal, for instance, by extending the valid signal across an entire time period window or symbol (e.g., by keeping the valid signal high for eight UIs). Additionally, errors or faults in stream signals can be prevented by maintaining a Hamming distance between the encoding values of the stream signal, among other examples.
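
The benefit of maintaining a Hamming distance between stream encodings can be checked numerically for the example encodings given earlier in the description (FFh, F0h, AAh, 55h). The helper names below are hypothetical, and the description does not state how a receiver reacts to an unrecognized stream byte; the sketch only verifies that a small number of bit flips can never turn one valid encoding into another.

```python
from itertools import combinations

# Example stream encodings from the description.
STREAM_CODES = {"IDI": 0xFF, "PCIe": 0xF0, "LLP": 0xAA, "SIDEBAND": 0x55}

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two bytes differ."""
    return bin(a ^ b).count("1")

def min_distance(codes) -> int:
    """Smallest pairwise Hamming distance within a set of encodings."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))

def aliasing_corruptions(codes, max_flips: int) -> int:
    """Count 1..max_flips bit-flip corruptions of a valid code that land
    on another valid code (and would therefore go undetected)."""
    valid = set(codes)
    aliases = 0
    for code in valid:
        for r in range(1, max_flips + 1):
            for bits in combinations(range(8), r):
                corrupted = code
                for b in bits:
                    corrupted ^= 1 << b
                if corrupted in valid:
                    aliases += 1
    return aliases

distance = min_distance(STREAM_CODES.values())
undetected_up_to_3 = aliasing_corruptions(STREAM_CODES.values(), 3)
```

For these four encodings the minimum pairwise distance is four, so any corruption of up to three bits in the stream byte produces a value that is not a valid encoding and can be flagged as an error.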

Implementations of a logical PHY can include error detection, error reporting, and error handling logic. In some implementations, the logical PHY of an example MCPL can include logic to detect PHY layer de-framing errors (e.g., on the valid and stream lanes), sideband errors (e.g., relating to LSM state transitions), and errors in LLPs (e.g., those critical to LSM state transitions), among other examples. Some error detection and resolution can be delegated to upper layer logic, such as PCIe logic adapted to detect PCIe-specific errors, among other examples.

In the case of de-framing errors, in some implementations, one or more mechanisms can be provided through error handling logic. De-framing errors can be handled based on the protocol involved. For instance, in some implementations, link layers can be informed of the error to trigger a retry. A de-framing error can also cause a realignment of the logical PHY de-framing. Further, re-centering of the logical PHY can be performed and symbol/window lock can be reacquired, among other techniques. Centering, in some examples, can include the PHY moving the receiver clock phase to the optimal point for detecting the incoming data. "Optimal," in this context, can refer to the point with the most margin for noise and clock jitter. Re-centering can include a simplified centering function, for instance one performed when the PHY wakes up from a low power state, among other examples.

Other types of errors can involve other error handling techniques. For instance, errors detected in the sideband can be caught through a timeout mechanism of the corresponding state (e.g., of the LSM). The error can be logged, and the link state machine can then be transitioned to Reset. The LSM can remain in Reset until a restart command is received from software. In another example, LLP errors, such as a link control packet error, can be handled with a timeout mechanism that restarts the LLP sequence if an acknowledgement of the LLP sequence is not received.

FIGS. 14A-14C illustrate representations of example bit mappings on the data lanes of an example MCPL for various types of data. For instance, an example MCPL can include fifty data lanes. FIG. 14A illustrates a first bit mapping of example 16-byte slots in a first protocol, such as IDI, that can be sent over the data lanes within an 8 UI symbol, or window. For instance, within the defined 8 UI window, three 16-byte slots, including a header slot, can be sent. Two bytes of capacity remain in this example, and these remaining two bytes can be utilized for CRC bits (e.g., in lanes DATA[48] and DATA[49]).
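
The capacity arithmetic behind this mapping (50 lanes times 8 UI gives 400 bits, or 50 bytes, per window) can be written out as a short sketch. The `pack_window` helper assumes, for illustration only, that each lane carries exactly one byte per 8 UI window with the payload laid out in lane order; the actual bit placement of FIG. 14A is not reproduced here.

```python
LANES = 50           # data lanes DATA[0-49]
UI_PER_WINDOW = 8    # unit intervals per symbol/window
SLOT_BYTES = 16      # size of one slot

def window_budget(lanes=LANES, ui=UI_PER_WINDOW, slot_bytes=SLOT_BYTES):
    """Return (bytes per window, whole slots that fit, spare bytes)."""
    total_bytes = lanes * ui // 8
    slots, spare = divmod(total_bytes, slot_bytes)
    return total_bytes, slots, spare

def pack_window(slots, crc):
    """Assumed layout: lane i carries byte i of (slot0 + slot1 + slot2 + crc)."""
    if len(slots) != 3 or any(len(s) != SLOT_BYTES for s in slots):
        raise ValueError("expected three 16-byte slots")
    if len(crc) != 2:
        raise ValueError("expected two CRC bytes")
    payload = b"".join(slots) + crc
    return [payload[lane] for lane in range(LANES)]

total_bytes, slot_count, spare_bytes = window_budget()
lanes = pack_window([bytes([0x11]) * 16, bytes([0x22]) * 16, bytes([0x33]) * 16],
                    crc=b"\xAB\xCD")
```

With these numbers three slots occupy 48 bytes and two bytes are left over, which under the assumed layout land on lanes DATA[48] and DATA[49].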

In another example, FIG. 14B illustrates a second example bit mapping for PCIe packet data sent over the fifty data lanes of an example MCPL. In the example of FIG. 14B, 16-byte packets (e.g., transaction layer (TLP) or data link layer (DLLP) PCIe packets) can be sent over the MCPL. In an 8 UI window, three packets can be sent, with the remaining two bytes of bandwidth left unused within the window. Framing tokens can be included in these symbols and used to locate the start and end of each packet. In one example of PCIe, the framing utilized in the example of FIG. 14B can be the same as the tokens implemented for PCIe at 8 GT/s.

In yet another example, illustrated in FIG. 14C, an example bit mapping of link-to-link packets (e.g., LLP packets) sent over an example MCPL is shown. LLPs can be four bytes each, and each LLP (e.g., LLP0, LLP1, LLP2, etc.) can be sent four consecutive times, in accordance with the fault tolerance and error detection of an example implementation. For instance, failure to receive four consecutive identical LLPs can indicate an error. Additionally, as with other data types, failure to receive VALID in an ongoing time window or symbol can also indicate an error. In some instances, LLPs can have fixed slots. Additionally, in this example, unused or "spare" bits in the byte time period can result in logical 0s being transmitted over two of the fifty lanes (e.g., DATA[48-49]), among other examples.
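
The four-times repetition rule for LLPs lends itself to a simple receive-side check. The sketch below assumes a byte layout that the text does not confirm: three distinct 4-byte LLPs per window, the four copies of each placed back to back, followed by two zero bytes for the spare lanes. The function names are hypothetical.

```python
LLP_BYTES = 4        # each LLP is four bytes
REPEAT = 4           # each LLP is sent four consecutive times
LLPS_PER_WINDOW = 3  # 3 LLPs x 4 copies x 4 bytes = 48 bytes
WINDOW_BYTES = 50    # 50 lanes x 8 UI

def build_llp_window(llps):
    """Assumed layout: copies of each LLP back to back, then two spare zero bytes."""
    if len(llps) != LLPS_PER_WINDOW or any(len(p) != LLP_BYTES for p in llps):
        raise ValueError("expected three 4-byte LLPs")
    return b"".join(p * REPEAT for p in llps) + b"\x00\x00"

def parse_llp_window(window):
    """Return the LLPs in a window, raising ValueError if any group of
    four copies is not identical or the spare bytes are not zero."""
    if len(window) != WINDOW_BYTES or window[48:] != b"\x00\x00":
        raise ValueError("malformed LLP window")
    llps = []
    group_bytes = LLP_BYTES * REPEAT
    for g in range(LLPS_PER_WINDOW):
        group = window[g * group_bytes:(g + 1) * group_bytes]
        copies = [group[i * LLP_BYTES:(i + 1) * LLP_BYTES] for i in range(REPEAT)]
        if any(c != copies[0] for c in copies):
            raise ValueError(f"LLP{g}: four consecutive identical copies not received")
        llps.append(copies[0])
    return llps

window = build_llp_window([b"\x01\x02\x03\x04", b"AAAA", b"BBBB"])
parsed = parse_llp_window(window)
```

A single corrupted byte in any copy makes the four copies of that LLP differ, so `parse_llp_window` reports an error instead of delivering the packet.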

Turning to FIG. 15, a simplified link state machine transition diagram 1400 is shown together with the sideband handshaking utilized between state transitions. For instance, a Reset.Idle state (e.g., where phase lock loop (PLL) lock calibration is performed) can transition, through a sideband handshake, to a Reset.Cal state (e.g., where the link is further calibrated). Reset.Cal can transition, through a sideband handshake, to a Reset.ClockDCC state (e.g., where duty cycle correction (DCC) and delay-locked loop (DLL) lock can be performed). An additional handshake can be performed to transition from Reset.ClockDCC to a Reset.Quiet state (e.g., to deassert the valid signal). To assist in the alignment of signaling on the lanes of the MCPL, the lanes can be centered through a Center.Pattern state.

In some implementations, as shown in the example of FIG. 16, during the Center.Pattern state the transmitter can generate training patterns or other data. The receiver can condition its receiver circuitry to receive such training patterns, for instance, by setting the phase interpolator position and vref position and setting the comparator. The receiver can continuously compare the received patterns with the expected patterns and store the results in a register. After one set of patterns is complete, the receiver can increment the phase interpolator setting while keeping vref the same. The test pattern generation and comparison process can continue, with new comparison results stored in the register, and the procedure repeatedly steps through all phase interpolator values and all values of vref. The Center.Quiet state can be entered when the pattern generation and comparison process is complete. Following the centering of the lanes through the Center.Pattern and Center.Quiet link states, a sideband handshake (e.g., using an LSM sideband signal over the dedicated LSM sideband lane of the link) can be facilitated to transition to a Link.Init state, to initialize the MCPL and enable the sending of data on the MCPL.
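
The stepping procedure described above can be sketched as a two-dimensional sweep. This is a hedged illustration: `pattern_matches` stands in for the hardware comparison of received and expected patterns, a dictionary stands in for the result register, and `pick_center` (choosing the passing setting nearest the centroid of all passing settings) is an assumption added for the sketch, since the text does not say how the final setting is selected.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Setting = Tuple[int, int]  # (phase interpolator value, vref value)


def sweep(pi_values: Iterable[int], vref_values: Iterable[int],
          pattern_matches: Callable[[int, int], bool]) -> Dict[Setting, bool]:
    """Step through every (phase interpolator, vref) pair and record pass/fail."""
    pi_list = list(pi_values)
    results: Dict[Setting, bool] = {}
    for vref in vref_values:   # keep vref the same...
        for pi in pi_list:     # ...while incrementing the phase interpolator
            results[(pi, vref)] = pattern_matches(pi, vref)
    return results


def pick_center(results: Dict[Setting, bool]) -> Optional[Setting]:
    """Assumed policy: the passing setting closest to the centroid of all passes."""
    passing = [setting for setting, ok in results.items() if ok]
    if not passing:
        return None
    pi_c = sum(pi for pi, _ in passing) / len(passing)
    vref_c = sum(vref for _, vref in passing) / len(passing)
    return min(passing, key=lambda s: (s[0] - pi_c) ** 2 + (s[1] - vref_c) ** 2)
```

With a stand-in comparison that passes for phase interpolator values 3 to 7 and vref values 2 to 4, the sweep records every combination and the assumed policy lands in the middle of the passing region.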

Returning momentarily to the discussion of FIG. 15, as noted above, sideband handshakes can be used to facilitate link state machine transitions between dies or chips in a multi-chip package. For instance, signals on the LSM sideband lanes of an MCPL can be used to synchronize the state machine transitions across the dies. For example, when the conditions to exit a state (e.g., Reset.Idle) are met, the side that met those conditions can assert an LSM sideband signal on its outbound LSM_SB lane and wait for the other, remote die to reach the same condition and assert an LSM sideband signal on its own LSM_SB lane. When both LSM_SB signals are asserted, the link state machine of each respective die can transition to the next state (e.g., a Reset.Cal state). A minimum overlap time can be defined during which both LSM_SB signals should remain asserted before transitioning state. Further, a minimum quiet time can be defined after LSM_SB is deasserted, to allow accurate turnaround detection. In some implementations, every link state machine transition can be conditioned on and facilitated by such LSM_SB handshakes.
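
The handshake rule above can be sketched in a few lines. This is a simplified model under stated assumptions: the class `Die`, the function `try_advance`, and the strictly linear state order are illustrative choices, and the minimum overlap and quiet times are deliberately not modeled.

```python
# Assumed linear ordering of the states named in the text (a simplification).
STATE_SEQUENCE = [
    "Reset.Idle", "Reset.Cal", "Reset.ClockDCC", "Reset.Quiet",
    "Center.Pattern", "Center.Quiet", "Link.Init",
]


class Die:
    """One side of the link, with its own state and outbound LSM_SB signal."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state_index = 0
        self.lsm_sb = False  # outbound LSM_SB lane, deasserted by default

    @property
    def state(self) -> str:
        return STATE_SEQUENCE[self.state_index]

    def exit_conditions_met(self) -> None:
        """Assert LSM_SB once this die is ready to leave its current state."""
        self.lsm_sb = True


def try_advance(a: Die, b: Die) -> bool:
    """Advance both dies only when both LSM_SB signals are asserted."""
    in_step = a.state_index == b.state_index
    has_next = a.state_index < len(STATE_SEQUENCE) - 1
    if a.lsm_sb and b.lsm_sb and in_step and has_next:
        for die in (a, b):
            die.state_index += 1
            die.lsm_sb = False  # deassert after the transition
        return True
    return False
```

In this model a die that asserts its signal first simply waits: `try_advance` returns `False` until the remote die asserts as well, at which point both move to the next state together.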

FIG. 17 is a more detailed link state machine diagram 1700, illustrating at least some of the additional link states and link state transitions that can be included in an example MCPL. In some implementations, an example link state machine can include, among the other states and state transitions illustrated in FIG. 17, a "Directed Loopback" transition, which can be provided to place the lanes of an MCPL into a digital loopback. For instance, the receiver lanes of an MCPL can be looped back to the transmitter lanes after the clock recovery circuits. An "LB_Recenter" state can also be provided in some instances, which can be used to align the data symbols. Additionally, as shown in FIG. 15, the MCPL can support multiple link states, including an active L0 state and low power states, such as an L1 idle state and an L2 sleep state, among potentially other examples.

FIG. 18 is a simplified block diagram 1800 illustrating an example flow in a transition between an active state (e.g., L0) and a low-power, or idle, state (e.g., L1). In this particular example, a first device 1805 and a second device 1810 are communicatively coupled using an MCPL. While in the active state, data is transmitted over the lanes of the MCPL (e.g., DATA, VALID, STREAM, etc.). Link layer packets (LLPs) can be communicated over the lanes (e.g., the data lanes, with the stream signal indicating that the data is LLP data) to assist in facilitating link state transitions. For instance, LLPs can be sent between the first and second devices 1805, 1810 to negotiate entry from L0 into L1. For example, upper layer protocols supported by the MCPL can communicate that entry into L1 (or another state) is desired, and the upper layer protocols can cause LLPs to be sent over the MCPL to facilitate a link layer handshake that causes the physical layer to enter L1. For instance, FIG. 18 shows at least a portion of the LLPs sent, including an "Enter L1" request LLP sent from the second (upstream) device 1810 to the first (downstream) device 1805. In some implementations and upper-level protocols, the downstream port does not initiate entry into L1. The receiving first device 1805 can send a "Change to L1" request LLP in response, which the second device 1810 can acknowledge through a "Change to L1" acknowledgement (ACK) LLP, among other examples. Upon detecting completion of the handshake, the logical PHY can cause a sideband signal to be asserted on a dedicated sideband link to acknowledge that the ACK was received and that the device (e.g., 1805) is ready for, and expecting, entry into L1. For instance, the first device 1805 can assert a sideband signal 1815 sent to the second device 1810 to confirm receipt of the final ACK in the link layer handshake. Additionally, the second device 1810 can also assert a sideband signal, in response to sideband signal 1815, to notify the first device 1805 of the first device's sideband ACK 1805. With the link layer control and sideband handshakes completed, the MCPL PHY can transition into the L1 state, causing all lanes of the MCPL to be put into idle power savings mode, including the respective MCPL strobes 1820, 1825 of the devices 1805, 1810. L1 can be exited when upper-level layer logic of one of the first and second devices 1805, 1810 requests re-entry into L0, for instance, in response to detecting data to be sent to the other device over the MCPL.
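
The order of the L0-to-L1 exchange described above can be written down as an explicit sequence and checked. This is a sketch only: the tuple encoding, the step labels, and the function `link_state_after` are assumptions introduced for illustration, and the device identifiers simply reuse the reference numerals from FIG. 18.

```python
from typing import Iterable, Tuple

Event = Tuple[str, str, str]  # (sending device, channel, message)

# Expected order of the L1 entry handshake, as narrated in the text.
L1_ENTRY_STEPS = [
    ("1810", "LLP", "Enter L1 request"),       # upstream device starts
    ("1805", "LLP", "Change to L1 request"),   # downstream device responds
    ("1810", "LLP", "Change to L1 ACK"),       # upstream device acknowledges
    ("1805", "SIDEBAND", "assert"),            # sideband signal 1815
    ("1810", "SIDEBAND", "assert"),            # sideband response
]


def link_state_after(events: Iterable[Event]) -> str:
    """Return 'L1' only if the full handshake occurred in order, else 'L0'."""
    progress = 0
    for event in events:
        if progress < len(L1_ENTRY_STEPS) and event == L1_ENTRY_STEPS[progress]:
            progress += 1
        else:
            return "L0"  # unexpected event: remain in the active state
    return "L1" if progress == len(L1_ENTRY_STEPS) else "L0"
```

The point of the sketch is that the link stays in L0 until both the link layer handshake and the sideband handshake have completed.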

As noted above, in some implementations an MCPL can facilitate communication between two devices that support potentially multiple different protocols, and the MCPL can facilitate communications according to potentially any one of the multiple protocols over the lanes of the MCPL. Facilitating multiple protocols, however, can complicate entry and re-entry into at least some link states. For instance, while some traditional interconnects have a single upper layer assuming the role of master in state transitions, an implementation of MCPL with multiple different protocols effectively involves multiple masters. As an example, as shown in FIG. 18, each of PCIe and IDI can be supported between the two devices 1805, 1810 over an implementation of an MCPL. For instance, placing the physical layer into an idle or low power state may be conditioned on permission first being obtained from each of the supported protocols (e.g., both PCIe and IDI).

In some instances, entry into L1 (or another state) may be requested by only one of the multiple protocols supported for an implementation of an MCPL. While the other protocols may likewise request entry into the same state (e.g., based on identifying similar conditions on the MCPL, such as little or no traffic), the logical PHY can wait until permission or instructions are received from each upper layer protocol before actually facilitating the state transition. The logical PHY can track which upper layer protocols have requested the state change (e.g., performed a corresponding handshake) and trigger the state transition upon identifying that each of the protocols has requested the particular state change, such as a transition from L0 to L1 or another transition that would affect or interfere with the other protocols' communications. In some implementations, protocols can be blind to their at least partial dependence on other protocols in the system. Further, in some instances, a protocol may expect a response (e.g., from the PHY) to a request to enter a particular state, such as a confirmation or rejection of the requested state transition. Accordingly, in such instances, while waiting for permission from the other supported protocols for entry into an idle link state, the logical PHY can generate synthetic responses to a request to enter the idle state, to "trick" the requesting upper layer protocol into believing that the particular state has been entered (when, in reality, the lanes are still active, at least until the other protocols also request entry into the idle state). Among other potential advantages, this can simplify coordinating entry into the low power state between multiple protocols, among other examples.
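
The tracking behavior described above can be sketched as follows. The class name `LogicalPhy`, the method `request_l1`, and the response string are hypothetical; the sketch only illustrates that every requester receives a confirmation while the link actually changes state once all supported protocols have asked.

```python
from typing import Iterable, Set


class LogicalPhy:
    """Tracks which upper layer protocols have requested entry into L1."""

    def __init__(self, protocols: Iterable[str]) -> None:
        self.protocols = frozenset(protocols)
        self.requested: Set[str] = set()
        self.link_state = "L0"

    def request_l1(self, protocol: str) -> str:
        if protocol not in self.protocols:
            raise ValueError(f"unsupported protocol: {protocol}")
        self.requested.add(protocol)
        # Only transition once every supported protocol has requested L1.
        if self.requested == self.protocols:
            self.link_state = "L1"
        # The requester always sees a confirmation. While other protocols are
        # still active this is a synthetic response: the lanes remain in L0.
        return "L1 entered"
```

For example, with PCIe and IDI both supported, a PCIe request alone returns the confirmation but leaves `link_state` at `"L0"`; the state becomes `"L1"` only after IDI also requests entry.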

The apparatus, methods, and systems described above can be implemented in any electronic device or system, as mentioned above. As specific illustrations, the figures below provide example systems for utilizing the invention as described herein. As the systems below are described in more detail, a number of different interconnects are disclosed, described, and revisited from the discussion above. As is readily apparent, the advantages described above can be applied to any of those interconnects, fabrics, or architectures.

Referring to FIG. 19, an embodiment of a block diagram for a computing system including a multicore processor is depicted. Processor 1900 includes any processor or processing device, such as a microprocessor, an embedded processor, a digital signal processor (DSP), a network processor, a handheld processor, an application processor, a co-processor, a system on a chip (SOC), or other device to execute code. Processor 1900, in one embodiment, includes at least two cores, cores 1901 and 1902, which may include asymmetric cores or symmetric cores (the illustrated embodiment). However, processor 1900 may include any number of processing elements, which may be symmetric or asymmetric.

In one embodiment, a processing element refers to hardware or logic to support a software thread. Examples of hardware processing elements include a thread unit, a thread slot, a thread, a process unit, a context, a context unit, a logical processor, a hardware thread, a core, and/or any other element that is capable of holding a state for a processor, such as an execution state or architectural state. In other words, a processing element, in one embodiment, refers to any hardware capable of being independently associated with code, such as a software thread, operating system, application, or other code. A physical processor (or processor socket) typically refers to an integrated circuit, which potentially includes any number of other processing elements, such as cores or hardware threads.

A core often refers to logic located on an integrated circuit capable of maintaining an independent architectural state, wherein each independently maintained architectural state is associated with at least some dedicated execution resources. In contrast to cores, a hardware thread typically refers to any logic located on an integrated circuit capable of maintaining an independent architectural state, wherein the independently maintained architectural states share access to execution resources. As can be seen, when certain resources are shared and others are dedicated to an architectural state, the line between the nomenclature of a hardware thread and a core overlaps. Yet often, a core and a hardware thread are viewed by an operating system as individual logical processors, where the operating system is able to individually schedule operations on each logical processor.

Physical processor 1900, as illustrated in FIG. 19, includes two cores, cores 1901 and 1902. Here, cores 1901 and 1902 are considered symmetric cores, i.e., cores with the same configurations, functional units, and/or logic. In another embodiment, core 1901 includes an out-of-order processor core, while core 1902 includes an in-order processor core. However, cores 1901 and 1902 may be individually selected from any type of core, such as a native core, a software managed core, a core adapted to execute a native instruction set architecture (ISA), a core adapted to execute a translated ISA, a co-designed core, or other known core. In a heterogeneous core environment (i.e., asymmetric cores), some form of translation, such as binary translation, may be utilized to schedule or execute code on one or both cores. To further the discussion, the functional units illustrated in core 1901 are described in further detail below, as the units in core 1902 operate in a similar manner in the depicted embodiment.

As depicted, core 1901 includes two hardware threads 1901a and 1901b, which may also be referred to as hardware thread slots 1901a and 1901b. Therefore, software entities, such as an operating system, in one embodiment potentially view processor 1900 as four separate processors, i.e., four logical processors or processing elements capable of executing four software threads concurrently. As alluded to above, a first thread is associated with architecture state registers 1901a, a second thread is associated with architecture state registers 1901b, a third thread may be associated with architecture state registers 1902a, and a fourth thread may be associated with architecture state registers 1902b. Here, each of the architecture state registers (1901a, 1901b, 1902a, and 1902b) may be referred to as processing elements, thread slots, or thread units, as described above. As illustrated, architecture state registers 1901a are replicated in architecture state registers 1901b, so individual architecture states/contexts are capable of being stored for logical processor 1901a and logical processor 1901b. In core 1901, other smaller resources, such as instruction pointers and renaming logic in allocator and renamer block 1930, may also be replicated for threads 1901a and 1901b. Some resources, such as reorder buffers in reorder/retirement unit 1935, ILTB 1920, load/store buffers, and queues, may be shared through partitioning. Other resources, such as general purpose internal registers, page-table base register(s), low-level data cache and data TLB 1915, execution unit(s) 1940, and portions of out-of-order unit 1935, are potentially fully shared.

Processor 1900 often includes other resources, which may be fully shared, shared through partitioning, or dedicated by/to processing elements. FIG. 19 illustrates an embodiment of a purely exemplary processor with illustrative logical units/resources of a processor. Note that a processor may include or omit any of these functional units, as well as include any other known functional units, logic, or firmware not depicted. As illustrated, core 1901 includes a simplified, representative out-of-order (OOO) processor core. However, an in-order processor may be utilized in different embodiments. The OOO core includes a branch target buffer 1920 to predict branches to be executed/taken and an instruction-translation buffer (I-TLB) 1920 to store address translation entries for instructions.

Core 1901 further includes decode module 1925 coupled to fetch unit 1920 to decode fetched elements. Fetch logic, in one embodiment, includes individual sequencers associated with thread slots 1901a and 1901b, respectively. Usually, core 1901 is associated with a first ISA, which defines/specifies instructions executable on processor 1900. Machine code instructions that are part of the first ISA often include a portion of the instruction (referred to as an opcode) that references/specifies an instruction or operation to be performed. Decode logic 1925 includes circuitry that recognizes these instructions from their opcodes and passes the decoded instructions on in the pipeline for processing as defined by the first ISA. For example, as discussed in more detail below, decoder 1925, in one embodiment, includes logic designed or adapted to recognize specific instructions, such as transactional instructions. As a result of the recognition by decoder 1925, the architecture or core 1901 takes specific, predefined actions to perform tasks associated with the appropriate instruction. It is important to note that any of the tasks, blocks, operations, and methods described herein may be performed in response to a single or multiple instructions, some of which may be new or old instructions. Note that decoder 1926, in one embodiment, recognizes the same ISA (or a subset thereof). Alternatively, in a heterogeneous core environment, decoder 1926 recognizes a second ISA (either a subset of the first ISA or a distinct ISA).

In one example, allocator and renamer block 1930 includes an allocator to reserve resources, such as register files to store instruction processing results. However, threads 1901a and 1901b are potentially capable of out-of-order execution, in which case allocator and renamer block 1930 also reserves other resources, such as reorder buffers to track instruction results. Unit 1930 also includes a register renamer to rename program/instruction reference registers to other registers internal to processor 1900. Reorder/retirement unit 1935 includes components, such as the reorder buffers mentioned above, load buffers, and store buffers, to support out-of-order execution and later in-order retirement of instructions executed out of order.

Scheduler and execution unit(s) block 1940, in one embodiment, includes a scheduler unit to schedule instructions/operations on execution units. For example, a floating point instruction is scheduled on a port of an execution unit that has an available floating point execution unit. Register files associated with the execution units are also included to store instruction processing results. Exemplary execution units include a floating point execution unit, an integer execution unit, a jump execution unit, a load execution unit, a store execution unit, and other known execution units.

Lower-level data cache and data translation buffer (D-TLB) 1950 are coupled to execution unit(s) 1940. The data cache is to store recently used/operated-on elements, such as data operands, which are potentially held in memory coherency states. The D-TLB is to store recent virtual/linear to physical address translations. As a specific example, a processor may include a page table structure to break physical memory into a plurality of virtual pages.
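
The role of a translation buffer that stores recent virtual-to-physical translations can be illustrated with a small software model. Everything here is an assumption made for the sketch: the 4096-byte page size, the least-recently-used replacement policy, the capacity, and the names `Tlb` and `translate` are not taken from the described processor.

```python
from collections import OrderedDict
from typing import Dict

PAGE_SIZE = 4096  # assumed page size for the sketch


class Tlb:
    """Caches recent virtual-page to physical-frame translations."""

    def __init__(self, page_table: Dict[int, int], capacity: int = 4) -> None:
        self.page_table = page_table      # virtual page number -> frame number
        self.capacity = capacity
        self.entries: "OrderedDict[int, int]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)           # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # KeyError models a fault
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)    # evict least recently used
        return self.entries[vpn] * PAGE_SIZE + offset
```

A second access to the same page is served from the cached entry rather than from the page table, which is the benefit the buffer provides.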

Here, cores 1901 and 1902 share access to a higher-level or further-out cache, such as a second-level cache associated with on-chip interface 1910. Note that higher-level or further-out refers to cache levels increasing or getting further away from the execution unit(s). In one embodiment, the higher-level cache is a last-level data cache, i.e., the last cache in the memory hierarchy on processor 1900, such as a second- or third-level data cache. However, the higher-level cache is not so limited, as it may be associated with or include an instruction cache. A trace cache, a type of instruction cache, may instead be coupled after decoder 1925 to store recently decoded traces. Here, an instruction potentially refers to a macro-instruction (i.e., a general instruction recognized by the decoders), which may decode into a number of micro-instructions (micro-operations).

In the depicted configuration, processor 1900 also includes on-chip interface module 1910. Historically, a memory controller, which is described in more detail below, has been included in a computing system external to processor 1900. In this scenario, on-chip interface 1910 is to communicate with devices external to processor 1900, such as system memory 1975, a chipset (often including a memory controller hub to connect to memory 1975 and an I/O controller hub to connect peripheral devices), a memory controller hub, a northbridge, or other integrated circuit. And in this scenario, bus 1905 may include any known interconnect, such as a multi-drop bus, a point-to-point interconnect, a serial interconnect, a parallel bus, a coherent (e.g., cache coherent) bus, a layered protocol architecture, a differential bus, and a GTL bus.

Memory 1975 may be dedicated to processor 1900 or shared with other devices in a system. Common examples of types of memory 1975 include DRAM, SRAM, non-volatile memory (NV memory), and other known storage devices. Note that device 1980 may include a graphics accelerator, a processor or card coupled to a memory controller hub, data storage coupled to an I/O controller hub, a wireless transceiver, a flash device, an audio controller, a network controller, or other known device.

Recently, however, as more logic and devices are being integrated on a single die, such as an SOC, each of these devices may be incorporated on processor 1900. For example, in one embodiment, a memory controller hub is on the same package and/or die as processor 1900. Here, a portion of the core (an on-core portion) 1910 includes one or more controller(s) for interfacing with other devices such as memory 1975 or a graphics device 1980. The configuration including an interconnect and controllers for interfacing with such devices is often referred to as an on-core (or un-core) configuration. As an example, on-chip interface 1910 includes a ring interconnect for on-chip communication and a high-speed serial point-to-point link 1905 for off-chip communication. Yet, in the SOC environment, even more devices, such as a network interface, co-processors, memory 1975, graphics processor 1980, and any other known computer devices/interfaces, may be integrated on a single die or integrated circuit to provide a small form factor with high functionality and low power consumption.

In one embodiment, processor 1900 is capable of executing compiler, optimization, and/or translator code 1977 to compile, translate, and/or optimize application code 1976 to support the apparatus and methods described herein or to interface therewith. A compiler often includes a program or set of programs to translate source text/code into target text/code. Usually, compilation of program/application code with a compiler is done in multiple phases and passes to transform high-level programming language code into low-level machine or assembly language code. Yet, single-pass compilers may still be utilized for simple compilation. A compiler may utilize any known compilation techniques and perform any known compiler operations, such as lexical analysis, preprocessing, parsing, semantic analysis, code generation, code transformation, and code optimization.

Larger compilers often include multiple phases, but most often these phases fall within two general phases: (1) a front end, i.e., generally where syntactic processing, semantic processing, and some transformation/optimization may take place, and (2) a back end, i.e., generally where analysis, transformations, optimizations, and code generation take place. Some compilers refer to a middle, which illustrates the blurring of the delineation between the front end and back end of a compiler. As a result, a reference to insertion, association, generation, or another operation of a compiler may take place in any of the aforementioned phases or passes, as well as in any other known phase or pass of a compiler. As an illustrative example, a compiler potentially inserts operations, calls, functions, etc. in one or more phases of compilation, such as inserting calls/operations in a front-end phase of compilation and then transforming the calls/operations into lower-level code during a transformation phase. Note that during dynamic compilation, compiler code or dynamic optimization code may insert such operations/calls, as well as optimize the code for execution during runtime. As a specific illustrative example, binary code (already compiled code) may be dynamically optimized during runtime. Here, the program code may include the dynamic optimization code, the binary code, or a combination thereof.
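The front-end/back-end split described above can be illustrated with a toy expression compiler. This is a minimal sketch under stated assumptions, not anything taken from the disclosure: the grammar, the stack-machine opcodes (`PUSH`, `LOAD`, `ADD`, `MUL`), and all function names are hypothetical and chosen only for illustration.

```python
import re

def tokenize(src):
    # Front end, lexical analysis: numbers, identifiers, and the symbols + * ( )
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", src)

def parse(tokens):
    # Front end, syntactic processing: recursive descent for
    #   expr := term ('+' term)*    term := atom ('*' atom)*
    pos = 0

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1  # skip the closing ")"
            return node
        return ("num", int(tok)) if tok.isdigit() else ("var", tok)

    def term():
        nonlocal pos
        node = atom()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, atom())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    return expr()

def fold(node):
    # Back end, optimization: fold operations whose operands are both constants
    if node[0] in ("+", "*"):
        left, right = fold(node[1]), fold(node[2])
        if left[0] == "num" and right[0] == "num":
            value = left[1] + right[1] if node[0] == "+" else left[1] * right[1]
            return ("num", value)
        return (node[0], left, right)
    return node

def generate(node):
    # Back end, code generation: emit code for a hypothetical stack machine
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    if node[0] == "var":
        return [f"LOAD {node[1]}"]
    op = "ADD" if node[0] == "+" else "MUL"
    return generate(node[1]) + generate(node[2]) + [op]

def compile_expr(src):
    return generate(fold(parse(tokenize(src))))
```

Here `tokenize` and `parse` play the role of the front end, while `fold` and `generate` play the role of the back end. A real compiler would place many more passes between them, which is where the "middle" mentioned above comes from.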

Similar to a compiler, a translator, such as a binary translator, translates code either statically or dynamically to optimize and/or translate the code. Therefore, a reference to execution of code, application code, program code, or another software environment may refer to: (1) execution of a compiler program, optimization code optimizer, or translator, either dynamically or statically, to compile program code, to maintain software structures, to perform other operations, to optimize code, or to translate code; (2) execution of main program code including operations/calls, such as application code that has been optimized/compiled; (3) execution of other program code, such as libraries, associated with the main program code to maintain software structures, to perform other software-related operations, or to optimize code; or (4) a combination thereof.

Referring to FIG. 20, a block diagram of an embodiment of a multicore processor is shown. As shown in the embodiment of FIG. 20, processor 2000 includes multiple domains. Specifically, a core domain 2030 includes a plurality of cores 2030A-2030N, a graphics domain 2060 includes one or more graphics engines having a media engine 2065, and the processor further includes a system agent domain 2010.

In various embodiments, system agent domain 2010 handles power control events and power management such that individual units of domains 2030 and 2060 (e.g., cores and/or graphics engines) are independently controllable to dynamically operate at an appropriate power mode/level (e.g., active, turbo, sleep, hibernate, deep sleep, or another Advanced Configuration Power Interface-like state) in light of the activity (or inactivity) occurring in the given unit. Each of domains 2030 and 2060 may operate at a different voltage and/or power, and the individual units within the domains each potentially operate at an independent frequency and voltage. Note that although only three domains are shown, the scope of the present invention is not limited in this regard, and additional domains may be present in other embodiments.
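As an illustration of per-unit power control driven by observed activity, the following is a minimal sketch. The thresholds, the `Unit` fields, and the policy itself are assumptions made for this example; the disclosure does not specify how the system agent chooses a state.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration (not from the source).
SLEEP_AFTER_MS = 10
DEEP_SLEEP_AFTER_MS = 100
TURBO_UTILIZATION = 0.9

@dataclass
class Unit:
    name: str
    idle_ms: int         # time since the unit last did work
    utilization: float   # 0.0 .. 1.0 over the last sampling window
    state: str = "active"

def select_state(unit):
    # Pick a power state for one unit from its own activity only,
    # so each unit is controlled independently of the others.
    if unit.idle_ms >= DEEP_SLEEP_AFTER_MS:
        return "deep_sleep"
    if unit.idle_ms >= SLEEP_AFTER_MS:
        return "sleep"
    if unit.utilization > TURBO_UTILIZATION:
        return "turbo"
    return "active"

def manage(units):
    # One pass of the (hypothetical) system agent over all units.
    for unit in units:
        unit.state = select_state(unit)
    return {unit.name: unit.state for unit in units}
```

For example, calling `manage` on a busy core, a briefly idle core, and a long-idle graphics engine would assign `turbo`, `sleep`, and `deep_sleep` respectively under these assumed thresholds.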

As shown, each core 2030 further includes low-level caches in addition to various execution units and additional processing elements. Here, the various cores are coupled to each other and to a shared cache memory formed of a plurality of units or slices of a last level cache (LLC) 2040A-2040N. These LLCs often include storage and cache controller functionality and are shared among the cores, and potentially among the graphics engines as well.

As shown, a ring interconnect 2050 couples the cores together and provides interconnection between core domain 2030, graphics domain 2060, and system agent circuitry 2010 via a plurality of ring stops 2052A-2052N, each at a coupling between a core and an LLC slice. As shown in FIG. 20, interconnect 2050 is used to carry various information, including address information, data information, acknowledgement information, and snoop/invalidate information. Although a ring interconnect is illustrated, any known on-die interconnect or fabric may be utilized. As an illustrative example, some of the fabrics discussed above (e.g., another on-die interconnect, On-chip System Fabric (OSF), an Advanced Microcontroller Bus Architecture (AMBA) interconnect, a multi-dimensional mesh fabric, or another known interconnect architecture) may be utilized in a similar fashion.
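To make the idea of ring stops concrete, the sketch below computes the sequence of stops a message visits between two stops on a ring. It assumes a bidirectional ring that always takes the shorter direction and numbers the stops 0 to `num_stops - 1`; the disclosure does not state the ring's direction or routing policy, so these are illustrative assumptions only.

```python
def ring_route(src, dst, num_stops):
    # Hop counts in each direction around the ring.
    clockwise = (dst - src) % num_stops
    counter = (src - dst) % num_stops
    # Take the shorter direction; ties go clockwise.
    if clockwise <= counter:
        return [(src + i) % num_stops for i in range(clockwise + 1)]
    return [(src - i) % num_stops for i in range(counter + 1)]
```

With eight stops, a message from stop 0 to stop 6 travels counterclockwise through stop 7 rather than clockwise through five intermediate stops.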

As further shown, system agent domain 2010 includes a display engine 2012 that provides control of, and an interface to, an associated display. System agent domain 2010 may include other units, such as an integrated memory controller 2020 that provides an interface to system memory (e.g., DRAM implemented with multiple DIMMs) and coherence logic 2022 that performs memory coherence operations. Multiple interfaces may be present to enable interconnection between the processor and other circuitry. For example, in one embodiment, at least one direct media interface (DMI) 2016 and one or more PCIe™ interfaces 2014 are provided. The display engine and these interfaces typically couple to memory via a PCIe™ bridge 2018. Still further, one or more other interfaces may be provided for communication between other agents, such as additional processors or other circuitry.

Referring now to FIG. 21, a block diagram of a representative core is shown; specifically, logical blocks of a back end of a core, such as core 2030 from FIG. 20. In general, the structure shown in FIG. 21 includes an out-of-order processor having a front end unit 2170 that is used to fetch incoming instructions, perform various processing (e.g., caching, decoding, branch prediction, etc.), and pass instructions/operations along to an out-of-order (OOO) engine 2180. OOO engine 2180 performs further processing on the decoded instructions.

Specifically, in the embodiment of FIG. 21, out-of-order engine 2180 includes an allocate unit 2182 that receives decoded instructions, which may be in the form of one or more micro-instructions or μops, from front end unit 2170 and allocates them to appropriate resources such as registers. Next, the instructions are provided to a reservation station 2184, which reserves resources and schedules the instructions for execution on one of a plurality of execution units 2186A-2186N. Various types of execution units may be present, including, for example, arithmetic logic units (ALUs), load and store units, vector processing units (VPUs), and floating point execution units, among others. Results from these different execution units are provided to a reorder buffer (ROB) 2188, which takes the unordered results and returns them to correct program order.
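The role of the reorder buffer, accepting results in any order but releasing them in program order, can be sketched as follows. This is a simplified software model under assumed names (`ReorderBuffer`, `allocate`, `complete`, `retire`); it is not the structure of ROB 2188 itself.

```python
from collections import OrderedDict

_PENDING = object()  # marker for an instruction whose result is not ready yet

class ReorderBuffer:
    def __init__(self):
        # Entries are kept in allocation order, which is program order.
        self.entries = OrderedDict()

    def allocate(self, seq):
        # Called in program order when an instruction is issued.
        self.entries[seq] = _PENDING

    def complete(self, seq, result):
        # Called in any order as execution units finish.
        self.entries[seq] = result

    def retire(self):
        # Release results from the head only while the oldest entry is done,
        # so younger results wait behind an unfinished older instruction.
        retired = []
        while self.entries:
            seq, result = next(iter(self.entries.items()))
            if result is _PENDING:
                break
            retired.append((seq, result))
            del self.entries[seq]
        return retired
```

If instruction 2 finishes before instructions 0 and 1, `retire` returns nothing until instruction 0 completes, and instruction 2 is released only after instruction 1.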

Still referring to FIG. 21, note that both front end unit 2170 and out-of-order engine 2180 are coupled to different levels of a memory hierarchy. Specifically shown is an instruction level cache 2172, which in turn couples to a mid-level cache 2176, which in turn couples to a last level cache 2195. In one embodiment, last level cache 2195 is implemented in an on-chip (sometimes referred to as uncore) unit 2190. As an example, unit 2190 is similar to system agent 2010 of FIG. 20. As discussed above, uncore 2190 communicates with system memory 2199, which, in the illustrated embodiment, is implemented via EDRAM. Note also that the various execution units 2186 within out-of-order engine 2180 communicate with a first level cache 2174, which also communicates with mid-level cache 2176. Note also that additional cores 2130N-2 to 2130N can couple to LLC 2195. Although shown at this high level in the embodiment of FIG. 21, it should be understood that various alternative and additional components may be present.
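The lookup order through such a hierarchy can be sketched as a walk from the closest cache outward, filling the closer levels on the way back. The dictionary-based caches, the level names, and the fill policy are assumptions for illustration; the disclosure does not describe replacement or fill behavior.

```python
def load(address, levels, memory):
    # levels: list of (name, cache_dict), ordered from closest to the core outward.
    hit_index, hit_name, value = len(levels), "memory", None
    for index, (name, cache) in enumerate(levels):
        if address in cache:
            hit_index, hit_name, value = index, name, cache[address]
            break
    if hit_name == "memory":
        # Missed in every cache level: fall through to system memory.
        value = memory[address]
    # Fill every level closer to the core than the one that hit.
    for _, cache in levels[:hit_index]:
        cache[address] = value
    return value, hit_name
```

A first access that hits only in the last level cache copies the line into the closer levels, so a repeated access to the same address hits in the first level.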

Referring to FIG. 22, a block diagram of an exemplary computer system formed with a processor that includes execution units to execute an instruction is shown, where one or more of the interconnects implement one or more features in accordance with one embodiment of the present invention. System 2200 includes a component, such as a processor 2202, that employs execution units including logic to perform algorithms for processing data in accordance with the present invention, such as in the embodiments described herein. System 2200 is representative of processing systems based on PENTIUM® III, PENTIUM® 4, Xeon™, Itanium, XScale™, and/or StrongARM™ microprocessors, although other systems (including PCs having other microprocessors, engineering workstations, set-top boxes, and the like) may also be used. In one embodiment, sample system 2200 executes a version of the WINDOWS® operating system available from Microsoft Corporation of Redmond, Washington, although other operating systems (e.g., UNIX® and Linux®), embedded software, and/or graphical user interfaces may also be used. Thus, embodiments of the present invention are not limited to any specific combination of hardware circuitry and software.

Embodiments are not limited to computer systems. Alternative embodiments of the present invention can be used in other devices such as handheld devices and embedded applications. Some examples of handheld devices include cellular phones, Internet Protocol devices, digital cameras, personal digital assistants (PDAs), and handheld PCs. Embedded applications can include a microcontroller, a digital signal processor (DSP), a system on a chip, network computers (NetPC), set-top boxes, network hubs, wide area network (WAN) switches, or any other system that can execute one or more instructions in accordance with at least one embodiment.

In this illustrated embodiment, processor 2202 includes one or more execution units 2208 to implement an algorithm that performs at least one instruction. One embodiment may be described in the context of a single-processor desktop or server system, but alternative embodiments may be included in a multiprocessor system. System 2200 is an example of a "hub" system architecture. Computer system 2200 includes a processor 2202 to process data signals. Processor 2202, as one illustrative example, includes a complex instruction set computer (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or any other processor device, such as a digital signal processor. Processor 2202 is coupled to a processor bus 2210 that transmits data signals between processor 2202 and other components in system 2200. The elements of system 2200 (e.g., graphics accelerator 2212, memory controller hub 2216, memory 2220, I/O controller hub 2224, wireless transceiver 2226, flash BIOS 2228, network controller 2234, audio controller 2236, serial expansion port 2238, I/O controller 2240, etc.) perform their conventional functions, which are well known to those skilled in the art.

In one embodiment, processor 2202 includes a Level 1 (L1) internal cache memory 2204. Depending on the architecture, processor 2202 may have a single internal cache or multiple levels of internal caches. Other embodiments include a combination of both internal and external caches, depending on the particular implementation and needs. Register file 2206 stores different types of data in various registers, including integer registers, floating point registers, vector registers, banked registers, shadow registers, checkpoint registers, status registers, and an instruction pointer register.

Execution unit 2208, including logic to perform integer and floating point operations, also resides in processor 2202. Processor 2202, in one embodiment, includes a microcode (μcode) ROM to store microcode, which when executed performs algorithms for certain microinstructions or handles complex scenarios. Here, the microcode is potentially updateable to handle logic bugs/fixes for processor 2202. For one embodiment, execution unit 2208 includes logic to handle a packed instruction set 2209. By including packed instruction set 2209 in the instruction set of general-purpose processor 2202, along with associated circuitry to execute the instructions, the operations used by many multimedia applications may be performed using packed data in general-purpose processor 2202. Thus, many multimedia applications are accelerated and executed more efficiently by using the full width of the processor's data bus to perform operations on packed data. This potentially eliminates the need to transfer smaller units of data across the processor's data bus, one data element at a time, to perform one or more operations.
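The benefit of packed data can be illustrated in software: four 8-bit elements are packed into one 32-bit word and added with a single word-wide operation instead of four separate ones. The lane width, lane count, and wrap-around behavior below are assumptions for this sketch and do not describe packed instruction set 2209.

```python
LANES = 4
LANE_BITS = 8
LANE_MASK = 0xFF

def pack(values):
    # Place each 8-bit value in its own lane of a 32-bit word (lane 0 is lowest).
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word

def unpack(word):
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]

def packed_add(a, b):
    # Add the low 7 bits of every lane at once; the cleared top bits absorb
    # each lane's carry so it cannot spill into the neighboring lane.
    low = (a & 0x7F7F7F7F) + (b & 0x7F7F7F7F)
    # Restore each lane's top bit: sum bit 7 = a7 XOR b7 XOR carry-in.
    return low ^ ((a ^ b) & 0x80808080)
```

Each lane wraps modulo 256 independently, so adding 250 and 10 in one lane yields 4 without disturbing the other lanes.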

Alternative embodiments of execution unit 2208 may also be used in microcontrollers, embedded processors, graphics devices, DSPs, and other types of logic circuits. System 2200 includes a memory 2220. Memory 2220 includes a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, a flash memory device, or another memory device. Memory 2220 stores instructions and/or data, represented by data signals, that are to be executed by processor 2202.

Note that any of the aforementioned features or aspects of the present invention may be utilized on one or more of the interconnects illustrated in FIG. 22. For example, an on-die interconnect (ODI), which is not shown, for coupling internal units of processor 2202 implements one or more aspects of the invention described above. Alternatively, the invention is associated with a processor bus 2210 (e.g., another known high-performance computing interconnect), a high-bandwidth memory path 2218 to memory 2220, a point-to-point link to graphics accelerator 2212 (e.g., a Peripheral Component Interconnect Express (PCIe) compliant fabric), a controller hub interconnect 2222, or an I/O or other interconnect (e.g., USB, PCI, PCIe) for coupling the other illustrated components. Some examples of such components include audio controller 2236, firmware hub (flash BIOS) 2228, wireless transceiver 2226, data storage 2224, legacy I/O controller 2210 containing user input and keyboard interfaces 2242, a serial expansion port 2238 such as Universal Serial Bus (USB), and network controller 2234. Data storage device 2224 can include a hard disk drive, a floppy disk drive, a CD-ROM device, a flash memory device, or another mass storage device.

Referring now to FIG. 23, a block diagram of a second system 2300 in accordance with an embodiment of the present invention is shown. As shown in FIG. 23, multiprocessor system 2300 is a point-to-point interconnect system and includes a first processor 2370 and a second processor 2380 coupled via a point-to-point interconnect 2350. Each of processors 2370 and 2380 may be some version of a processor. In one embodiment, 2352 and 2354 are part of a serial, point-to-point coherent interconnect fabric, such as a high-performance architecture. As a result, the invention may be implemented within the QPI architecture.

Although shown with only two processors 2370 and 2380, it is to be understood that the scope of the present invention is not so limited. In other embodiments, one or more additional processors may be present in a given processor.

Processors 2370 and 2380 are shown including integrated memory controller (IMC) units 2372 and 2382, respectively. Processor 2370 also includes, as part of its bus controller units, point-to-point (P-P) interfaces 2376 and 2378; similarly, second processor 2380 includes P-P interfaces 2386 and 2388. Processors 2370 and 2380 may exchange information via a point-to-point (P-P) interface 2350 using P-P interface circuits 2378 and 2388. As shown in FIG. 23, IMCs 2372 and 2382 couple the processors to respective memories, namely a memory 2332 and a memory 2334, which may be portions of main memory locally attached to the respective processors.

Processors 2370 and 2380 each exchange information with a chipset 2390 via individual P-P interfaces 2352 and 2354 using point-to-point interface circuits 2376, 2394, 2386, and 2398. Chipset 2390 also exchanges information with a high-performance graphics circuit 2338 via an interface circuit 2392 along a high-performance graphics interconnect 2339.

A shared cache (not shown) may be included in either processor, or outside of both processors yet connected with the processors via a P-P interconnect, such that the local cache information of either or both processors may be stored in the shared cache if a processor is placed into a low power mode.

Chipset 2390 may be coupled to a first bus 2316 via an interface 2396. In one embodiment, first bus 2316 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI Express bus or another third-generation I/O interconnect bus, although the scope of the present invention is not so limited.

As shown in FIG. 23, various I/O devices 2314 are coupled to first bus 2316, along with a bus bridge 2318 that couples first bus 2316 to a second bus 2320. In one embodiment, second bus 2320 includes a low pin count (LPC) bus. Various devices are coupled to second bus 2320 including, for example, a keyboard and/or mouse 2322, communication devices 2327, and a storage unit 2328, such as a disk drive or other mass storage device, which often includes instructions/code and data 2330, in one embodiment. Further, an audio I/O 2324 is shown coupled to second bus 2320. Note that other architectures are possible, in which the included components and interconnect architectures vary. For example, instead of the point-to-point architecture of FIG. 23, a system may implement a multi-drop bus or another such architecture.

Turning next to FIG. 24, an embodiment of a system on-chip (SOC) design in accordance with the invention is shown. As a specific illustrative example, SOC 2400 is included in user equipment (UE). In one embodiment, UE refers to any device used by an end user to communicate, such as a handheld phone, smartphone, tablet, ultra-thin notebook, notebook with a broadband adapter, or any other similar communication device. Often a UE connects to a base station or node, which potentially corresponds in nature to a mobile station (MS) in a GSM® network.

Here, SOC 2400 includes two cores, 2406 and 2407. Similar to the discussion above, cores 2406 and 2407 may conform to an Intel® Architecture Core™-based processor, an Advanced Micro Devices, Inc. (AMD) processor, a MIPS-based processor, an ARM-based processor design, or a customer thereof, as well as their licensees or adopters. Cores 2406 and 2407 are coupled to a cache control 2408 that is associated with a bus interface unit 2409 and an L2 cache 2411 to communicate with other parts of system 2400. Interconnect 2410 includes an on-chip interconnect, such as an IOSF, AMBA, or other interconnect discussed above, which potentially implements one or more aspects described herein.

Interface 2410 provides communication channels to the other components, such as a Subscriber Identity Module (SIM) 2340 to interface with a SIM card, a boot ROM 2435 to hold boot code for execution by cores 2406 and 2407 to initialize and boot SOC 2400, an SDRAM controller 2440 to interface with external memory (e.g., DRAM 2460), a flash controller 2445 to interface with non-volatile memory (e.g., flash 2465), a peripheral control 2450 (e.g., Serial Peripheral Interface) to interface with peripherals, video codecs 2420 and a video interface 2425 to display and receive input (e.g., touch-enabled input), and a GPU 2415 to perform graphics-related computations. Any of these interfaces may incorporate aspects of the invention described herein.

In addition, the system illustrates peripherals for communication, such as a Bluetooth® module 2470, a 3G modem 2475, GPS 2485, and WiFi 2485. Note that, as stated above, a UE includes a radio for communication. As a result, these peripheral communication modules are not all required. However, in a UE, some form of radio for external communication is to be included.

While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. The appended claims are intended to cover all such modifications and variations as fall within the true spirit and scope of the present invention.

A design may go through various stages, from creation to simulation to fabrication. Data representing a design may represent the design in a number of ways. First, as is useful in simulations, the hardware may be represented using a hardware description language or another functional description language. Additionally, a circuit-level model with logic and/or transistor gates may be produced at some stages of the design process. Furthermore, most designs, at some stage, reach a level of data representing the physical placement of various devices in the hardware model. Where conventional semiconductor fabrication techniques are used, the data representing the hardware model may be data specifying the presence or absence of various features on different mask layers for the masks used to produce the integrated circuit. In any representation of the design, the data may be stored in any form of machine-readable medium. A memory, or a magnetic or optical storage such as a disc, may be the machine-readable medium that stores information transmitted via an optical or electrical wave modulated or otherwise generated to transmit such information. When an electrical carrier wave indicating or carrying the code or design is transmitted, a new copy is made to the extent that copying, buffering, or retransmission of the electrical signal is performed. Thus, a communication provider or a network provider may store on a tangible, machine-readable medium, at least temporarily, an article, such as information encoded into a carrier wave, embodying techniques of embodiments of the present invention.

本明細書において用いられるモジュールは、ハードウェア、ソフトウェア及び／又はファームウェアの任意の組合せを指す。例として、モジュールは、マイクロコントローラによって実行されるように適合されたコードを記憶する非一時的媒体に関連付けられたマイクロコントローラ等のハードウェアを含む。したがって、モジュールへの言及は、１つの実施形態では、非一時的媒体上で保持されるコードを認識及び／又は実行するように特に構成されたハードウェアを指す。更に、別の実施形態では、モジュールの使用は、マイクロコントローラによって、所定の動作を行うために実行されるように特に適合されたコードを含む非一時的媒体を指す。そして、推測することができるように、更に別の実施形態では、モジュールという用語は（この例では）、マイクロコントローラと非一時的媒体との組合せを指すことができる。多くの場合、別個であるものとして示されるモジュール境界は、共通して変動し、潜在的に重複する。例えば、第１のモジュール及び第２のモジュールは、いくらかの独立したハードウェア、ソフトウェア又はファームウェアを潜在的に保持しながら、ハードウェア、ソフトウェア、ファームウェア、又はそれらの組合せを共有する場合がある。１つの実施形態では、ロジックという用語の使用は、トランジスタ、レジスタ、又はプログラム可能な論理デバイス等の他のハードウェアを含む。 As used herein, a module refers to any combination of hardware, software, and/or firmware. As an example, a module includes hardware, such as a microcontroller, associated with a non-transitory medium that stores code adapted to be executed by the microcontroller. Therefore, reference to a module, in one embodiment, refers to hardware that is specifically configured to recognize and/or execute the code held on a non-transitory medium. Furthermore, in another embodiment, use of a module refers to a non-transitory medium including code that is specifically adapted to be executed by the microcontroller to perform predetermined operations. And, as can be inferred, in yet another embodiment, the term module (in this example) may refer to the combination of the microcontroller and the non-transitory medium. Module boundaries that are illustrated as separate often vary in practice and potentially overlap. For example, a first module and a second module may share hardware, software, firmware, or a combination thereof, while potentially retaining some independent hardware, software, or firmware. In one embodiment, use of the term logic includes hardware such as transistors and registers, or other hardware such as programmable logic devices.

「〜ように構成される」というフレーズの使用は、１つの実施形態では、装置、ハードウェア、ロジック又は要素を、指定又は決定されたタスクを実行するように、配置、組立て、製造、販売のオファー、インポート及び／又は設計することを指す。この例では、動作していない装置又はその要素は、依然として、指定されたタスクを行うように設計、結合及び／又は相互接続された場合に、この指定されたタスクを行う「ように構成されている」。単なる説明のための例として、論理ゲートは、動作中に０又は１を提供することができる。しかし、クロックにイネーブル信号を提供する「ように構成された」論理ゲートは、１又は０を提供することができる全ての潜在的な論理ゲートを含むわけではない。代わりに、論理ゲートは、何らかの方式で、動作中、１又は０の出力がクロックをイネーブルするように結合されている。ここでも、「〜ように構成される」という用語は、動作を必要とするものではなく、代わりに、装置、ハードウェア及び／又は要素の潜在状態に焦点を当てることに留意されたい。潜在状態において、装置、ハードウェア及び／又は要素は、装置、ハードウェア及び／又は要素が動作しているときに特定のタスクを実行するように設計される。 Use of the phrase “configured to,” in one embodiment, refers to arranging, putting together, manufacturing, offering to sell, importing, and/or designing an apparatus, hardware, logic, or element to perform a designated or determined task. In this example, an apparatus or element thereof that is not operating is still “configured to” perform a designated task if it is designed, coupled, and/or interconnected to perform that designated task. As a purely illustrative example, a logic gate may provide a 0 or a 1 during operation. But a logic gate “configured to” provide an enable signal to a clock does not include every potential logic gate that may provide a 1 or a 0. Instead, the logic gate is one coupled in some manner such that, during operation, its 1 or 0 output enables the clock. Note once again that use of the term “configured to” does not require operation, but instead focuses on the latent state of an apparatus, hardware, and/or element. In the latent state, the apparatus, hardware, and/or element is designed to perform a particular task when the apparatus, hardware, and/or element is operating.

更に、「〜する」、「〜することができる」及び／又は「〜するように動作可能である」というフレーズの使用は、１つの実施形態では、装置、ロジック、ハードウェア及び／又は要素の使用を特定の方式で可能にするように設計された何らかの装置、ロジック、ハードウェア及び／又は要素を指す。上記と同様に、１つの実施形態では、「〜する」、「〜することができる」又は「〜するように動作可能である」の使用は、装置、ロジック、ハードウェア及び／又は要素の潜在的な状態を指し、ここで、装置、ロジック、ハードウェア及び／又は要素は、動作していないが、特定の方式で装置の使用を可能にするように設計されている。 Furthermore, use of the phrases “to,” “capable of,” and/or “operable to,” in one embodiment, refers to some apparatus, logic, hardware, and/or element designed in such a way as to enable use of the apparatus, logic, hardware, and/or element in a specified manner. As above, use of “to,” “capable of,” or “operable to,” in one embodiment, refers to the latent state of an apparatus, logic, hardware, and/or element, where the apparatus, logic, hardware, and/or element is not operating but is designed in such a manner as to enable use of the apparatus in a specified manner.

本明細書において用いられるとき、値は、数、状態、論理状態又はバイナリ論理状態の任意の既知の表現を含む。多くの場合、ロジックレベル、ロジック値又は論理値の使用も１及び０として言及され、これは単にバイナリロジック状態を表す。例えば、１は高ロジックレベルを指し、０は低ロジックレベルを指す。１つの実施形態では、トランジスタ又はフラッシュセル等のストレージセルは、単一の論理値又は複数の論理値を保持することを可能とすることができる。一方、コンピュータシステムにおける値の他の表現が用いられている。例えば、１０進数の１０は、バイナリ値１０１０及び１６進数の文字Ａとして表すこともできる。したがって、値は、コンピュータシステム内に保持されることが可能な情報の任意の表現を含む。 As used herein, a value includes any known representation of a number, a state, a logical state, or a binary logical state. Often, logic levels, logic values, or logical values are also referred to as 1s and 0s, which simply represent binary logic states. For example, a 1 refers to a high logic level and a 0 refers to a low logic level. In one embodiment, a storage cell, such as a transistor or flash cell, may be capable of holding a single logical value or multiple logical values. However, other representations of values in computer systems have been used. For example, the decimal number ten may also be represented as the binary value 1010 and the hexadecimal letter A. Therefore, a value includes any representation of information capable of being held in a computer system.
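By way of illustration only, the equivalence of the decimal, binary, and hexadecimal representations mentioned above can be checked with a few lines of Python. The helper name `representations` is hypothetical and is not part of the disclosure.

```python
# Three spellings of the same value; each literal parses to the same integer.
decimal_ten = 10
binary_ten = 0b1010
hex_ten = 0xA


def representations(value: int) -> dict:
    """Return common textual representations of a non-negative integer.

    Hypothetical helper used only to illustrate the paragraph above.
    """
    return {
        "decimal": str(value),
        "binary": format(value, "b"),
        "hex": format(value, "X"),
    }
```

All three literals compare equal, and `representations(10)` yields the strings "10", "1010", and "A".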

更に、状態は、値又は値の一部分によって表すことができる。例として、論理１等の第１の値は、デフォルト状態又は初期状態を表すことができるのに対し、論理ゼロ等の第２の値は、非デフォルト状態を表すことができる。更に、リセット及びセットという用語は、１つの実施形態では、それぞれ、デフォルトの値又は状態及び更新された値又は状態を指す。例えば、デフォルト値は潜在的に高い論理値、すなわちリセットを含むのに対し、更新された値は潜在的に低い論理値、すなわちセットを含む。値の任意の組合せを利用して任意の数の状態を表すことができることに留意されたい。 Further, a state can be represented by a value or a portion of a value. By way of example, a first value such as logic 1 can represent a default or initial state, while a second value such as logic zero can represent a non-default state. Further, the terms reset and set refer to a default value or state and an updated value or state, respectively, in one embodiment. For example, the default value includes a potentially high logic value, ie reset, while the updated value includes a potentially low logic value, ie set. Note that any number of states can be represented using any combination of values.

上記で示した方法、ハードウェア、ソフトウェア、ファームウェア又はコードの実施形態は、処理要素によって実行可能な、機械アクセス可能、機械可読、コンピュータアクセス可能、又はコンピュータ可読媒体上に記憶された命令又はコードにより実施することができる。非一時的機械アクセス可能／可読媒体は、コンピュータ又は電子システム等の機械によって読出し可能な形態で情報を提供（すなわち、記憶及び／又は送信）する任意のメカニズムを含む。例えば、非一時的機械アクセス可能媒体は、スタティックＲＡＭ（ＳＲＡＭ）又はダイナミックＲＡＭ（ＤＲＡＭ）等のランダムアクセスメモリ（ＲＡＭ）、ＲＯＭ、磁気又は光ストレージ媒体、フラッシュメモリデバイス、電気ストレージデバイス、光ストレージデバイス、音響ストレージデバイス、一時的（伝播）信号（例えば、搬送波、赤外線信号、デジタル信号）から受信した情報を保持するための他の形態のストレージデバイス等を含む。これらは、そこから情報を受信することができる非一時的媒体と区別される。 Embodiments of the methods, hardware, software, firmware, or code set forth above may be implemented via instructions or code stored on a machine-accessible, machine-readable, computer-accessible, or computer-readable medium and executable by a processing element. A non-transitory machine-accessible/readable medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine, such as a computer or electronic system. For example, a non-transitory machine-accessible medium includes random access memory (RAM), such as static RAM (SRAM) or dynamic RAM (DRAM); ROM; magnetic or optical storage media; flash memory devices; electrical storage devices; optical storage devices; acoustic storage devices; and other forms of storage devices for holding information received from transitory (propagated) signals (e.g., carrier waves, infrared signals, digital signals). Such transitory signals are to be distinguished from the non-transitory media that may receive information from them.

本発明の実施形態を実行するためのロジックをプログラムするのに用いられる命令は、ＤＲＡＭ、キャッシュ、フラッシュメモリ又は他のストレージ等の、システムのメモリ内に記憶することができる。更に、命令は、ネットワークを介して、又は他のコンピュータ可読媒体によって分配することができる。このため、機械可読媒体は、機械（例えば、コンピュータ）によって読出し可能な形態の情報を記憶又は送信するための任意のメカニズムを含むことができ、任意のメカニズムは、限定ではないが、フロッピーディスケット、光ディスク、コンパクトディスク読出し専用メモリ（ＣＤ−ＲＯＭ）及び磁気−光ディスク、読出し専用メモリ（ＲＯＭ）、ランダムアクセスメモリ（ＲＡＭ）、消去可能プログラム可能読出し専用メモリ（ＥＰＲＯＭ）、電気的に消去可能なプログラム可能読出し専用メモリ（ＥＥＰＲＯＭ）、磁気又は光カード、フラッシュメモリ、又は、電気、光、音響若しくは他の形式の伝播信号（例えば、搬送波、赤外線信号、デジタル信号等）によりインターネットを介した情報の送信に使用される有形機械可読ストレージである。したがって、コンピュータ可読媒体は、機械（例えば、コンピュータ）によって読出し可能な形態で電子命令又は情報を記憶又は送信するのに適した任意のタイプの有形機械可読媒体を含む。 Instructions used to program logic to perform embodiments of the present invention may be stored within a memory in the system, such as DRAM, cache, flash memory, or other storage. Furthermore, the instructions can be distributed via a network or by way of other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including, but not limited to, floppy diskettes, optical disks, compact disc read-only memory (CD-ROM), magneto-optical disks, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic or optical cards, flash memory, or tangible machine-readable storage used in the transmission of information over the Internet via electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Accordingly, the computer-readable medium includes any type of tangible machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).

以下の実施例は、本明細書による実施形態に関する。１以上の実施形態は、物理リンクの１以上のデータレーン上でデータを受信し、物理リンクの別のレーン上で、有効なデータが１以上のデータレーン上で有効信号のアサートに続くことを特定する有効信号を受信し、物理リンクの別のレーン上で、１以上のデータレーン上のデータのタイプを特定するストリーム信号を受信するための、装置、システム、機械可読ストレージ、機械可読媒体、ハードウェア及び／又はソフトウェアベースのロジック、並びに方法を提供することができる。 The following examples pertain to embodiments in accordance with this specification. One or more embodiments may provide an apparatus, a system, machine-readable storage, a machine-readable medium, hardware- and/or software-based logic, and a method to receive data on one or more data lanes of a physical link; to receive, on another lane of the physical link, a valid signal identifying that valid data follows assertion of the valid signal on the one or more data lanes; and to receive, on another lane of the physical link, a stream signal identifying a type of the data on the one or more data lanes.

少なくとも１つの例では、物理層ロジックは更に、物理リンクの別のレーンを介してリンク状態機械管理信号を送信する。 In at least one example, the physical layer logic further transmits a link state machine management signal over another lane of the physical link.

少なくとも１つの例では、物理層ロジックは更に、サイドバンドリンクを介してサイドバンド信号を送信する。 In at least one example, the physical layer logic further transmits a sideband signal over the sideband link.

少なくとも１つの例では、タイプはデータに関連付けられたプロトコルを含み、プロトコルは、物理リンクを利用する複数のプロトコルのうちの１つである。 In at least one example, the type includes a protocol associated with the data, and the protocol is one of a plurality of protocols that utilize physical links.

少なくとも１つの例では、タイプはリンク層パケットデータを含む。 In at least one example, the type includes link layer packet data.

少なくとも１つの例では、データは物理リンクのためのリンク状態遷移を促進する。 In at least one example, the data facilitates link state transitions for the physical link.

少なくとも１つの例では、物理層ロジックは更に、ストリーム信号をデコードして、複数の異なるプロトコルのうちのいずれをデータに適用するかを特定する。 In at least one example, the physical layer logic further decodes the stream signal to identify which of a plurality of different protocols to apply to the data.

少なくとも１つの例では、物理層ロジックは更に、ストリーム信号において特定された複数のプロトコルのうちの特定の１つに対応する上位層プロトコルロジックにデータを渡す。 In at least one example, the physical layer logic further passes data to higher layer protocol logic corresponding to a particular one of the plurality of protocols identified in the stream signal.
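The decoding and hand-off described in the preceding examples can be sketched, purely for illustration, as a dispatch table keyed by the decoded stream value. The numeric stream encodings, the class name `PhysicalLayerRx`, and its methods are hypothetical; the specification does not give concrete encodings at this point.

```python
from typing import Callable, Dict

# Hypothetical stream encodings (assumptions; not taken from the specification).
STREAM_PCIE = 0x1
STREAM_IDI = 0x2
STREAM_LINK_LAYER = 0xF


class PhysicalLayerRx:
    """Toy receive-side model: decode the stream value, then pass the data
    to the upper-layer logic registered for that protocol."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[bytes], None]] = {}

    def register(self, stream_id: int, handler: Callable[[bytes], None]) -> None:
        # Associate a decoded stream value with one upper-layer handler.
        self._handlers[stream_id] = handler

    def deliver(self, stream_id: int, payload: bytes) -> bool:
        # Returns False when no upper layer is registered for the stream value.
        handler = self._handlers.get(stream_id)
        if handler is None:
            return False
        handler(payload)
        return True
```

In this sketch each protocol stack registers one handler, and an unrecognized stream value is reported to the caller rather than silently dropped.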

少なくとも１つの例では、装置は、物理層ロジックに加えて、複数のプロトコルの各々のリンク層ロジック及び他の上位層ロジックを含む。 In at least one example, the apparatus includes, in addition to the physical layer logic, link layer logic and other upper layer logic for each of the plurality of protocols.

少なくとも１つの例では、複数のプロトコルは、周辺コンポーネントインターコネクト（ＰＣＩ）、ＰＣＩＥｘｐｒｅｓｓ（ＰＣＩｅ）、Ｉｎｔｅｌのダイ内インターコネクト（ＩＤＩ）及びクイックパスインターコネクト（ＱＰＩ）のうちの少なくとも２つを含む。 In at least one example, the plurality of protocols includes at least two of Peripheral Component Interconnect (PCI), PCI Express (PCIe), Intel In-Die Interconnect (IDI), and QuickPath Interconnect (QPI).

少なくとも１つの例では、物理層ロジックは更に、複数のプロトコルの各々におけるエラーを判断する。 In at least one example, the physical layer logic further determines an error in each of the plurality of protocols.

少なくとも１つの例では、物理層ロジックは更に、有効信号及びストリーム信号のうちの１以上におけるエラーを判断する。 In at least one example, the physical layer logic further determines an error in one or more of the valid signal and the stream signal.

少なくとも１つの例では、物理層ロジックは更に、データレーン上で送信されるデータのデータウィンドウを定義し、データウィンドウは有効信号に対応する。 In at least one example, the physical layer logic further defines a data window of data transmitted on the data lane, the data window corresponding to a valid signal.

少なくとも１つの例では、データウィンドウはデータシンボルに対応し、有効信号は、データが送信されることになるウィンドウの直前のウィンドウにおいてアサートされる。 In at least one example, the data window corresponds to a data symbol and the valid signal is asserted in the window immediately preceding the window in which data is to be transmitted.

少なくとも１つの例では、データは、有効信号がアサートされない先行するウィンドウの直後のウィンドウ内のデータレーンにおいて無視される。 In at least one example, data on the data lanes is ignored in a window that immediately follows a preceding window in which the valid signal was not asserted.

少なくとも１つの例では、ウィンドウはバイト期間に対応する。 In at least one example, the window corresponds to a byte period.

少なくとも１つの例では、有効信号、データ及びストリーム信号の各々が物理リンクのために定義されたデータウィンドウに従ってアラインされる。 In at least one example, each of the valid signal, data, and stream signal is aligned according to a data window defined for the physical link.

少なくとも１つの例では、ストリーム信号は同じウィンドウ中にデータとして送信される。 In at least one example, the stream signal is transmitted during the same window as the data.
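As a non-limiting sketch of the window relationships described in the preceding examples, the following Python model treats the link as a sequence of windows. Data in a window is accepted only if the valid signal was asserted in the immediately preceding window, and the stream value is sampled in the same window as the data. The `Window` type and the `receive` function are hypothetical names introduced for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Window(NamedTuple):
    valid: bool   # level of the valid lane during this window
    stream: int   # value on the stream lane during this window
    data: int     # byte carried on the data lanes during this window


def receive(windows: Iterable[Window]) -> List[Tuple[int, int]]:
    """Return (stream, data) pairs for every window whose data is valid.

    Data is accepted only when valid was asserted in the preceding window;
    otherwise the data lanes are ignored for that window.
    """
    accepted: List[Tuple[int, int]] = []
    prev_valid = False
    for w in windows:
        if prev_valid:
            accepted.append((w.stream, w.data))
        prev_valid = w.valid
    return accepted
```

In this model a window that follows a deasserted valid signal is skipped regardless of what appears on the data lanes.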

少なくとも１つの例では、物理リンクは２つのデバイスをマルチチップパッケージにおいて接続する。 In at least one example, a physical link connects two devices in a multichip package.

少なくとも１つの例では、物理層ロジックは更に、物理リンクのレーンにおいて信号を再センタリングする。 In at least one example, the physical layer logic further re-centers the signal in the lane of the physical link.

少なくとも１つの例では、レーンは有効信号に基づいて再センタリングされる。 In at least one example, the lane is re-centered based on the valid signal.

少なくとも１つの例では、第２のウィンドウ中のデータリンクの専用ストリーム信号レーン上でストリーム信号が受信され、ストリーム信号がデコードされ、ストリーム信号のデコードから、データに関連付けられたプロトコルが決定される。 In at least one example, a stream signal is received on a dedicated stream signal lane of the data link in the second window, the stream signal is decoded, and from the decoding of the stream signal, the protocol associated with the data is determined.

少なくとも１つの例では、データリンクは、複数の異なるプロトコルのデータを送信するように適合される。 In at least one example, the data link is adapted to transmit data of a plurality of different protocols.

少なくとも１つの例では、データリンク上で送信されるデータが特定され、送信されるデータに対応する特定のウィンドウ中に、データリンクの発信有効信号レーン上で有効信号が送信され、特定のウィンドウの直後の別のウィンドウ中に専用発信データレーン上でデータが送信される。 In at least one example, data to be transmitted on a data link is identified, and during a particular window corresponding to the data to be transmitted, a valid signal is transmitted on the outgoing valid signal lane of the data link, Data is transmitted on a dedicated outgoing data lane in another window immediately after.

少なくとも１つの例では、複数のデータレーンは、専用リンク状態機械サイドバンドレーンを更に含む。 In at least one example, the plurality of data lanes further includes a dedicated link state machine sideband lane.

少なくとも１つの例では、第１のデバイスはパッケージ内の第１のダイを含み、第２のデバイスはパッケージ内の第２のダイを含む。 In at least one example, the first device includes a first die in the package and the second device includes a second die in the package.

少なくとも１つの例では、第１のデバイスはオンパッケージデバイスを含み、第２のデバイスはオフパッケージデバイスを含む。 In at least one example, the first device includes an on-package device and the second device includes an off-package device.

１以上の実施形態は、データリンクの専用データレーン上で送信されるデータを特定し、データリンク上で送信されるデータに対応する特定のウィンドウ中にデータリンクの専用有効信号レーン上で有効信号を送信し、特定のウィンドウの直後の別のウィンドウ中にデータリンクの専用データレーン上でデータを送信し、データのタイプを特定するようにエンコードされたストリーム信号リンク上でストリーム信号を送信するための、装置、システム、機械可読ストレージ、機械可読媒体、ハードウェア及び／又はソフトウェアベースのロジック、並びに方法を提供することができる。 One or more embodiments may provide an apparatus, a system, machine-readable storage, a machine-readable medium, hardware- and/or software-based logic, and a method to identify data to be sent on dedicated data lanes of a data link; to send a valid signal on a dedicated valid signal lane of the data link during a particular window corresponding to the data to be sent; to send the data on the dedicated data lanes during another window immediately following the particular window; and to send, on a stream signal link, a stream signal encoded to identify the type of the data.

少なくとも１つの例では、有効信号は、他のウィンドウ中のデータレーン上のデータが有効なデータであることを示す。 In at least one example, the valid signal indicates that the data on the data lane in the other window is valid data.

少なくとも１つの例では、ストリーム信号リンクは専用ストリーム信号リンクを含む。 In at least one example, the stream signal link includes a dedicated stream signal link.

少なくとも１つの例では、ストリーム信号は、特定のプロトコルがデータに関連付けられているか否かを特定するように適合される。 In at least one example, the stream signal is adapted to identify whether a particular protocol is associated with the data.

少なくとも１つの例では、物理層ロジックは共通物理層に含まれ、複数のプロトコルがこの共通物理層を利用し、特定のプロトコルが複数のプロトコルに含まれる。 In at least one example, physical layer logic is included in a common physical layer, multiple protocols utilize this common physical layer, and a particular protocol is included in multiple protocols.

少なくとも１つの例では、複数のプロトコルは、ＰＣＩ、ＰＣＩｅ、ＩＤＩ及びＱＰＩのうちの２つ以上を含む。 In at least one example, the plurality of protocols includes two or more of PCI, PCIe, IDI, and QPI.

少なくとも１つの例では、ストリーム信号は、データがリンク層パケットを含むか否かを特定するように更に適合される。 In at least one example, the stream signal is further adapted to identify whether the data includes a link layer packet.

少なくとも１つの例では、ストリーム信号は、データがサイドバンドデータであるか否かを特定するように更に適合される。 In at least one example, the stream signal is further adapted to identify whether the data is sideband data.

少なくとも１つの例では、物理層ロジックは更に、データタイプを決定し、決定されたタイプを特定するストリーム信号をエンコードする。 In at least one example, the physical layer logic further determines a data type and encodes a stream signal that identifies the determined type.

少なくとも１つの例では、物理層ロジックは更に、データリンクの専用ＬＳＭ＿ＳＢレーン上でリンク状態機械サイドバンド（ＬＳＭ＿ＳＢ）信号を送信する。 In at least one example, the physical layer logic further transmits a link state machine sideband (LSM_SB) signal on a dedicated LSM_SB lane of the data link.

少なくとも１つの例では、物理層ロジックは更に、データリンクと別個のサイドバンドリンク上でサイドバンド信号を送信する。 In at least one example, the physical layer logic further transmits sideband signals on a sideband link that is separate from the data link.

少なくとも１つの例では、物理層ロジックは更に、データレーン上でリンク層データを送信し、リンク層データは、第１のリンク状態から第２のリンク状態にデータリンクを遷移させるのに用いられる。 In at least one example, the physical layer logic further transmits link layer data on the data lane, and the link layer data is used to transition the data link from the first link state to the second link state.

少なくとも１つの例では、第１のリンク状態はアクティブリンク状態を含み、第２のリンク状態は低電力リンク状態を含む。 In at least one example, the first link state includes an active link state and the second link state includes a low power link state.

少なくとも１つの例では、物理層ロジックは更に、有効信号に対応する第１のデータウィンドウを特定し、第１のデータウィンドウの直後の第２のデータウィンドウ内のデータレーン上でデータを送信する。 In at least one example, the physical layer logic further identifies a first data window corresponding to the valid signal and transmits data on a data lane in a second data window immediately following the first data window.

少なくとも１つの例では、有効信号をアサートしないことは、直後のウィンドウ内のデータレーン上のデータが無効であるとして無視されることを示す。 In at least one example, not asserting a valid signal indicates that data on the data lane in the immediately following window is ignored as invalid.

少なくとも１つの例では、第１のデータウィンドウ及び第２のデータウィンドウの各々が、バイト期間に対応するように定義される。 In at least one example, each of the first data window and the second data window is defined to correspond to a byte period.

少なくとも１つの例では、有効信号、データ及びストリーム信号の各々が、物理リンクについて定義されたデータウィンドウに従ってアラインされる。 In at least one example, each of the valid signal, data, and stream signal is aligned according to a data window defined for the physical link.

少なくとも１つの例では、ストリーム信号は、データと同じウィンドウ中に送信される。 In at least one example, the stream signal is transmitted in the same window as the data.
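The transmit-side timing in the preceding examples can be sketched as follows, under stated assumptions. The function `schedule` is a hypothetical helper: it takes the payload planned for each data window (or `None` for an idle window) and emits one tuple per window, asserting valid one window ahead of each payload and placing the stream value in the same window as the data. Idle windows are filled with zeros here, which is an arbitrary choice made for the sketch.

```python
from typing import List, Optional, Tuple

Payload = Tuple[int, int]  # (stream value, data byte); an illustrative layout


def schedule(items: List[Optional[Payload]]) -> List[Tuple[bool, int, int]]:
    """Build per-window (valid, stream, data) tuples for the link.

    items[k] is what should appear in window k + 1. Window k asserts valid
    when items[k] is present, so valid always leads its data by one window.
    """
    windows: List[Tuple[bool, int, int]] = []
    for k in range(len(items) + 1):
        current = items[k - 1] if k >= 1 else None
        upcoming = items[k] if k < len(items) else None
        stream, data = current if current is not None else (0, 0)
        windows.append((upcoming is not None, stream, data))
    return windows
```

Scheduling N payload slots yields N + 1 windows, because the first window carries only the leading valid assertion.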

少なくとも１つの例では、物理層ロジックは更に、有効信号及びストリーム信号の各々を生成する。 In at least one example, the physical layer logic further generates each of a valid signal and a stream signal.

少なくとも１つの例では、データリンクは、マルチチップパッケージ内の２つのデバイスを接続する。 In at least one example, the data link connects two devices in a multi-chip package.

少なくとも１つの例では、データリンクは８Ｇｂ／ｓを超えるデータ速度をサポートする。 In at least one example, the data link supports data rates in excess of 8 Gb / s.

１以上の実施形態は、複数のデータレーン、１以上の有効信号レーン、１以上のストリームレーンを含む複数のレーンを含むデータリンク上でシングルエンドシグナリングを提供し、複数のレーンによって用いるためのクロック信号を分配するための、装置、システム、機械可読ストレージ、機械可読媒体、ハードウェア及び／又はソフトウェアベースのロジック、並びに方法を提供することができる。ここで、複数のレーンの各々において送信される信号は、クロック信号にアラインされる。 One or more embodiments provide single-ended signaling on a data link that includes multiple lanes, including multiple data lanes, one or more valid signal lanes, and one or more stream lanes, and a clock for use by the multiple lanes Devices, systems, machine readable storage, machine readable media, hardware and / or software based logic, and methods for distributing signals may be provided. Here, the signal transmitted in each of the plurality of lanes is aligned with the clock signal.

少なくとも１つの例では、データレーンの各々は、調整された電圧に中間レール終端される。 In at least one example, each of the data lanes is mid-rail terminated to a regulated voltage.

少なくとも１つの例では、調整された電圧は、単一の電圧レギュレータによって複数のデータレーンの各々に提供される。 In at least one example, the regulated voltage is provided to each of the plurality of data lanes by a single voltage regulator.

少なくとも１つの例では、調整された電圧は、実質的にＶｃｃ／２に等しく、ここで、Ｖｃｃは供給電圧を含む。 In at least one example, the regulated voltage is substantially equal to Vcc / 2, where Vcc includes the supply voltage.

少なくとも１つの例では、物理層ロジックは、複数のデータレーンのうちの２つ以上の間のクロストークキャンセルを提供することを試みる。 In at least one example, physical layer logic attempts to provide crosstalk cancellation between two or more of the plurality of data lanes.

少なくとも１つの例では、クロストークキャンセルは、２つ以上のデータレーンのうちの第１のものにおける重み付き高域通過フィルタリングされたアグレッサ信号を、２つ以上のデータレーンのうちの第２のものの信号に加えることによって提供される。 In at least one example, the crosstalk cancellation is provided by adding a weighted, high-pass filtered aggressor signal from a first of the two or more data lanes to the signal on a second of the two or more data lanes.

少なくとも１つの例では、物理層ロジックは、少なくとも部分的にレジスタ−キャパシタ（ＲＣ）低域通過フィルタを用いて重み付き高域通過フィルタリングされたアグレッサ信号を生成する。 In at least one example, the physical layer logic generates the weighted, high-pass filtered aggressor signal at least in part using a resistor-capacitor (RC) low-pass filter.
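A minimal numerical sketch of this idea follows, assuming a discrete-time first-order approximation of the RC low-pass filter; the actual circuit is analog and is not detailed in these examples. The high-pass component is obtained by subtracting the low-pass output from the aggressor signal, and the weighted result is added to the victim lane. The function names, the smoothing factor `alpha`, and the sign and magnitude of `weight` are assumptions introduced for illustration.

```python
from typing import List, Sequence


def rc_low_pass(x: Sequence[float], alpha: float) -> List[float]:
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha (0 < alpha <= 1) stands in for dt / (R*C + dt) in this model.
    """
    y: List[float] = []
    state = 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def high_pass_from_low_pass(x: Sequence[float], alpha: float) -> List[float]:
    # High-pass component = input minus its low-pass filtered version.
    return [s - l for s, l in zip(x, rc_low_pass(x, alpha))]


def cancel_crosstalk(victim: Sequence[float], aggressor: Sequence[float],
                     weight: float, alpha: float) -> List[float]:
    """Add the weighted, high-pass filtered aggressor to the victim lane."""
    hp = high_pass_from_low_pass(aggressor, alpha)
    return [v + weight * h for v, h in zip(victim, hp)]
```

With a coupling model in which the victim picks up `k` times the high-pass filtered aggressor, choosing `weight = -k` removes the coupled term in this idealized setting; a real link would need the weight to be tuned.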

少なくとも１つの例では、物理層ロジックは、ビットごとのデューティサイクル補正を提供する。 In at least one example, the physical layer logic provides bit-by-bit duty cycle correction.

少なくとも１つの例では、物理層ロジックは、データレーンのうちの少なくとも特定の１つにおけるスキューを検出し、この特定のデータレーンをデスキューする。 In at least one example, physical layer logic detects skew in at least a particular one of the data lanes and deskews this particular data lane.

少なくとも１つの例では、物理層ロジックは更に、データレーンのうちの少なくとも１つにＡＣデータバス反転（ＤＢＩ）を適用する。 In at least one example, the physical layer logic further applies AC data bus inversion (DBI) to at least one of the data lanes.
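For illustration only, the following sketch shows one commonly described form of AC data bus inversion: each word is compared with the word previously driven on the lanes, and it is sent inverted, with a flag set, when more than half of the lanes would otherwise toggle. The 8-bit width, the separate flag bit, and the function names are assumptions; the examples above do not specify how DBI is applied on this link.

```python
from typing import List, Sequence, Tuple


def popcount(x: int) -> int:
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def dbi_ac_encode(words: Sequence[int], width: int = 8) -> List[Tuple[int, int]]:
    """Return (driven_word, dbi_flag) pairs that limit lane toggling."""
    mask = (1 << width) - 1
    prev = 0  # assumed initial state of the lanes
    out: List[Tuple[int, int]] = []
    for w in words:
        w &= mask
        toggles = popcount(w ^ prev)
        if toggles > width // 2:
            sent, flag = (~w) & mask, 1  # invert to reduce transitions
        else:
            sent, flag = w, 0
        out.append((sent, flag))
        prev = sent
    return out


def dbi_ac_decode(encoded: Sequence[Tuple[int, int]], width: int = 8) -> List[int]:
    """Undo the inversion using the flag carried with each word."""
    mask = (1 << width) - 1
    return [((~s) & mask) if f else s for s, f in encoded]
```

In this model no more than `width // 2` data lanes toggle between consecutive words, at the cost of carrying one extra flag bit per word.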

少なくとも１つの例では、クロック信号は、半レート進められたクロック信号を含む。 In at least one example, the clock signal includes a half-rate forwarded clock signal.

少なくとも１つの例では、物理層ロジックは更に、静電放電保護を提供する。 In at least one example, the physical layer logic further provides electrostatic discharge protection.

少なくとも１つの例では、物理層ロジックは、少なくとも部分的にハードウェア回路を通じて実装される。 In at least one example, physical layer logic is implemented at least partially through hardware circuitry.

少なくとも１つの例では、有効信号は、有効信号レーン上で送信され、各有効信号は、有効なデータが複数のデータレーン上の有効信号のアサートに続くことを特定し、ストリーム信号は、ストリーム信号レーン上で送信され、各ストリーム信号は、１以上のデータレーン上のデータのタイプを特定する。 In at least one example, valid signals are transmitted on valid signal lanes, each valid signal specifies that valid data follows the assertion of valid signals on multiple data lanes, and the stream signal is a stream signal. Each stream signal transmitted on a lane identifies the type of data on one or more data lanes.

少なくとも１つの例では、データリンクは、８Ｇｂ／ｓを超えるデータ速度をサポートする。 In at least one example, the data link supports data rates in excess of 8 Gb / s.

１以上の実施形態は、複数のデータレーンを含む複数のレーン、１以上の有効信号レーン、１以上のストリームレーン、及び１以上のリンク状態機械サイドバンドレーンを含むデータリンク上でシングルエンドシグナリングを提供し、複数のレーンによって用いるためのクロック信号を分配するための、装置、システム、機械可読ストレージ、機械可読媒体、ハードウェア及び／又はソフトウェアベースのロジック、並びに方法を提供することができる。複数のレーンの各々において送信される信号はクロック信号とアラインされ、複数のデータレーンのうちの２つ以上の間でクロストークキャンセルを提供し、データリンクのためのビットごとのデューティサイクル補正を提供し、ここで、データレーンの各々は調整された電圧に中間レール終端される。 One or more embodiments may provide an apparatus, a system, machine-readable storage, a machine-readable medium, hardware- and/or software-based logic, and a method to provide single-ended signaling on a data link comprising a plurality of lanes, including a plurality of data lanes, one or more valid signal lanes, one or more stream lanes, and one or more link state machine sideband lanes; to distribute a clock signal for use by the plurality of lanes, where signals sent on each of the plurality of lanes are aligned with the clock signal; to provide crosstalk cancellation between two or more of the plurality of data lanes; and to provide per-bit duty cycle correction for the data link, where each of the data lanes is mid-rail terminated to a regulated voltage.

少なくとも１つの例では、物理層ロジックは更に、データレーンのうちの少なくとも１つのスキューを検出し、特定のデータレーンをデスキューする。 In at least one example, the physical layer logic further detects skew of at least one of the data lanes and deskews the particular data lane.

１以上の実施形態は、複数の階層化プロトコルの各々のそれぞれの上位層がアクティブリンク状態から低電力リンク状態へのデータリンクの遷移をリクエストしていることを特定し、複数の階層化プロトコルの各々の上位層が低電力リンク状態への遷移をリクエストしていることを特定することに基づいて、データリンクをアクティブリンク状態から低電力リンク状態に遷移させるための、装置、システム、機械可読ストレージ、機械可読媒体、ハードウェア及び／又はソフトウェアベースのロジック、並びに方法を提供することができる。複数の階層化プロトコルの各々は、データリンクを物理層として利用する。 One or more embodiments identify that each upper layer of each of the plurality of layered protocols requests a data link transition from an active link state to a low power link state, Apparatus, system, and machine readable storage for transitioning a data link from an active link state to a low power link state based on identifying that each upper layer is requesting a transition to a low power link state , Machine-readable media, hardware and / or software-based logic, and methods can be provided. Each of the plurality of layered protocols uses a data link as a physical layer.

少なくとも１つの例では、物理層ロジックは更に、別のデバイスとのハンドシェイクに参加し、データリンクに、低電力リンク状態に遷移させる。 In at least one example, the physical layer logic further participates in a handshake with another device to cause the data link to transition to the low power link state.

少なくとも１つの例では、ハンドシェイクはリンク層ハンドシェイクを含む。 In at least one example, the handshake includes a link layer handshake.

少なくとも１つの例では、物理層ロジックは、データリンクがアクティブリンク状態である間、リンク層ハンドシェイクにおいてリンク層データを送信する。 In at least one example, physical layer logic transmits link layer data in a link layer handshake while the data link is in an active link state.

少なくとも１つの例では、物理層ロジックは、リンク層データと実質的に同時にストリーム信号を送信し、データリンクのデータ層上で送信されるデータがリンク層パケットを含むことを特定する。 In at least one example, the physical layer logic transmits the stream signal substantially simultaneously with the link layer data and identifies that the data transmitted on the data layer of the data link includes a link layer packet.

少なくとも１つの例では、ストリーム信号は、特定のウィンドウ中にデータリンクの専用ストリーム信号レーン上で送信され、リンク層データも特定のウィンドウ中に送信される。 In at least one example, the stream signal is transmitted on a dedicated stream signal lane of the data link during a specific window, and link layer data is also transmitted during the specific window.

少なくとも１つの例では、物理層ロジックは、データリンクの専用有効信号レーン上で有効信号を送信し、有効信号は、特定のウィンドウの直前の別のウィンドウにおいて送信され、特定のウィンドウにおいて送信されるデータが有効であることを示す。 In at least one example, the physical layer logic transmits a valid signal on a dedicated valid signal lane of the data link, and the valid signal is transmitted in another window immediately before the specific window and transmitted in the specific window. Indicates that the data is valid.

少なくとも１つの例では、ハンドシェイクはサイドバンドリンクを介したハンドシェイク通信を含む。 In at least one example, the handshake includes handshake communication over a sideband link.

少なくとも１つの例では、ハンドシェイクは、リンク層ハンドシェイクと、サイドバンドリンクを通じたハンドシェイク通信とを含む。 In at least one example, the handshake includes a link layer handshake and handshake communication over a sideband link.

少なくとも１つの例では、サイドバンドリンクを介したハンドシェイク通信はリンク層ハンドシェイクを確認する。 In at least one example, handshake communication over the sideband link confirms the link layer handshake.

少なくとも１つの例では、物理層ロジックは更に、複数の階層化プロトコルのうちの第１の階層化プロトコルの上位層から、アクティブリンク状態から低電力リンク状態へのデータリンクの遷移のリクエストを特定する。 In at least one example, the physical layer logic further identifies a data link transition request from an active link state to a low power link state from an upper layer of the first layered protocol of the plurality of layered protocols. .

少なくとも１つの例では、物理層ロジックは更に、複数の階層化プロトコル内の他のプロトコル各々からリクエストが受信されるまで、データリンクがアクティブリンク状態から低電力リンク状態に遷移するのを待機する。 In at least one example, the physical layer logic further waits to transition the data link from the active link state to the low power link state until requests have been received from each of the other protocols in the plurality of layered protocols.

少なくとも１つの例では、物理層ロジックは、複数の階層化プロトコルの各々について、プロトコルがアクティブリンク状態から低電力リンク状態にデータリンクを遷移させることをリクエストしたか否かを追跡する。 In at least one example, the physical layer logic tracks, for each of the multiple layered protocols, whether the protocol has requested that the data link transition from an active link state to a low power link state.
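The tracking behavior described in the preceding examples can be sketched, under stated assumptions, as a small arbiter that records which protocols have requested low power and changes state only once all of them have. The class name `LowPowerArbiter`, the state strings, and the `wake` method are hypothetical, and the handshake with the remote device is deliberately omitted from this model.

```python
from typing import Dict, Iterable


class LowPowerArbiter:
    """Toy model: the link leaves ACTIVE only after every protocol that
    shares the physical layer has requested the low power state."""

    def __init__(self, protocols: Iterable[str]) -> None:
        # One flag per protocol, recording whether it has requested low power.
        self._requested: Dict[str, bool] = {name: False for name in protocols}
        self.state = "ACTIVE"

    def request_low_power(self, protocol: str) -> bool:
        """Record a request; return True once the link is in LOW_POWER."""
        if protocol not in self._requested:
            raise KeyError(protocol)
        self._requested[protocol] = True
        if self.state == "ACTIVE" and all(self._requested.values()):
            self.state = "LOW_POWER"
        return self.state == "LOW_POWER"

    def wake(self) -> None:
        # Return to ACTIVE and clear all outstanding requests.
        self.state = "ACTIVE"
        for name in self._requested:
            self._requested[name] = False
```

For example, with two protocols sharing the link, the first request leaves the link active and the second request completes the transition.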

少なくとも１つの例では、物理層ロジックは更に、アクティブ状態から低電力リンク状態へのデータリンクの実際の遷移の前に、アクティブリンク状態から低電力リンク状態へのデータリンクの遷移を確認する、リクエストに対する応答を生成する。 In at least one example, the physical layer logic further generates, before the data link actually transitions from the active link state to the low power link state, a response to the request that confirms the transition of the data link from the active link state to the low power link state.

少なくとも１つの例では、アクティブリンク状態から低電力リンク状態へのデータリンクの遷移の承認が、複数の階層化プロトコルにおける他のプロトコルのうちの１以上から未処理となっている間、応答が送信される。 In at least one example, the response is sent while approval of the transition of the data link from the active link state to the low power link state is still pending from one or more of the other protocols in the plurality of layered protocols.

少なくとも１つの例では、低電力リンク状態はアイドルリンク状態を含む。 In at least one example, the low power link state includes an idle link state.

少なくとも１つの例では、複数の階層化プロトコルは、ＰＣＩ、ＰＣＩｅ、ＩＤＩ及びＱＰＩのうちの１以上を含む。 In at least one example, the plurality of layered protocols include one or more of PCI, PCIe, IDI, and QPI.

データリンクを用いて第１のデバイスに通信可能に結合された、複数のレーン、第１のデバイス及び第２のデバイスを含むデータリンクを備えるシステムを提供することができ、第２のデバイスは、第１のプロトコルの上位層ロジックと、第２のプロトコルの上位層ロジックであって、複数のプロトコルスタックの各々が共通物理層を利用する、上位層ロジックと、共通物理層のための物理層ロジックであって、物理層ロジックは、データリンクを低電力リンク状態に遷移させる前に、第１のプロトコル及び第２のプロトコルを含むプロトコルの各々が、データリンクを利用して、アクティブリンク状態から低電力リンク状態にデータリンクを遷移させることを承認していることを判断する、物理層ロジックとを含む。 A system can be provided that includes a data link comprising a plurality of lanes, a first device, and a second device communicatively coupled to the first device using the data link. The second device includes upper layer logic of a first protocol; upper layer logic of a second protocol, where each of a plurality of protocol stacks utilizes a common physical layer; and physical layer logic for the common physical layer. Before transitioning the data link to a low power link state, the physical layer logic determines that each of the protocols utilizing the data link, including the first protocol and the second protocol, has approved transitioning the data link from an active link state to the low power link state.

少なくとも１つの例では、複数のレーンは、複数のデータレーンと、１以上の有効信号レーンと、１以上のストリームレーンとを含む。 In at least one example, the plurality of lanes includes a plurality of data lanes, one or more valid signal lanes, and one or more stream lanes.

少なくとも１つの例では、有効信号は、有効信号レーン上で送信され、各有効信号は、有効なデータが複数のデータレーンにおける有効信号のアサートに続くことを特定し、ストリーム信号は、ストリーム信号レーン上で送信され、各ストリーム信号は、１以上のデータレーン上のデータのタイプを特定する。 In at least one example, valid signals are transmitted on valid signal lanes, each valid signal specifying that valid data follows the assertion of valid signals in multiple data lanes, and the stream signal is a stream signal lane. Each stream signal transmitted above identifies the type of data on one or more data lanes.

In at least one example, transitioning the data link to the low power link state includes a handshake between the first device and the second device.

In at least one example, the handshake includes a link layer handshake and a sideband handshake.

In at least one example, the first device includes a first die in a package and the second device includes a second die in the package.

Throughout this specification, reference to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

In the foregoing specification, a detailed description has been given with reference to specific exemplary embodiments. It will, however, be evident that various modifications and changes may be made without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense. Furthermore, the foregoing use of "embodiment" and other exemplary language does not necessarily refer to the same embodiment or the same example, but may refer to different and distinct embodiments, as well as potentially the same embodiment.

Claims

An apparatus comprising a physical layer interface,
wherein the physical layer interface comprises:
a clock lane that supports a clock signal;
a control interface that supports one or more control signals, wherein the one or more control signals include a control signal that causes a transition in link state according to a state machine;
a plurality of data lanes for transmitting data; and
a valid signal lane that supports transmission of a valid signal, wherein transmission of data on the data lanes is aligned with transmission of the valid signal on the valid signal lane.

The apparatus of claim 1, further comprising physical layer logic for generating the valid signal.

The apparatus of claim 2, wherein the physical layer logic manages training of a link comprising the plurality of data lanes.

The apparatus of claim 3, wherein the link state includes a plurality of link training states, and the link training states are used in training the link.

The apparatus of claim 4, wherein a particular one of the plurality of link training states includes a low power state in which no data is transmitted on the link.

The apparatus according to any one of claims 3 to 5, wherein training of the link includes transmitting and receiving a training sequence on the link.

The apparatus according to claim 1, wherein the control signal includes a sideband control signal.

The apparatus according to claim 1, wherein the physical layer interface comprises a physical layer abstraction.

The apparatus according to any one of claims 1 to 8, further comprising a transmitter for transmitting the valid signal and for transmitting particular data on the plurality of data lanes aligned with the valid signal.

An apparatus comprising an interface,
wherein the interface comprises:
a clock signal lane for receiving a clock signal;
a control interface for receiving one or more control signals, the one or more control signals including a control signal that causes a transition in link state according to a state machine;
a plurality of data lanes for receiving data transmitted on a link by another device; and
a valid signal lane for receiving a valid signal corresponding to the data, wherein transmission of data on the data lanes is aligned with transmission of the valid signal on the valid signal lane.

The apparatus of claim 10, further comprising physical layer logic to receive the valid signal on the valid signal lane and to process the data received on the plurality of data lanes based on receipt of the valid signal.

The apparatus according to claim 10, wherein the control signal includes a sideband signal.

The apparatus according to claim 10, wherein the valid signal is aligned with an edge of the clock signal.

The apparatus according to claim 10, wherein the link state includes a link training state.

The apparatus of claim 14, wherein a training sequence is transmitted and received by the apparatus during one or more of the link training states.

The apparatus of claim 15, wherein the training sequence follows a defined interconnect protocol.

The apparatus of claim 16, wherein the defined interconnect protocol comprises one of a plurality of different interconnect protocols that support use of the interface.

A method comprising:
sending a clock signal on a dedicated clock lane of an interface;
transmitting a control signal on a control lane, wherein the control signal causes a transition in link state according to a state machine;
identifying data to be transmitted on a plurality of data lanes of the interface;
transmitting, on a dedicated valid lane of the interface, a valid signal corresponding to particular data; and
transmitting the particular data on the plurality of data lanes aligned with the valid signal.

An apparatus comprising means for performing the method of claim 18.

A system comprising:
a first computing device; and
a second computing device connected to the first computing device by a link, the second computing device comprising a physical layer interface that supports the link,
wherein the physical layer interface comprises:
a clock signal lane for receiving a clock signal;
a control interface for receiving one or more control signals, the one or more control signals including a control signal that causes a transition in link state according to a state machine;
a plurality of data lanes for receiving data transmitted on the link by another device; and
a valid signal lane for receiving a valid signal corresponding to the data, wherein transmission of data on the data lanes is aligned with transmission of the valid signal on the valid signal lane.

The system of claim 20, wherein the second computing device comprises a processor.

The system of claim 21, wherein the first computing device comprises a second processor.

The system of claim 20, wherein the second computing device comprises a memory controller.

The system of claim 20, wherein the second computing device comprises a graphics processor.

The system of claim 20, wherein the second computing device comprises a network controller.