JP6110560B2

JP6110560B2 - Method, apparatus and system for handling PCIe link failures

Info

Publication number: JP6110560B2
Application number: JP2016510920A
Authority: JP
Inventors: ▲閣▼ 杜
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2014-04-02
Filing date: 2014-04-02
Publication date: 2017-04-05
Anticipated expiration: 2034-04-02
Also published as: JP2016526311A; EP2961098A4; EP2961098B1; WO2015149293A1; CN104170322A; CN104170322B; US20150324268A1; EP2961098A1; US9785530B2

Description

The present invention relates to the field of data transmission technologies, and in particular to a method, an apparatus, and a system for handling a PCIe link failure.

In the field of data transmission technologies, the PCIe (Peripheral Component Interconnect Express) protocol is now widely used. When the PCIe protocol is applied to devices, data is transmitted between the devices in a point-to-point manner. Devices that transmit data by using the PCIe protocol are collectively referred to as PCIe devices. In a system, a link for communication between two PCIe devices can be implemented by using serializer/deserializer (serdes) circuits. When two PCIe devices transmit data, the data is carried over the serdes at a negotiated rate. The link between two PCIe devices may include 1, 2, 4, 8, 16, or 32 serdes. When there are multiple serdes, they are numbered sequentially with consecutive numbers in ascending order. One serdes is one lane of the link, and the serdes number is referred to as the lane number.

The bandwidth (W) of data transmission between two PCIe devices equals the product of the number of lanes (N) and the negotiated rate (S); that is, the bandwidth formula is W = N × S. The negotiated rate (S) of the link between two PCIe devices depends on the version of the PCIe protocol in use. There are currently four negotiated rates, each indicating the amount of data that one circuit can transmit per second: GEN1 (2.5 GT/s), GEN2 (5.0 GT/s), GEN3 (8.0 GT/s), and GEN4 (16.0 GT/s). In general, the negotiated rate (S) supported by the link of a PCIe device is fixed, so the only way to meet users' ever-increasing demand for communication bandwidth (W) is to increase the number of lanes (N) of the link.
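As a quick illustration of the relation W = N × S, the following Python sketch computes the aggregate raw transfer rate of a link from its lane count and generation. The names `RATE_GT_PER_S` and `link_bandwidth` are illustrative assumptions, not part of the patent, and encoding overhead is ignored, as it is in the formula itself.

```python
# Per-lane negotiated rates (S) in GT/s, as listed in the text.
RATE_GT_PER_S = {"GEN1": 2.5, "GEN2": 5.0, "GEN3": 8.0, "GEN4": 16.0}


def link_bandwidth(lanes: int, gen: str) -> float:
    """Return W = N x S in GT/s (hypothetical helper; ignores encoding overhead)."""
    return lanes * RATE_GT_PER_S[gen]


full = link_bandwidth(16, "GEN2")     # a x16 GEN2 link
degraded = link_bandwidth(2, "GEN2")  # the same link after falling back to x2
ratio = degraded / full               # fraction of the original bandwidth
```

With S fixed, the ratio depends only on the lane counts, which is why losing lanes translates directly into lost bandwidth.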

When two PCIe devices transmit data, all lanes of the link between them must be used simultaneously. If a failure occurs in any one lane, data transmission is interrupted. In the prior art, when a failure occurs in a lane connected to a PCIe device, the PCIe device performs link negotiation according to the renegotiation mechanism of the PCIe protocol. Negotiation starts from the lane with the smallest lane number and proceeds lane by lane in ascending order of lane numbers; this manner of renegotiating the link is referred to as upward negotiation. For example, suppose a failure occurs in lane 2 of a PCIe device that originally negotiated a GEN2 rate and a link width of 16 (PCIe 2.0 ×16). Following the renegotiation mechanism of the PCIe protocol, the PCIe device performs upward link negotiation starting from lane 0. Because lane 2 has failed, negotiation cannot proceed to lane 2, and link negotiation cannot continue once lane 0 and lane 1 have been negotiated. Negotiation therefore succeeds only for lane 0 and lane 1, and data transmission can continue only on those two lanes. Given the link widths that the PCIe protocol permits between two PCIe devices, the lane width obtained by renegotiation is 2; that is, only two lanes are available to the PCIe devices for transmitting data. N in the bandwidth formula thus drops from the original 16 to 2, and the data transmission performance between the PCIe devices is only 1/8 of the original. If the failure instead occurs in lane 1, only one lane can be negotiated by this renegotiation method, and the data transmission performance is only 1/16 of the original.
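The upward negotiation described above can be modeled in a few lines of Python. This is a minimal sketch rather than the PCIe link training state machine: the function name `upward_negotiate` and the representation of failures as a set of lane numbers are assumptions made for illustration, and the list of permitted widths is the one given in the text.

```python
# Link widths the text says a PCIe link may have.
ALLOWED_WIDTHS = (1, 2, 4, 8, 16, 32)


def upward_negotiate(total_lanes: int, failed: set[int]) -> int:
    """Walk lanes 0, 1, 2, ... and stop at the first failed lane,
    then round the count of good lanes down to a permitted link width."""
    good = 0
    for lane in range(total_lanes):
        if lane in failed:
            break  # negotiation cannot pass a failed lane
        good += 1
    usable = [w for w in ALLOWED_WIDTHS if w <= good]
    return max(usable) if usable else 0  # 0 models "no link could be formed"


width_lane2 = upward_negotiate(16, {2})  # failure in lane 2
width_lane1 = upward_negotiate(16, {1})  # failure in lane 1
```

The two calls reproduce the cases in the paragraph above: a failure in lane 2 leaves a width of 2, and a failure in lane 1 leaves a width of 1.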

In the prior art, when a failure occurs in a lane of the link between PCIe devices, renegotiation of the link's lane width is severely constrained by the lane number of the failed lane. The lane width obtained by renegotiation is therefore unpredictable and may be greatly reduced, which seriously degrades the data transmission performance of the link.

In view of this, the present invention provides a method, an apparatus, and a system for handling a PCIe link failure, to solve the prior-art problem that, because of the constraint imposed by the lane number of the failed lane, the lane width obtained by renegotiation is unpredictable and may be greatly reduced, seriously degrading the transmission performance of the link.

A first aspect of the present invention provides a method for handling a PCIe link failure, including:
detecting, by a PCIe device, that a failure has occurred in a lane of a link between the PCIe device and a downstream PCIe device, and sending a message signaled interrupt (MSI) message to a central processing unit (CPU), where the MSI message includes a device ID of the PCIe device;
negotiating, by the PCIe device, with the downstream PCIe device to obtain a current lane width value N;
obtaining, by the CPU according to the device ID in the received MSI message, a lane negotiation capability value M of the PCIe device and the current lane width value N from the PCIe device;
comparing, by the CPU, N with M/2;
if N < M/2, instructing, by the CPU, the PCIe device to perform lane reversal processing;
performing, by the PCIe device, the lane reversal processing on the link between the PCIe device and the downstream PCIe device; and
negotiating, by the PCIe device, with the downstream PCIe device to obtain a new current lane width value N', and continuing data transmission with the downstream PCIe device by using the N' lanes.
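The steps of the first aspect can be sketched end to end as follows. This is an illustrative model only. It assumes that lane reversal amounts to renumbering the lanes so that physical lane p becomes logical lane M - 1 - p, which is how lane reversal is commonly understood but is not spelled out in this part of the text; the function names are hypothetical.

```python
ALLOWED_WIDTHS = (1, 2, 4, 8, 16, 32)  # permitted link widths, per the text


def negotiate(m: int, failed_logical: set[int]) -> int:
    """Upward negotiation over logical lane numbers 0 .. m-1."""
    good = 0
    while good < m and good not in failed_logical:
        good += 1
    return max((w for w in ALLOWED_WIDTHS if w <= good), default=0)


def handle_link_failure(m: int, failed_physical: set[int]) -> int:
    """Return the lane width in use after the failure has been handled."""
    n = negotiate(m, failed_physical)  # current lane width value N
    if n >= m / 2:
        return n  # N is already at least M/2: keep using the N lanes
    # Assumed model of lane reversal: physical lane p becomes logical lane m-1-p.
    reversed_failed = {m - 1 - p for p in failed_physical}
    return negotiate(m, reversed_failed)  # new current lane width value N'


after_lane2_failure = handle_link_failure(16, {2})
after_lane12_failure = handle_link_failure(16, {12})
```

Under this model a failure in lane 2 of a 16-lane link first yields N = 2, and after reversal the failed lane sits at logical position 13, so renegotiation reaches a width of 8. A failure in lane 12 already yields N = 8, so no reversal is requested.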

In a first possible implementation manner of the first aspect, the method further includes: if N ≥ M/2, continuing, by the PCIe device, data transmission with the downstream PCIe device by using the N lanes obtained by negotiation.

With reference to the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner, if N < M/2, the method further includes: disabling, by the CPU, lane 0 through lane (M/2 - 1) of the link between the PCIe device and the downstream PCIe device.
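For concreteness, the range of lanes disabled in this implementation manner can be written out as below. The helper name `lanes_to_disable` is hypothetical, and the sketch assumes M is one of the even link widths named in the text.

```python
def lanes_to_disable(m: int) -> list[int]:
    """Lane numbers 0 .. (M/2 - 1), the range disabled when N < M/2."""
    return list(range(m // 2))


disabled_x16 = lanes_to_disable(16)  # lower half of a 16-lane link
```

For M = 16 this is lanes 0 through 7, the half of the link that contains the failed lane whenever N < M/2.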

A second aspect of the present invention provides a method for handling a PCIe link failure, including:
receiving a message signaled interrupt (MSI) message reported by a PCIe device, where the MSI message includes a device ID of the PCIe device;
obtaining, according to the device ID, a lane negotiation capability value M and a current lane width value N of the PCIe device from the PCIe device, where the current lane width value N is obtained by the PCIe device through negotiation with a downstream PCIe device;
comparing N with M/2; and
if N < M/2, instructing the PCIe device to perform lane reversal processing.

In a first possible implementation manner of the second aspect, if N < M/2, the method further includes: disabling lane 0 through lane (M/2 - 1) of the link between the PCIe device and the downstream PCIe device.

With reference to the second aspect or the first possible implementation manner of the second aspect, in a second possible implementation manner, the lane negotiation capability value M of the PCIe device equals the total number of lanes of the link between the PCIe device and the downstream PCIe device.

A third aspect of the present invention provides a method for handling a PCIe link failure, including:
detecting, by a PCIe device, that a failure has occurred in a lane of a link between the PCIe device and a downstream PCIe device, and sending a message signaled interrupt (MSI) message to a central processing unit (CPU), where the MSI message includes a device ID of the PCIe device;
negotiating with the downstream PCIe device to obtain a current lane width value N;
receiving an instruction, sent by the CPU, to perform lane reversal processing, and performing the lane reversal processing on the link between the PCIe device and the downstream PCIe device; and
negotiating with the downstream PCIe device to obtain a new current lane width value N', and continuing data transmission with the downstream PCIe device by using the N' lanes.

In a first possible implementation manner of the third aspect, if the PCIe device does not receive, within a predetermined time, the instruction sent by the CPU to perform lane reversal processing, the PCIe device continues data transmission with the downstream PCIe device by using the N lanes.
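The timeout behavior in this implementation manner can be sketched with a blocking queue standing in for the channel from the CPU to the PCIe device. Everything here is an assumption made for illustration: the message string `"LANE_REVERSAL"`, the function names, and the use of a callback to represent reversal followed by renegotiation. A real device would implement this in hardware or firmware.

```python
import queue


def await_reversal_instruction(inbox: "queue.Queue[str]", timeout_s: float) -> bool:
    """Return True if a lane-reversal instruction arrives within timeout_s seconds."""
    try:
        return inbox.get(timeout=timeout_s) == "LANE_REVERSAL"
    except queue.Empty:
        return False  # the predetermined time elapsed with no instruction


def lanes_after_wait(inbox, n: int, reverse_and_renegotiate, timeout_s: float = 0.1) -> int:
    """Decide which lane width the device continues with."""
    if await_reversal_instruction(inbox, timeout_s):
        return reverse_and_renegotiate()  # perform reversal, obtain N'
    return n  # no instruction in time: keep transmitting on the N lanes
```

A caller would pass the current width N and a function that performs the reversal and returns N'. If the queue stays empty for the whole timeout, the device simply keeps N.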

A fourth aspect of the present invention provides a system for handling a PCIe link failure. The system includes a central processing unit (CPU), a PCIe device, and a downstream PCIe device, where the CPU is connected to the PCIe device, and the PCIe device is connected to the downstream PCIe device by a link;
the PCIe device is configured to detect whether a failure has occurred in a lane of the link between the PCIe device and the downstream PCIe device and, when a failure occurs, to report a message signaled interrupt (MSI) message to the CPU, where the MSI message includes a device ID of the PCIe device; the PCIe device is further configured to negotiate with the downstream PCIe device to obtain a current lane width value N;
the CPU is configured to obtain, according to the device ID in the MSI message, a lane negotiation capability value M of the PCIe device and the current lane width value N from the PCIe device, to compare N with M/2, and, when N < M/2, to instruct the PCIe device to perform lane reversal processing; and
the PCIe device is further configured to, after receiving the instruction sent by the CPU to perform lane reversal processing, perform the lane reversal processing on the link between the PCIe device and the downstream PCIe device, and negotiate with the downstream PCIe device to obtain a new current lane width value N'.

In a first possible implementation manner of the fourth aspect, when N < M/2, the CPU is further configured to disable lane 0 through lane (M/2 - 1) of the link between the PCIe device and the downstream PCIe device.

A fifth aspect of the present invention provides a PCIe device for handling a PCIe link failure, including a detection module 5031, an MSI module 5033, a negotiation module 5035, a register 5037, and a lane reversal module 5039, where:
the register 5037 stores a current lane width value N and a lane negotiation capability value M of the PCIe device;
the detection module 5031 is configured to monitor the communication state of a link 504 between the PCIe device 503 and a downstream PCIe device and, on detecting that a failure has occurred in a lane of the link 504, to send a lane failure indication message to the MSI module 5033;
the MSI module 5033 is configured to send an MSI message to a central processing unit (CPU) 501 after receiving the lane failure indication message sent by the detection module 5031, where the MSI message includes a device ID of the PCIe device 503;
the negotiation module 5035 is configured to negotiate the lane width of the link between the PCIe device and the downstream PCIe device; and
the lane reversal module 5039 is configured to, after receiving an instruction sent by the CPU to perform lane reversal processing, perform the lane reversal processing on the lanes of the link between the PCIe device and the downstream PCIe device.
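To make the division of responsibilities among the modules easier to follow, here is a toy Python model of the device in the fifth aspect. The class and method names are hypothetical, the MSI path to the CPU is replaced by a plain list, and each method only records the effect the corresponding module is said to have. It is a structural sketch, not a description of any real hardware interface.

```python
from dataclasses import dataclass, field


@dataclass
class Register:  # stands in for register 5037
    n: int  # current lane width value N
    m: int  # lane negotiation capability value M


@dataclass
class PcieDevice:
    device_id: int
    register: Register
    sent_msi: list = field(default_factory=list)  # stands in for the path to the CPU
    lanes_reversed: bool = False

    def on_lane_failure(self) -> None:
        """Detection module 5031 notifies MSI module 5033, which reports the device ID."""
        self.sent_msi.append({"device_id": self.device_id})

    def negotiate(self, width: int) -> None:
        """Negotiation module 5035: record the width agreed with the downstream device."""
        self.register.n = width

    def on_reversal_instruction(self) -> None:
        """Lane reversal module 5039: act on the CPU's instruction."""
        self.lanes_reversed = True
```

The register is the only state the CPU needs to read, which matches the text: N and M are both stored there and are looked up by device ID.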

It can be seen from the foregoing technical solutions that, compared with the prior art, the present invention discloses a method, an apparatus, and a system for handling a PCIe link failure. By comparing the current lane width value N of a PCIe device with its lane negotiation capability value M, the method can ensure that the lane width of the PCIe device is kept at half of the lane negotiation capability value regardless of the lane number of the failed lane. The method, apparatus, and system disclosed in the present invention greatly relax the constraint that the lane number of the failed lane imposes on link renegotiation, so that an optimal link width can be achieved after renegotiation.

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. The accompanying drawings in the following description show merely some embodiments of the present invention, and a person skilled in the art may still derive other drawings from them.
FIG. 1 is a schematic diagram of connections between PCIe devices according to an embodiment of the present invention. FIG. 2 is a simplified schematic diagram of a connection between PCIe devices according to an embodiment of the present invention. FIG. 3 is a simplified schematic diagram of another connection between PCIe devices according to an embodiment of the present invention. FIG. 4 is a flowchart of a method for handling a link failure of a PCIe device according to an embodiment of the present invention. FIG. 5 is a schematic structural diagram of a PCIe device according to an embodiment of the present invention.

The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. The described embodiments are merely some, not all, of the embodiments of the present invention.

The present invention provides a novel method for handling a link failure of a PCIe (Peripheral Component Interconnect Express) device. Devices that transmit data by using the PCIe protocol are collectively referred to as PCIe devices. A PCIe device may be a chip integrated into a standalone device, or may be a physically independent device; this is not limited here. For the connection relationships between PCIe devices, refer to FIG. 1. As shown in FIG. 1, two PCIe devices may be directly connected to transmit data; for example, a downstream port of a root complex device is directly connected to an upstream port of PCIe device 1 to implement data transmission between the root complex and PCIe device 1. Data may also be transmitted between two PCIe devices indirectly, for example through a switch: a downstream port of the root complex is connected to an upstream port of the switch, and a downstream port of the switch is connected to an upstream port of PCIe device 2, to implement data transmission between the root complex and PCIe device 2. One PCIe device may be connected to multiple PCIe devices at the same time; as shown in FIG. 1, the root complex is connected to both PCIe device 1 and the switch. These PCIe devices may be managed centrally by a central processing unit (CPU).

Data transmission between two PCIe devices is implemented over a link, and the link may consist of serializer/deserializer (serdes) circuits. For example, the downstream port of the root complex is connected to the upstream port of PCIe device 1 by multiple serdes, and these serdes form the link between the root complex and PCIe device 1. There may be 1, 2, 4, 8, 16, or 32 serdes between two PCIe devices, and the serdes between two PCIe devices form the link between them. The bandwidth (W) of data transmission between two PCIe devices equals the product of the number of lanes (N) and the negotiated rate (S); that is, the bandwidth formula is W = N × S. The negotiated rate (S) of the link between two PCIe devices depends on the version of the PCIe protocol in use. There are currently four negotiated rates, each indicating the amount of data that one circuit can transmit per second: GEN1 (2.5 GT/s), GEN2 (5.0 GT/s), GEN3 (8.0 GT/s), and GEN4 (16.0 GT/s). In general, the negotiated rate (S) supported by the link of a PCIe device is fixed, so the only way to meet users' ever-increasing demand for communication bandwidth (W) is to increase the number of lanes (N) of the link.

When there are multiple serdes, they are numbered sequentially with consecutive numbers in ascending order. Each serdes is one lane of the link between the two PCIe devices, and the serdes number is referred to as the lane number. For example, the lanes between the root complex and PCIe device 1 shown in FIG. 1 are numbered sequentially from left to right with the numbers 0 to 15. The serdes described here use the existing serdes structure and support its existing functions, which are not described again here.

When two PCIe devices are connected, the PCIe device that initiates link negotiation is referred to as the upstream PCIe device, and the device connected to it is referred to as the downstream PCIe device. As shown in FIG. 1, when the root complex and PCIe device 1 are connected, the root complex is the upstream PCIe device and PCIe device 1 is the downstream PCIe device. When the root complex and the switch are connected, the root complex is the upstream PCIe device and the switch is the downstream PCIe device. The switch is further connected to PCIe device 2; in this case, the switch is the upstream PCIe device and PCIe device 2 is the downstream PCIe device. When one PCIe device is connected to multiple PCIe devices at the same time, it transmits data to each of the other PCIe devices over a separate link. For example, a PCIe switch is connected to both PCIe device 2 and PCIe device 3; data transmission between the PCIe switch and PCIe device 2 is implemented over link 1, and data transmission between the switch and PCIe device 3 is implemented over link 2. Link 1 and link 2 exist independently and do not affect each other, and their lane counts may be the same or different. Data is nevertheless transmitted over each link in the same manner. The following description uses a specific example of the method for handling a failure on the link between two PCIe devices.

To make the overall process clearer, the connection relationship between two PCIe devices is simplified as shown in FIG. 2. The first PCIe device is the upstream PCIe device and performs lane negotiation with the second PCIe device, which is the downstream PCIe device. A central processing unit (CPU) manages the two PCIe devices. The link between the first PCIe device and the second PCIe device includes 16 serdes, each of which is one lane of the link, and the 16 lanes are numbered consecutively from left to right from 0 to 15. As shown in FIG. 2, the lane numbered 0 is referred to as lane 0, the lane numbered 1 as lane 1, and so on up to lane 15. This lane count is used for illustration only. In practice, 1, 2, 4, 8, or 32 lanes may be configured as required, and the implementation principle is the same as for 16 lanes.

A message signaled interrupt (MSI) function is configured on the first PCIe device. Upon detecting that a failure has occurred in one or more lanes of the link connected between the first PCIe device and the second PCIe device, the first PCIe device reports an MSI message to the CPU, where the MSI message includes the device ID of the first PCIe device. The first PCIe device stores the lane number of the failed lane in a register. The first PCIe device performs lane renegotiation with the second PCIe device according to the renegotiation mechanism of the PCIe protocol, in order to obtain the current lane width value N of the link between the first PCIe device and the second PCIe device. During lane renegotiation, according to the lane negotiation mechanism specified in the PCIe protocol, the first PCIe device performs link negotiation with the second PCIe device lane by lane, starting from the lane with the lowest lane number and proceeding in ascending order of lane number, until the negotiation reaches the failed lane. Because negotiation on the failed lane cannot succeed, the renegotiation process ends there. The first PCIe device then determines the current lane width value N of the link according to the lane counts that the PCIe protocol permits for a link between two PCIe devices. Because the lane numbers must be contiguous and link negotiation starts from the lowest-numbered lane, the number of lanes obtained by renegotiation depends on the lane number of the failed lane and is therefore uncertain. When the lane number of the failed lane is relatively small, the number of lanes obtained by renegotiation is greatly reduced, that is, the current lane width value N is greatly reduced, which seriously affects the performance of data transmission between the two PCIe devices.

For example, if a failure occurs in lane number 3 between the first PCIe device and the second PCIe device, the first PCIe device performs link negotiation with the second PCIe device starting from lane number 0, according to the negotiation mechanism of the PCIe protocol. No failure has occurred in lane number 0, so the negotiation succeeds and proceeds to lane number 1; the negotiation on lane number 1 succeeds and proceeds to lane number 2; the negotiation on lane number 2 succeeds and proceeds to lane number 3. Because a failure has occurred in lane number 3, the negotiation on that lane cannot succeed. Furthermore, because the lane numbers must be contiguous, the negotiation cannot continue once it has been interrupted at lane number 3. In this case, only three lanes are available: lane number 0, lane number 1 and lane number 2. Since a link between two PCIe devices may contain 1, 2, 4, 8, 16 or 32 lanes according to the PCIe protocol, the current lane width value N obtained by the negotiation between the first PCIe device and the second PCIe device is 2. Initially there were 16 lanes between the first PCIe device and the second PCIe device available for data transmission, but after the failure in lane number 3 only two lanes are obtained by renegotiation and available for data transmission, which is 1/8 of the original number. The lane width of the link between the first PCIe device and the second PCIe device is thus greatly reduced, and the data transmission performance is clearly degraded. Similarly, if a failure occurs in lane number 1 between the first PCIe device and the second PCIe device, only lane number 0 is available. In this case, the current lane width value N obtained by renegotiation between the first PCIe device and the second PCIe device is 1, which is 1/16 of the original width, and the data transmission performance is greatly degraded.
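The renegotiation rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the patent's implementation or real PCIe link training: the names `VALID_WIDTHS` and `negotiated_width` are hypothetical, and a return value of 0 is an assumed convention for "no usable lane".

```python
# Lane counts that the description says a PCIe link may contain.
VALID_WIDTHS = (1, 2, 4, 8, 16, 32)


def negotiated_width(total_lanes, failed_lanes):
    """Model of renegotiation: walk lanes upward from lane 0 and stop
    at the first failed lane, then round down to a permitted width."""
    failed = set(failed_lanes)
    good = 0
    for lane in range(total_lanes):
        if lane in failed:
            break  # lane numbers must be contiguous, so negotiation stops here
        good += 1
    # Largest permitted width that fits in the contiguous good lanes
    # (0 is an assumed marker meaning that not even lane 0 trained).
    return max((w for w in VALID_WIDTHS if w <= good), default=0)


print(negotiated_width(16, [3]))  # 2
print(negotiated_width(16, [1]))  # 1
```

The two printed values match the examples in the text: a failure in lane number 3 of a 16-lane link leaves N = 2, and a failure in lane number 1 leaves N = 1.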

In this embodiment of the present invention, after receiving the MSI message reported by the first PCIe device, the CPU enters an interrupt handling process. According to the device ID of the first PCIe device carried in the MSI message, the CPU obtains from the first PCIe device its current lane width value N and its lane negotiation capability value M. The lane negotiation capability value M of the first PCIe device represents the maximum lane width value that the first PCIe device can obtain by negotiating with the second PCIe device. This maximum lane width value is the total number of lanes of the link between the first PCIe device and the second PCIe device. In this embodiment, the link between the first PCIe device and the second PCIe device has 16 lanes; therefore, the maximum lane width value that the first PCIe device can obtain by negotiating with the second PCIe device is 16, and the lane negotiation capability value M of the first PCIe device is 16. The current lane width value N of the first PCIe device is the value obtained when the first PCIe device renegotiates with the second PCIe device. For example, as described above, when 16 lanes connect the first PCIe device and the second PCIe device and a failure occurs in lane number 3, the first PCIe device renegotiates with the second PCIe device according to the renegotiation mechanism of the PCIe protocol and obtains a current lane width value N of 2.

The CPU compares the obtained current lane width value N of the first PCIe device with its lane negotiation capability value M. In the present invention, the CPU compares N with M/2, in accordance with the rule governing the number of lanes of a link between two PCIe devices.

When N ≥ M/2, the CPU takes no action. If the first PCIe device does not receive an instruction from the CPU within a predetermined time, it continues data transmission with the second PCIe device by using the N lanes obtained by renegotiation. In this case, N ≥ M/2 indicates that the lane number of the failed lane is greater than (M/2 − 1). Because a link between two PCIe devices may contain 1, 2, 4, 8, 16 or 32 lanes, the current lane width value N obtained when the first PCIe device renegotiates with the second PCIe device is M/2, and the first PCIe device continues data transmission with the second PCIe device by using the link lanes obtained by renegotiation.
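The comparison performed by the CPU can be illustrated with a minimal sketch. The function name `cpu_action` and the string labels are hypothetical; the description only states that the CPU does nothing when N ≥ M/2 and otherwise instructs the device to perform lane reversal.

```python
def cpu_action(current_width_n, capability_m):
    """Return the CPU's reaction to a reported lane failure.

    "keep"    -> N >= M/2: no instruction is sent, the device keeps its N lanes.
    "reverse" -> N <  M/2: the device is told to perform lane reversal.
    """
    # Compare 2*N with M to stay in integer arithmetic.
    if 2 * current_width_n >= capability_m:
        return "keep"
    return "reverse"


print(cpu_action(8, 16))  # keep
print(cpu_action(2, 16))  # reverse
```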

When N < M/2, the CPU instructs the first PCIe device to perform lane reversal processing on the link between the first PCIe device and the second PCIe device. When N < M/2, the current lane width obtained when the first PCIe device renegotiates with the second PCIe device is less than half of the total link width, which indicates that the lane number of the failed lane is less than M/2. Lane reversal processing means that the first PCIe device renumbers the lanes of the link between the first PCIe device and the second PCIe device in the opposite direction. If the lanes of the link were originally numbered from 0 to 15 from left to right, then after the first PCIe device performs lane reversal processing the lanes are numbered from 0 to 15 from right to left, and vice versa. For example, in this embodiment, after the first PCIe device performs lane reversal processing, original lane number 15 becomes lane number 0, original lane number 14 becomes lane number 1, and so on.

Before instructing the first PCIe device to perform lane reversal processing, the CPU may further disable lane number 0 through lane number (M/2 − 1) between the first PCIe device and the second PCIe device. Once lane number 0 through lane number (M/2 − 1) have been disabled, the first PCIe device no longer needs to number the disabled lanes. In this case, the first PCIe device changes original lane number (M − 1) to lane number 0, original lane number (M − 2) to lane number 1, and so on, and ends the lane reversal processing once original lane number M/2 has been changed to lane number (M/2 − 1).
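The renumbering performed by lane reversal can be written as a simple mapping from original lane number to new lane number. The sketch below is illustrative only; `reversed_lane_map` and its `disable_lower_half` flag are hypothetical names, and lanes are assumed to be numbered 0 through M − 1.

```python
def reversed_lane_map(total_lanes, disable_lower_half=False):
    """Map each original lane number to its lane number after reversal.

    With disable_lower_half=True, original lanes 0 .. M/2-1 are treated as
    disabled and receive no new number, so only M/2 lanes are renumbered.
    """
    first_kept = total_lanes // 2 if disable_lower_half else 0
    return {orig: total_lanes - 1 - orig for orig in range(first_kept, total_lanes)}


full = reversed_lane_map(16)
print(full[15], full[14], full[0])  # 0 1 15

upper_only = reversed_lane_map(16, disable_lower_half=True)
print(upper_only[15], upper_only[8], len(upper_only))  # 0 7 8
```

With the lower half disabled, original lane number 8 ends up as lane number 7, which is where the renumbering stops.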

After performing lane reversal processing, the first PCIe device renegotiates the lane width with the second PCIe device in order to obtain a new current lane width value N′. The method by which the first PCIe device renegotiates the current lane width with the second PCIe device is the same as the renegotiation method described above. As required by the PCIe protocol, the first PCIe device starts the negotiation from the lowest-numbered lane after reversal and performs link negotiation with the second PCIe device lane by lane in ascending order of lane number. That is, the first PCIe device starts with the reversed lane number 0 and then negotiates lane number 1, lane number 2 and so on in sequence, until lane number (M/2 − 1) has been negotiated. The first PCIe device performs lane reversal processing only when N < M/2. In that case the lane number of the failed lane is less than M/2, that is, the failed lane is one of lane number 0 through lane number (M/2 − 1). Lane number 0 through lane number (M/2 − 1) are disabled, whereas no failure has occurred in original lane number M/2 through original lane number (M − 1), so data transmission can be performed on them. Therefore, after the first PCIe device performs lane reversal, no failure occurs in the new lane number 0 through lane number (M/2 − 1), and the new current lane width value N′ obtained when the first PCIe device renegotiates with the second PCIe device is M/2.
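The renegotiation after reversal can be modelled in the same way as the first one, except that new lane number k now corresponds to original lane number M − 1 − k and only M/2 lanes are numbered. The sketch below is a hypothetical model (the name `width_after_reversal` is not from the source) and also shows that the result N′ = M/2 relies on the stated premise that no failure exists in the original upper half.

```python
VALID_WIDTHS = (1, 2, 4, 8, 16, 32)


def width_after_reversal(total_lanes, failed_original_lanes):
    """Model N' after the lower half is disabled and the lanes are reversed."""
    failed = set(failed_original_lanes)
    good = 0
    for new_lane in range(total_lanes // 2):
        original_lane = total_lanes - 1 - new_lane  # new 0 -> original M-1
        if original_lane in failed:
            break
        good += 1
    return max((w for w in VALID_WIDTHS if w <= good), default=0)


print(width_after_reversal(16, [3]))      # 8
print(width_after_reversal(16, [1]))      # 8
# If a second failure also existed in the upper half, N' would be smaller:
print(width_after_reversal(16, [3, 12]))  # 2
```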

When a failure occurs in a lane of a PCIe device, using the solution of the present invention ensures that the lane width obtained by renegotiation is 1/2 of the original lane width regardless of the lane number of the failed lane. This avoids the situation in which, when the lane number of the failed lane is relatively small, the lane width of the PCIe device is greatly reduced and the data transmission rate between the PCIe devices is affected.

The technical solution of the present invention is described and explained in detail below by using a specific example. The connection relationship between the PCIe devices is shown in FIG. 3; for the procedure of the method, see FIG. 4.

A PCIe device described below refers to a device that communicates with other devices by using the PCIe protocol. As shown in FIG. 3, the root complex represents a PCIe device that communicates with a downstream PCIe device by using the PCIe protocol, that is, an upstream PCIe device such as the first PCIe device shown in FIG. 2. PCIe device 1 (PCIe apparatus 1) represents a PCIe device that communicates with an upstream PCIe device by using the PCIe protocol, that is, a downstream PCIe device such as the second PCIe device shown in FIG. 2. Further, a PCIe device may be a physically independent device or may be a chip integrated into a device, and is not limited to the PCIe devices shown in FIG. 1.

The root complex is connected through its downstream port to the upstream port of PCIe device 1 in order to implement data transmission. This embodiment is described by using an example in which there are 16 serdes between the root complex and PCIe device 1. These 16 serdes form the link between the root complex and PCIe device 1, and each serdes is referred to as one lane. That is, the number of lanes between the root complex and PCIe device 1 is 16 (x16). The 16 lanes are numbered consecutively from 0 to 15 from left to right. In this case, the maximum lane width value between the root complex and PCIe device 1 is 16. The lane negotiation capability value M of the root complex is equal to the maximum lane width value of the root complex, which represents the maximum number of lanes between the root complex and PCIe device 1 that the root complex can obtain by negotiation. In this embodiment, the lane negotiation capability value M of the root complex is equal to 16. The root complex stores its lane negotiation capability value M in a register. As specified by the PCIe protocol, a link between two PCIe devices may contain 1, 2, 4, 8, 16 or 32 lanes. This embodiment is described by using 16 lanes as an example; when the link has 1, 2, 4, 8 or 32 lanes, the implementation is the same as for 16 lanes.

The root complex may also be connected to other PCIe devices. When the root complex is connected to multiple PCIe devices, as shown in FIG. 1, the root complex may further be connected to a switch, and the lane negotiation capability values between the root complex and the respective PCIe devices may be recorded separately as M1, M2 and M3 to distinguish them from one another. The following method is described by using an example in which the root complex is connected to one PCIe device. When the root complex is connected to multiple PCIe devices, the method for handling a failure in a lane between the root complex and any one of the PCIe devices is the same as in this embodiment.

A message signaled interrupt (MSI) function is configured on the root complex. The root complex monitors the lanes connected to PCIe device 1, and upon detecting that the lane width between the root complex and PCIe device 1 has changed, the root complex reports an MSI message to the CPU. When the root complex performs data transmission with PCIe device 1, all lanes (that is, all serdes) connected between the two devices are used. If a failure occurs in any one or more of these lanes, the root complex detects that the lane width between the root complex and PCIe device 1 has changed, generates an MSI message, and reports the MSI message to the CPU. The MSI message includes the device ID of the root complex, such as the B:D:F (bus number:device number:function number) of the port of the root complex. The root complex stores the lane number of the failed lane in a register. In this embodiment, a failure occurs in lane number 5 between the root complex and PCIe device 1; the root complex detects that a failure has occurred in lane number 5 and stores the lane number information of the failed lane in a register.
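As a small illustration of a B:D:F device ID, the sketch below packs and unpacks the three fields using the conventional PCI layout of an 8-bit bus number, a 5-bit device number and a 3-bit function number in one 16-bit value. The description only says that the device ID may be a B:D:F; the exact encoding carried in the MSI message, and the helper names `pack_bdf`, `unpack_bdf` and `format_bdf`, are assumptions made for this example.

```python
def pack_bdf(bus, device, function):
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("B:D:F field out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(device_id):
    """Split a 16-bit ID back into (bus, device, function)."""
    return (device_id >> 8) & 0xFF, (device_id >> 3) & 0x1F, device_id & 0x7


def format_bdf(device_id):
    """Render the ID in the common bus:device.function hexadecimal notation."""
    bus, device, function = unpack_bdf(device_id)
    return f"{bus:02x}:{device:02x}.{function}"


root_port_id = pack_bdf(0, 3, 0)  # hypothetical root port at 00:03.0
print(hex(root_port_id))          # 0x18
print(format_bdf(root_port_id))   # 00:03.0
```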

After reporting the MSI message to the CPU, the root complex renegotiates with PCIe device 1 in order to obtain the current lane width value N. According to the PCIe protocol, lane negotiation of a link between PCIe devices starts from the lowest-numbered lane and continues until it reaches a lane in which a failure has occurred. In this way, the lane numbers of the successfully negotiated lanes remain contiguous, and the number of lanes available for data transmission can be obtained. In this embodiment, the root complex negotiates with PCIe device 1 starting from lane number 0; after the negotiation on lane number 0 succeeds (that is, lane number 0 is normal and can be used to transmit data), negotiation is performed on lane number 1, and so on. When the negotiation reaches lane number 5, it fails because a failure has occurred in lane number 5, and the root complex stops negotiating with PCIe device 1. The negotiation on lane number 0 through lane number 4 between the root complex and PCIe device 1 has succeeded, that is, five lanes between the root complex and PCIe device 1 have been negotiated successfully. Because the PCIe protocol specifies that a link between two PCIe devices may contain 1, 2, 4, 8, 16 or 32 lanes, the current lane width value N between the root complex and PCIe device 1 is determined to be 4. The root complex stores the current lane width value N obtained by renegotiation in a register.

After receiving the MSI message reported by the root complex, the CPU enters an interrupt handling process. The CPU extracts the ID of the root complex from the MSI message and then, according to the obtained ID, reads the current lane width value N and the lane negotiation capability value M of the root complex from the registers of the root complex. The current lane width value N of the root complex is obtained by renegotiating with PCIe device 1 after the root complex detects that a failure has occurred in a lane between the root complex and PCIe device 1. In this embodiment, after a failure occurs in serdes lane number 5 between the root complex and PCIe device 1, the current lane width value N obtained by renegotiation is 4. The specific negotiation method is described in the preceding paragraph and is not repeated here. Further, as described above, there are 16 lanes between the root complex and PCIe device 1 in this embodiment, so the lane negotiation capability value M of the root complex is 16.
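The register lookup in the interrupt handling process can be pictured with a toy model. Everything here is hypothetical scaffolding for illustration: the `PortRegisters` record, the `register_file` dictionary keyed by device ID, and the ID value 0x0018 are not taken from the description, which does not specify register layouts or how the CPU addresses them.

```python
from dataclasses import dataclass


@dataclass
class PortRegisters:
    capability_m: int  # lane negotiation capability value M
    current_n: int     # current lane width value N after renegotiation
    failed_lane: int   # lane number recorded when the failure was detected


def handle_msi(device_id, register_file):
    """Look up the reporting device by the ID in the MSI and read N and M."""
    regs = register_file[device_id]
    return regs.current_n, regs.capability_m


# State of the embodiment: 16-lane link, failure in lane number 5, N = 4.
register_file = {0x0018: PortRegisters(capability_m=16, current_n=4, failed_lane=5)}
n, m = handle_msi(0x0018, register_file)
print(n, m)  # 4 16
```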

The CPU compares the current lane width value N of the root complex with the lane negotiation capability value M of the root complex. In the present invention, the CPU compares N with M/2, in accordance with the rule governing the number of lanes of a link between two PCIe devices.

When N ≥ M/2, the current lane width value obtained when the root complex negotiates with PCIe device 1 is half of the maximum lane width value, and in this case the CPU takes no action. If the root complex does not receive an instruction from the CPU within a predetermined time, the root complex continues data transmission with PCIe device 1 by using the N link lanes obtained by renegotiation.

When N < M/2, the current lane width value obtained when the root complex negotiates with PCIe device 1 is less than half of the maximum lane width value, and the width of the link between the root complex and PCIe device 1 is greatly reduced. In this case, the CPU disables lane number 0 through lane number (M/2 − 1) between the root complex and PCIe device 1 and instructs the root complex to start lane reversal processing. In lane reversal processing, a PCIe device renumbers, in the opposite direction, the lanes of the link between itself and the downstream PCIe device. After starting lane reversal processing, the root complex renumbers the lanes between the root complex and PCIe device 1 in the opposite direction. The lanes between the root complex and PCIe device 1 were originally numbered consecutively in ascending order from left to right; after starting lane reversal processing, the root complex numbers them consecutively in ascending order from right to left. The numbering procedure is described above and is not repeated here. Because original lane number 0 through lane number (M/2 − 1) have been disabled, the root complex does not number the disabled lanes after starting lane reversal processing. The root complex therefore numbers the lanes between the root complex and PCIe device 1 consecutively in ascending order from right to left, starting from 0 and ending at (M/2 − 1).

After completing lane reversal processing, the root complex performs lane width negotiation with PCIe device 1 in order to obtain a new current lane width value N′ of the link between the root complex and PCIe device 1. According to the rules of the PCIe protocol, negotiation between PCIe devices starts from the lowest-numbered lane and continues until it reaches a lane in which a failure has occurred. After completing lane reversal processing, the root complex performs negotiation starting from the new lane number 0, continuing until it reaches a lane in which a failure has occurred, and determines the number of lanes available for communication. The root complex then determines the new current lane width value N′ according to the PCIe protocol's requirement on the number of lanes of a link between two PCIe devices. The root complex continues data transmission with PCIe device 1 by using the new N′ lanes obtained by the negotiation.

In this embodiment, a failure occurs in lane number 5, N is 4, M is 16, and 4 < 8 (N < M/2). The CPU disables lane number 0 through lane number (M/2 − 1) (that is, lane number 7) between the root complex and PCIe device 1 and instructs the root complex to perform lane reversal processing. After receiving the instruction, the root complex renumbers the lanes between the root complex and PCIe device 1 in the opposite direction. Original lane number 15 becomes lane number 0, original lane number 14 becomes lane number 1, and so on. Because lane number 0 through lane number (M/2 − 1) (that is, lane number 7) have been disabled, lane reversal processing ends after original lane number M/2 (that is, lane number 8) has been changed to lane number (M/2 − 1) (that is, lane number 7). After completing lane reversal, the root complex performs lane renegotiation with PCIe device 1; the negotiation starts from the new, reversed lane number 0 and proceeds in ascending order of lane number. Because no failure has occurred in original lane number 8 through original lane number 15 and data transmission can be performed on them, the negotiation succeeds on eight lanes when the root complex renegotiates with PCIe device 1. Further, because the PCIe protocol specifies that the lane width is one of 1, 2, 4, 8, 16 and 32, the new current lane width value N′ obtained when the root complex renegotiates with PCIe device 1 is 8.
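The whole sequence of this embodiment (initial renegotiation, comparison with M/2, disabling the lower half, reversal and second renegotiation) can be traced with a short end-to-end sketch. It is a simplified model under the assumptions already used in the text, namely lanes numbered 0 through M − 1 and widths limited to 1, 2, 4, 8, 16 or 32; the function names `negotiate` and `recover` are hypothetical.

```python
VALID_WIDTHS = (1, 2, 4, 8, 16, 32)


def negotiate(lane_order, failed):
    """Walk physical lanes in logical order and stop at the first failed one."""
    good = 0
    for physical_lane in lane_order:
        if physical_lane in failed:
            break
        good += 1
    return max((w for w in VALID_WIDTHS if w <= good), default=0)


def recover(total_lanes, failed_lanes):
    """Return (N, final width, whether lane reversal was performed)."""
    failed = set(failed_lanes)
    n = negotiate(range(total_lanes), failed)
    if 2 * n >= total_lanes:
        return n, n, False  # N >= M/2: keep the renegotiated lanes
    # N < M/2: lanes 0 .. M/2-1 are disabled; new lane 0 is original lane M-1.
    reversed_order = range(total_lanes - 1, total_lanes // 2 - 1, -1)
    return n, negotiate(reversed_order, failed), True


print(recover(16, [5]))   # (4, 8, True)
print(recover(16, [12]))  # (8, 8, False)
```

The first call reproduces the embodiment: N = 4 after the failure in lane number 5, and N′ = 8 after reversal.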

The root complex continues data transmission by using the new, reversed lane number 0 through the new, reversed lane number 7 (that is, original lane number 8) between the root complex and PCIe device 1. The data transmission procedure is the same as in existing implementations and is not described again here.

In this way, when a failure occurs in a lane of the link between two PCIe devices, using the solution provided in the present invention ensures that a number of lanes equal to 1/2 of the lane negotiation capability value of the PCIe device remains available regardless of the lane number of the failed lane, thereby guaranteeing the transmission speed and performance of the link between the PCIe devices to the greatest extent possible.
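The stated guarantee can be checked exhaustively for a single failed lane under the simplified model used in this description. The sketch below is an illustration rather than a proof about real hardware: it assumes exactly one failed lane, lanes numbered 0 through M − 1, and hypothetical helper names `width_from`, `final_width` and `guarantee_holds`.

```python
VALID_WIDTHS = (1, 2, 4, 8, 16, 32)


def width_from(lane_order, failed_lane):
    """Permitted width obtained by walking lanes in order up to the failed lane."""
    good = 0
    for lane in lane_order:
        if lane == failed_lane:
            break
        good += 1
    return max((w for w in VALID_WIDTHS if w <= good), default=0)


def final_width(total_lanes, failed_lane):
    """Width in use after the CPU's decision, for one failed lane."""
    n = width_from(range(total_lanes), failed_lane)
    if 2 * n >= total_lanes:
        return n
    # Lower half disabled, upper half renumbered in the opposite direction.
    return width_from(range(total_lanes - 1, total_lanes // 2 - 1, -1), failed_lane)


def guarantee_holds(total_lanes):
    """True if every single-lane failure still leaves exactly M/2 lanes."""
    return all(final_width(total_lanes, f) == total_lanes // 2
               for f in range(total_lanes))


results = {m: guarantee_holds(m) for m in (2, 4, 8, 16, 32)}
print(results)  # {2: True, 4: True, 8: True, 16: True, 32: True}
```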

The present invention further provides a system for handling a link failure of a PCIe device; the structure of the system is shown in FIG. 2. The system includes a central processing unit (CPU) 201, a first PCIe device 203 and a second PCIe device 205. The first PCIe device and the second PCIe device are connected by using a plurality of serializer/deserializer (serdes) circuits. The plurality of serdes form a link 204 used to transmit data between the first PCIe device and the second PCIe device; each serdes is one lane, and the lanes are numbered consecutively in a certain order. There may be 1, 2, 4, 8, 16 or 32 serdes between two PCIe devices; this embodiment is described by using 16 serdes as an example. That is, there are 16 serdes, namely 16 lanes, between the first PCIe device 203 and the second PCIe device 205, and these 16 lanes are numbered consecutively and sequentially from 0 to 15 from left to right. The lane numbered 0 is lane number 0 and, likewise, the lane numbered 15 is lane number 15. The serdes use an existing serdes structure and support existing functions, which are not described again here. In this embodiment of the present invention, the implementation principle is the same regardless of how many serdes the link contains.

A message signaled interrupt (MSI) function is configured on the first PCIe device 203. Upon detecting that a failure has occurred in one or more lanes of the link 204 connected to the second PCIe device 205, the first PCIe device 203 reports an MSI message to the CPU 201, where the MSI message includes the device ID of the first PCIe device 203.

The first PCIe device 203 performs lane renegotiation with the second PCIe device 205, according to the renegotiation mechanism of the PCIe protocol, in order to obtain the current lane width value N of the link 204 between the two devices. During lane renegotiation, following the lane negotiation mechanism specified in the PCIe protocol, the first PCIe device 203 renegotiates the link with the second PCIe device 205 lane by lane, starting from the lane with the smallest lane number and proceeding in ascending order of lane number until it reaches the failed lane. Because negotiation cannot succeed on the failed lane, the renegotiation process ends there. The first PCIe device 203 then determines the current lane width value N of the link according to the lane counts that the PCIe protocol permits for a link between two PCIe devices.
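The renegotiation rule described above can be illustrated with a minimal sketch. It is only a model of the behavior stated in the text, not of real PCIe link training; the function name `negotiate_width`, the boolean list representation of lane health, and the return value 0 for a failure on lane number 0 are assumptions made for illustration.

```python
# Link widths the description lists as permitted between two PCIe devices.
PERMITTED_WIDTHS = (1, 2, 4, 8, 16, 32)


def negotiate_width(lane_ok):
    """Return the width N reached by ascending-order lane negotiation.

    lane_ok[i] is True when lane number i can be negotiated successfully.
    """
    good = 0
    for ok in lane_ok:      # start at lane number 0 and move upward
        if not ok:          # negotiation ends at the first failed lane
            break
        good += 1
    # Largest permitted width that fits in the contiguous good lanes.
    # Returning 0 when lane 0 itself has failed is an assumption of this sketch.
    return max((w for w in PERMITTED_WIDTHS if w <= good), default=0)


lanes = [True] * 16
lanes[5] = False                # hypothetical failure on lane number 5
print(negotiate_width(lanes))   # prints 4
```

Lanes 0 to 4 negotiate successfully, giving five usable lanes, and the largest permitted width not exceeding five is 4.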

Because lane numbers must be consecutive and link negotiation starts from the lowest-numbered lane, the number of lanes obtained by renegotiation depends on the lane number of the failed lane and therefore cannot be predicted in advance. When the failed lane has a relatively small lane number, the number of lanes obtained by renegotiation drops sharply; that is, the current lane width value N is greatly reduced, which seriously degrades data transmission performance between the two PCIe devices.
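The dependence of N on the failed lane can be written compactly. The formula below is a sketch under two assumptions that the text does not state in this form: exactly one lane fails, and its lane number f satisfies 1 ≤ f ≤ M - 1, so that lanes 0 to f - 1 negotiate successfully.

```latex
% f good lanes precede the failed lane; N is the largest permitted
% width (a power of two) that does not exceed f.
N(f) = \max\{\, w \in \{1, 2, 4, 8, 16, 32\} : w \le f \,\}
     = 2^{\lfloor \log_2 f \rfloor}, \qquad 1 \le f \le M - 1 .
% Example for M = 16:
%   f = 1        -> N = 1
%   f = 2, 3     -> N = 2
%   f = 4..7     -> N = 4
%   f = 8..15    -> N = 8
```

For M = 16, a failure on any of lanes 1 to 7 leaves at most a quarter of the original width, which is the performance loss the paragraph describes.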

In this embodiment, after receiving the MSI message reported by the first PCIe device 203, the CPU 201 enters an interrupt handling routine. Using the device ID of the first PCIe device 203 carried in the MSI message, the CPU 201 obtains from the first PCIe device 203 its current lane width value N and its lane negotiation capability value M. The lane negotiation capability value M is the maximum lane width that the first PCIe device 203 can obtain by negotiating with the second PCIe device 205, that is, the total number of lanes between the two devices. In this embodiment there are 16 lanes between the first PCIe device 203 and the second PCIe device 205, so the maximum obtainable lane width is 16 and M is 16. The current lane width value N is the lane width that the first PCIe device 203 obtains by renegotiating with the second PCIe device 205.

The CPU 201 compares the obtained current lane width value N of the first PCIe device 203 with the lane negotiation capability value M. In the present invention, the CPU 201 compares N with M/2, in accordance with the rule governing the number of lanes in a link between two PCIe devices.

If N ≥ M/2, the first PCIe device 203 continues data transmission with the second PCIe device 205 over the lanes obtained by renegotiation. In this case N ≥ M/2 indicates that the lane number of the failed lane is greater than (M/2 - 1); because a link between two PCIe devices can contain only 1, 2, 4, 8, 16, or 32 lanes, the current lane width value N obtained when the first PCIe device 203 renegotiates with the second PCIe device 205 is M/2. The CPU 201 performs no processing in this state. The first PCIe device 203 may set a time period; if it receives no instruction from the CPU 201 within that period, it continues data transmission with the second PCIe device 205 over the lanes obtained by renegotiation.
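The comparison performed by the CPU reduces to a single test. The sketch below is illustrative only; the function name `cpu_decision` and the string return values are assumptions, not names from the source.

```python
def cpu_decision(n, m):
    """Return the action the CPU takes for current width n and capability m."""
    if 2 * n >= m:        # N >= M/2, written without division
        return "none"     # CPU does nothing; the device keeps its N lanes
    return "reverse"      # N < M/2: CPU instructs lane reversal


for n in (1, 2, 4, 8):
    print(n, cpu_decision(n, 16))
```

With M = 16, only N = 8 leaves the link as renegotiated; every smaller width triggers the reversal instruction.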

If N < M/2, the CPU 201 instructs the first PCIe device 203 to perform lane reversal processing. When N < M/2, the current lane width obtained when the first PCIe device 203 renegotiates with the second PCIe device 205 is less than half of the total link width, which indicates that the lane number of the failed lane is less than M/2. In lane reversal processing, the first PCIe device 203 renumbers the lanes between itself and the second PCIe device 205 in the opposite direction. For example, in this embodiment the lanes between the first PCIe device 203 and the second PCIe device 205 are initially numbered from left to right from 0 to 15. After the first PCIe device 203 performs lane reversal, original lane number 15 becomes lane number 0, original lane number 14 becomes lane number 1, and so on.

Before instructing the first PCIe device to perform lane reversal, the CPU 201 may additionally disable lane number 0 through lane number (M/2 - 1) between the first PCIe device and the second PCIe device. After these lanes are disabled, the first PCIe device 203 performs lane reversal by changing original lane number (M - 1) to 0, original lane number (M - 2) to 1, and so on, until original lane number M/2 is changed to (M/2 - 1). After lane reversal, the first PCIe device 203 renegotiates with the second PCIe device 205 to obtain a new current lane width value N′. The renegotiation method is the same as the one described above. As required by the PCIe protocol, the first PCIe device 203 starts negotiation from the lowest-numbered lane after reversal and proceeds in ascending order of lane number. That is, it negotiates the reversed lane number 0 first, then lane number 1, lane number 2, and so on, until lane number (M/2 - 1) has been negotiated.

The first PCIe device 203 performs lane reversal when N < M/2. In that case the lane number of the failed lane is less than M/2, that is, the failed lane is one of lane number 0 through lane number (M/2 - 1). Those lanes are disabled, whereas no failure has occurred in lane number M/2 through lane number (M - 1), which can still carry data. Consequently, after the first PCIe device 203 performs lane reversal, no failure occurs in new lane number 0 through new lane number (M/2 - 1), and the new current lane width value N′ obtained by renegotiation with the second PCIe device 205 is M/2. The first PCIe device 203 continues data transmission with the second PCIe device 205 over the new lanes obtained by renegotiation.
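The renumbering that follows disabling of the lower half of the link can be sketched as a mapping from original to new lane numbers. The sketch assumes lanes are numbered 0 to M - 1, as in the 16-lane embodiment (15 becomes 0, 8 becomes 7); the function name `reverse_upper_half` is hypothetical.

```python
def reverse_upper_half(m):
    """Map original lane number -> new lane number.

    Lanes 0..m/2-1 are treated as disabled and receive no new number;
    the remaining lanes m/2..m-1 are numbered in the opposite direction.
    """
    half = m // 2
    return {old: (m - 1) - old for old in range(half, m)}


mapping = reverse_upper_half(16)
print(mapping[15], mapping[14], mapping[8])   # prints 0 1 7
```

The new lane numbers run from 0 to M/2 - 1 without gaps, so ascending-order negotiation can cover all of them.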

When a failure occurs in a lane of a PCIe device, using the solution of the present invention ensures that the lane width obtained by renegotiation is 1/2 of the original lane width regardless of the lane number of the failed lane. This avoids the situation in which the failed lane has a relatively small lane number, the lane width of the PCIe device drops sharply, and the data transmission rate between the PCIe devices suffers as a result.
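The guarantee stated above can be checked exhaustively for a single failed lane. The following sketch assumes exactly one lane fails and that the CPU disables the lower half of the link before reversal; the helper names `width_for` and `final_width` are illustrative and do not come from the source.

```python
PERMITTED_WIDTHS = (1, 2, 4, 8, 16, 32)


def width_for(good_lanes):
    """Largest permitted width not exceeding the count of contiguous good lanes."""
    return max((w for w in PERMITTED_WIDTHS if w <= good_lanes), default=0)


def final_width(m, failed):
    """Width in use after handling a single failed lane on an m-lane link."""
    n = width_for(failed)        # lanes 0..failed-1 negotiate successfully
    if 2 * n >= m:
        return n                 # N >= M/2: keep the renegotiated lanes
    # N < M/2: lower half disabled, upper half reversed, so m/2
    # fault-free lanes are renegotiated.
    return width_for(m // 2)


results = {f: final_width(16, f) for f in range(16)}
print(results)
```

In this model every entry of `results` equals 8, that is M/2, whichever of the 16 lanes fails.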

For the internal structure of the first PCIe device 503 in an embodiment of the present invention, refer to FIG. 5. As shown in FIG. 5, the first PCIe device 503 includes a detection module 5031, an MSI module 5033, a negotiate module 5035, a register 5037, and a lane reversal module 5039.

The detection module 5031 is configured to monitor the communication state of the link 504 between the first PCIe device 503 and the second PCIe device 505. The link 504 between the first PCIe device 503 and the second PCIe device 505 has a plurality of serializer/deserializer (serdes) circuits, and each serdes is one lane. When it detects that a failure has occurred in one or more lanes of the link 504, the detection module 5031 sends a lane failure indication message to the MSI module 5033 and sends information about the failed lane to the register 5037 for storage.

The MSI module 5033 is configured to send an MSI message to the central processing unit (CPU) 501 after receiving the lane failure indication message from the detection module 5031. The MSI message includes the device ID of the first PCIe device 503.

The negotiate module 5035 is configured to negotiate the lane width between the first PCIe device 503 and the second PCIe device 505. The lanes of the link between two PCIe devices are numbered consecutively in a fixed order. The link may have 1, 2, 4, 8, 16, or 32 lanes, and this embodiment uses 16 lanes as a specific example. For example, as shown in FIG. 5, the lanes of the link 504 between the first PCIe device 503 and the second PCIe device 505 are numbered consecutively from left to right starting from 0: lane number 0, lane number 1, lane number 2, ..., lane number 15. According to the negotiation mechanism of the PCIe protocol, link negotiation starts from the lowest-numbered lane and proceeds with the second PCIe device 505 in ascending order of lane number until the failed lane is reached. Because negotiation cannot succeed on the failed lane, the renegotiation process ends there. The first PCIe device 503 then determines the current lane width value N of the link between the first PCIe device 503 and the second PCIe device 505 according to the lane counts that the PCIe protocol permits for a link between two PCIe devices.

The register 5037 is configured to store information such as the current lane width value N and the lane negotiation capability value M of the first PCIe device 503.

The lane reversal module 5039 is configured to perform lane reversal processing on the lanes of the link 504 after receiving an instruction to perform lane reversal processing.

When it detects that a failure has occurred in one or more lanes of the link 504, the detection module 5031 sends a lane failure indication message to the MSI module 5033 and sends information about the failed lane to the register 5037 for storage.

The negotiate module 5035 starts renegotiation and determines the current lane width value N between the first PCIe device 503 and the second PCIe device 505. It begins with the lowest-numbered lane of the link 504, lane number 0; after negotiation succeeds on lane number 0, it negotiates lane number 1, and so on, until it reaches the failed lane. In this embodiment, assume that the failure occurred in lane number 5 of the link 504. When the negotiate module 5035 starts renegotiation, negotiation succeeds on lane number 0 through lane number 4. Five lanes are then available for communication, but because a link between two PCIe devices can have only 1, 2, 4, 8, 16, or 32 lanes, the current lane width value N between the first PCIe device 503 and the second PCIe device 505 is 4.

After receiving the MSI message sent by the MSI module 5033, the CPU 501 uses the device ID included in the MSI message to obtain the current lane width value N and the lane negotiation capability value M of the first PCIe device from the register 5037 of the first PCIe device 503. In this embodiment the link between the first PCIe device and the second PCIe device has 16 lanes, so the lane negotiation capability value M obtained by the CPU 501 is 16. The current lane width value N is 4.
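The source does not say how N and M are laid out in the register 5037. One plausible mapping, offered purely as an assumption, is the standard PCIe capability structure, where the Link Capabilities register carries the Maximum Link Width and the Link Status register carries the Negotiated Link Width, each in bits 9:4. The sketch below decodes that field from made-up register values.

```python
def link_width(reg_value):
    """Extract the 6-bit link width field from bits 9:4 of a register value."""
    return (reg_value >> 4) & 0x3F


# Made-up register contents for illustration only.
link_capabilities = 0x0103   # width field = 16 (assumed to play the role of M)
link_status = 0x0043         # width field = 4  (assumed to play the role of N)

m = link_width(link_capabilities)
n = link_width(link_status)
print(m, n)                  # prints 16 4
```

With these values the CPU would see M = 16 and N = 4, matching the embodiment.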

The CPU 501 compares the obtained lane negotiation capability value M with the current lane width value N. In the present invention, the CPU 501 compares N with M/2, in accordance with the rule governing the number of lanes in a link between two PCIe devices.

When N < M/2, the CPU 501 instructs the lane reversal module 5039 of the first PCIe device 503 to perform lane reversal processing, that is, to renumber the lanes of the link 504 between the first PCIe device 503 and the second PCIe device 505 in the opposite direction. For example, as shown in FIG. 5, the lanes of the link 504 are currently numbered consecutively from left to right from 0 to 15. Performing lane reversal means that lane number 15 of the link 504 becomes lane number 0, lane number 14 becomes lane number 1, and so on. After lane reversal, the lanes of the link 504 are numbered consecutively from right to left from 0 to 15.

After the lane reversal module 5039 of the first PCIe device 503 completes lane reversal, the negotiate module 5035 restarts lane negotiation on the link 504 between the first PCIe device 503 and the second PCIe device 505 and determines a new current lane width value N′. The negotiation method is the same as the one described above and is not described again here. The first PCIe device 503 carries out data transmission with the second PCIe device 505 using the new current lane width obtained by negotiation. Because N < M/2 indicates that the lane number of the failed lane is less than M/2, the new current lane width value obtained by renegotiation is M/2 or more. It can therefore be guaranteed that, after a failure occurs in a lane of the link 504, the lane width obtained by renegotiation is M/2 or more, which safeguards the transmission performance of the link 504.
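The effect of reversing the whole link, without disabling any lanes, can be worked through numerically. This sketch assumes a single failed lane; the function names are illustrative and not taken from the source.

```python
def width_for(good_lanes):
    """Largest permitted width not exceeding the count of contiguous good lanes."""
    widths = (1, 2, 4, 8, 16, 32)
    return max((w for w in widths if w <= good_lanes), default=0)


def widths_before_and_after_reversal(m, failed):
    """Return (N, N') for one failed lane on an m-lane link."""
    n = width_for(failed)            # lanes 0..failed-1 are usable
    new_failed = (m - 1) - failed    # the failed lane's number after reversal
    n_prime = width_for(new_failed)  # lanes 0..new_failed-1 are usable
    return n, n_prime


print(widths_before_and_after_reversal(16, 5))   # prints (4, 8)
```

With lane number 5 failed, reversal moves the fault to lane number 10, so ten contiguous lanes are usable and the new width is 8, which is M/2.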

When N < M/2, the CPU 501 may alternatively first disable lane number 0 through lane number (M/2 - 1) of the link 504 between the first PCIe device 503 and the second PCIe device 505 and then instruct the lane reversal module 5039 of the first PCIe device 503 to perform lane reversal processing. Because N < M/2, that is, the current lane width obtained by negotiation is less than M/2, the lane number of the failed lane is less than M/2. No failure has occurred in lane number M/2 through lane number (M - 1), which can still carry data. Accordingly, the CPU 501 disables lane number 0 through lane number (M/2 - 1) of the link 504, and when performing lane reversal the lane reversal module 5039 of the first PCIe device 503 changes lane number (M - 1) to 0, lane number (M - 2) to 1, and so on, until lane number M/2 is changed to (M/2 - 1).

Because original lane number 0 through original lane number (M/2 - 1) of the link have been disabled, the lane reversal module 5039 of the first PCIe device 503 does not perform lane reversal on those disabled lanes. After lane reversal is complete, the negotiate module 5035 restarts lane negotiation on the link 504 between the first PCIe device 503 and the second PCIe device 505 and determines a new current lane width value N′. In this case new lane number 0 through new lane number (M/2 - 1) of the link 504 are available for communication. That is, after a failure occurs in a lane of the link 504, the lane width obtained by renegotiation equals M/2, which safeguards the transmission performance of the link 504.

When N ≥ M/2, the current lane width of the link between the first PCIe device 503 and the second PCIe device 505 is M/2 or more, and the performance of the link 504 is preserved to the greatest extent possible. The CPU 501 performs no processing. If the first PCIe device 503 receives no instruction from the CPU 501 within a predetermined time limit, it carries out data transmission with the second PCIe device 505 using the current lane width N obtained by negotiation. The transmission performance of the link 504 is thus also guaranteed.
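The time-limit behavior can be sketched with a queue that stands in for the channel over which the CPU delivers its instruction. The queue, the function name `await_cpu_instruction`, and the string values are all assumptions made for illustration; the source does not describe the mechanism.

```python
import queue


def await_cpu_instruction(instructions, timeout_s):
    """Return the CPU's instruction if one arrives in time, else "keep"."""
    try:
        return instructions.get(timeout=timeout_s)
    except queue.Empty:
        return "keep"        # no instruction: keep using the N lanes


inbox = queue.Queue()
print(await_cpu_instruction(inbox, 0.05))   # prints keep
inbox.put("reverse")
print(await_cpu_instruction(inbox, 0.05))   # prints reverse
```

The first call models the N ≥ M/2 case, in which the CPU stays silent and the device carries on with the lanes it already negotiated.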

In this embodiment, the lane negotiation capability value M of the first PCIe device 503 is 16, a failure occurs in lane number 5 of the link 504, and the current lane width value N obtained when the negotiate module 5035 of the first PCIe device renegotiates with the second PCIe device 505 is 4. Here 4 < 16/2, that is, N < M/2, so the CPU 501 disables lane number 0 through lane number 7 of the link 504 and instructs the lane reversal module 5039 of the first PCIe device 503 to perform lane reversal. The lane reversal module 5039 changes lane number 15 of the link 504 to 0, lane number 14 to 1, and so on, until lane number 8 is changed to 7. Because original lane number 0 through original lane number 7 of the link 504 have been disabled, the lane reversal module 5039 does not renumber them. The link 504 between the first PCIe device 503 and the second PCIe device 505 now contains eight lanes. The negotiate module 5035 of the first PCIe device 503 performs lane renegotiation with the second PCIe device 505 and obtains a new current lane width value N′ of 8. The first PCIe device 503 continues data transmission with the second PCIe device 505 over the eight lanes of the link 504.
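The embodiment's numbers (M = 16, failure on lane number 5, N = 4, N′ = 8) can be reproduced with a small model of the device and the CPU handler. This is a sketch only: the class `FirstPcieDevice`, its attributes, the handler `cpu_handle_msi`, and the device ID value are hypothetical, and the detection and MSI modules are omitted.

```python
from dataclasses import dataclass, field

WIDTHS = (1, 2, 4, 8, 16, 32)


@dataclass
class FirstPcieDevice:
    device_id: int
    lane_ok: list                        # lane health, index = original lane number
    order: list = field(init=False)      # logical lane number -> original lane number
    register: dict = field(init=False)   # stands in for register 5037

    def __post_init__(self):
        m = len(self.lane_ok)
        self.order = list(range(m))
        self.register = {"M": m, "N": m}

    def negotiate(self):                 # role of negotiate module 5035
        good = 0
        for original in self.order:      # ascending logical lane number
            if not self.lane_ok[original]:
                break
            good += 1
        self.register["N"] = max((w for w in WIDTHS if w <= good), default=0)
        return self.register["N"]

    def reverse(self, disabled):         # role of lane reversal module 5039
        self.order = [lane for lane in reversed(range(len(self.lane_ok)))
                      if lane not in disabled]


def cpu_handle_msi(device):
    """Read N and M, then disable the lower half and reverse if N < M/2."""
    n, m = device.register["N"], device.register["M"]
    if 2 * n < m:
        device.reverse(disabled=set(range(m // 2)))
        return device.negotiate()        # new current lane width N'
    return n


dev = FirstPcieDevice(device_id=0x503, lane_ok=[i != 5 for i in range(16)])
n = dev.negotiate()
n_prime = cpu_handle_msi(dev)
print(n, n_prime, dev.order)   # prints 4 8 [15, 14, 13, 12, 11, 10, 9, 8]
```

The final `order` list shows original lane number 15 serving as new lane number 0 and original lane number 8 serving as new lane number 7, as in the text.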

In this way, with the technical solution provided by the present invention, when a failure occurs in a lane of the link between two PCIe devices, the two PCIe devices can continue data transmission over 1/2 of the lanes of the original link regardless of the lane number of the failed lane. This removes the dependence of the link width on the lane number of the failed lane that exists in the prior art, so that an optimal link width can be achieved after renegotiation.

It should further be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "include" and "comprise", and any variants of them, are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or apparatus. Without further limitation, an element preceded by "includes a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.

A person skilled in the art can implement or use the present invention based on the above description of the disclosed embodiments. The general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Accordingly, the present invention is not limited to the embodiments described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method of handling a Peripheral Component Interconnect Express (PCIe) link failure, comprising:
detecting, by a PCIe device, that a failure has occurred in a lane of a link between the PCIe device and a downstream PCIe device, and sending a message signaled interrupt (MSI) message to a central processing unit (CPU), the MSI message carrying a device ID of the PCIe device;
negotiating, by the PCIe device, with the downstream PCIe device to obtain a current lane width value N;
obtaining, by the CPU, a lane negotiation capability value M of the PCIe device and the current lane width value N from the PCIe device according to the device ID in the received MSI message;
comparing, by the CPU, N with M/2;
if N < M/2, instructing, by the CPU, the PCIe device to perform lane reversal processing;
performing, by the PCIe device, the lane reversal processing on the link between the PCIe device and the downstream PCIe device;
negotiating, by the PCIe device, with the downstream PCIe device to obtain a new current lane width value N′, and continuing data transmission with the downstream PCIe device by using N′ lanes (N′ ≥ 1); and
disabling, by the CPU, lane number 0 through lane number (M/2 - 1) of the link between the PCIe device and the downstream PCIe device.

2. The method according to claim 1, further comprising:
if N ≥ M/2, continuing, by the PCIe device, data transmission with the downstream PCIe device by using the N lanes obtained by the negotiation.

3. A method of handling a Peripheral Component Interconnect Express (PCIe) link failure, comprising:
receiving a message signaled interrupt (MSI) message reported by a PCIe device, the MSI message carrying a device ID of the PCIe device;
obtaining a lane negotiation capability value M and a current lane width value N of the PCIe device from the PCIe device according to the device ID, the current lane width value N having been obtained by the PCIe device through negotiation with a downstream PCIe device;
comparing N with M/2;
if N < M/2, instructing the PCIe device to perform lane reversal processing; and
disabling lane number 0 through lane number (M/2 - 1) of the link between the PCIe device and the downstream PCIe device.

4. The method according to claim 3, wherein the lane negotiation capability value M of the PCIe device is equal to the total number of lanes of the link between the PCIe device and the downstream PCIe device.

5. A method of handling a Peripheral Component Interconnect Express (PCIe) link failure, comprising:
detecting, by a PCIe device, that a failure has occurred in a lane of a link between the PCIe device and a downstream PCIe device, and sending a message signaled interrupt (MSI) message to a central processing unit (CPU), the MSI message carrying a device ID of the PCIe device;
negotiating with the downstream PCIe device to obtain a current lane width value N;
receiving an instruction, sent by the CPU, to perform lane reversal processing, and performing the lane reversal processing on the link between the PCIe device and the downstream PCIe device;
negotiating with the downstream PCIe device to obtain a new current lane width value N′, and continuing data transmission with the downstream PCIe device by using N′ lanes; and
if the PCIe device does not receive, within a predetermined time, the instruction sent by the CPU to perform the lane reversal processing, continuing, by the PCIe device, data transmission with the downstream PCIe device by using the N lanes.

6. A system for handling a Peripheral Component Interconnect Express (PCIe) link failure, wherein:
the system includes a central processing unit (CPU), a PCIe device, and a downstream PCIe device, the CPU being connected to the PCIe device and the PCIe device being connected to the downstream PCIe device by a link;
the PCIe device is configured to detect whether a failure has occurred in a lane of the link between the PCIe device and the downstream PCIe device and, when a failure occurs, to report a message signaled interrupt (MSI) message to the CPU, the MSI message carrying a device ID of the PCIe device, and the PCIe device is further configured to negotiate with the downstream PCIe device to obtain a current lane width value N;
the CPU is configured to obtain a lane negotiation capability value M and the current lane width value N of the PCIe device from the PCIe device according to the device ID in the MSI message, to compare N with M/2, and, when N < M/2, to instruct the PCIe device to perform lane reversal processing and to disable lane number 0 through lane number (M/2 - 1) of the link between the PCIe device and the downstream PCIe device; and
the PCIe device is further configured, after receiving the instruction sent by the CPU to perform the lane reversal processing, to perform the lane reversal processing on the link between the PCIe device and the downstream PCIe device and to negotiate with the downstream PCIe device to obtain a new current lane width value N′.