JP2004221807A

JP2004221807A - Distribution routing table management system and router

Info

Publication number: JP2004221807A
Application number: JP2003005244A
Authority: JP
Inventors: Hiroaki Nishi; 宏章西
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2003-01-14
Filing date: 2003-01-14
Publication date: 2004-08-05

Abstract

PROBLEM TO BE SOLVED: To address the problems of conventional routing table management systems, which are difficult to implement and costly because of the large memory capacity they require, and which also become a throughput bottleneck.

SOLUTION: A NUMA-type, CC-NUMA-type, or COMA-type architecture is proposed for the routing table memory. These architectures have no shared routing table; route information is managed in distributed routing tables placed at each interface, which exploits position locality. The COMA type can exploit position locality even further because the home location of a route entry migrates freely as the entry is referenced. The CC-NUMA and COMA types include a cache to exploit temporal locality, and they handle address locality by applying a mixing operation in the cache's hash function. In addition, they exploit spatial locality by using a different hash function for each zone when registering entries in the cache memory, which raises the cache hit rate.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、通信に於ける大容量パケットの低遅延交換システムに関わり、特に大容量パケットの低遅延交換システムにおけるパケットの経路決定作業に必要なルーティングテーブルと該ルーティングテーブルの管理運営方式に関わる。パケットとしては、特にＩＰパケット、イーサネット（登録商標）・フレームを受入れる。
【０００２】
【従来の技術】
【非特許文献１】
”Ｍｕｌｔｉ−ｚｏｎｅＣａｃｈｅｓｆｏｒＡｃｃｅｌｅｒａｔｉｎｇＩＰＲｏｕｔｉｎｇＴａｂｌｅＬｏｏｋｕｐｓ，Ｉ．Ｌ．ＣｈｖｅｔｓａｎｄＭ．Ｈ．ＭａｃＧｒｅｇｏｒ，ｐｐ．１２１−１２６，ｉｎＰｒｏｃｅｅｄｉｎｇｓｏｆＨＰＳＲ２００２，Ｍａｙ．２００２”
通信分野においては、広帯域化、高品質化、低遅延化の要求を満たす様々なルータやスイッチの構成技術が提案されている。特に、ルーティングテーブルは今後のネットワークの高機能化、広帯域化、高品質化、低遅延化に係わる部位であり、ルーティングテーブルの動作性能がルータやネットワークの性能を左右する。
【０００３】
図１２は従来の基本的なルータの構成図である。この図を用いて基本的なパケット処理の手順を説明する。ルータの複数あるインタフェース１０１に投入されたパケットは、ルータのパケット処理部１０２において宛先情報が調べられる。この宛先情報を元にルーティングテーブル１２０１を参照し、出力先インタフェースおよびＱｏＳに必要な情報を得る。これらの情報をスイッチングファブリクス１０５に伝えることで、目的地である出力先インタフェースに該パケットを転送する。スイッチングファブリクス１０５としては、クロスバがよく用いられる。このクロスバに付随するアービタが要求を受け付け、ＱｏＳの情報や優先順位等を加味して調停を行う。この調停結果によりクロスバを設定し、入力インタフェースと出力インタフェースとの対応をとる。
【０００４】
ルーティングテーブルはルータで一つの共有ルーティングテーブル１２０２（プライマリルーティングテーブル、デフォルトルーティングテーブル等の名称で呼ばれる場合がある）のみを備える場合と、さらにそれに付随して各インタフェースに分散した分散ルーティングテーブル１２０１（セカンダリルーティングテーブル、シンプルルーティングテーブル等の名称で呼ばれる場合がある）を備える場合がある。
【０００５】
分散ルーティングテーブル１２０１の内容は共有ルーティングテーブル１２０２と同一内容であり、共有ルーティングテーブルへのアクセスを分散させるものである。特にキャッシュの場合はルーティングテーブルキャッシュと呼ばれる。ルーティングテーブルのキャッシュに関する従来技術としては ”Ｍｕｌｔｉ−ｚｏｎｅＣａｃｈｅｓｆｏｒＡｃｃｅｌｅｒａｔｉｎｇＩＰＲｏｕｔｉｎｇＴａｂｌｅＬｏｏｋｕｐｓ，Ｉ．Ｌ．ＣｈｖｅｔｓａｎｄＭ．Ｈ．ＭａｃＧｒｅｇｏｒ，ｐｐ．１２１−１２６，ｉｎＰｒｏｃｅｅｄｉｎｇｓｏｆＨＰＳＲ２００２，Ｍａｙ．２００２” を参照されたい。この論文において始めてＩＰに特化したキャッシュについての考察が行われた。この論文では、ＩＰトラフィックに存在する時間ローカリティと空間ローカリティを生かしたＭｕｌｔｉ−ｚｏｎｅＣａｃｈｅが提案されている。時間ローカリティは、ルータにあるＩＰアドレスから一度アクセスがあると、同じＩＰアドレスのアクセスがその後短時間に何度も発生する可能性が高いことを表す。空間ローカリティは、ルータがあるＩＰアドレスのパケットをある宛先（例えばインタフェース番号）に出力すると、その近辺のＩＰアドレス（下位数ｂｉｔのみ異なる様なＩＰアドレス）を持つパケットも同じ宛先に出力する可能性が高いことを表す。
【０００６】
図１２の技術では、共有ルーティングテーブルや他のルーティングテーブルは同じ内容を持っており、共有ルーティングテーブルの内容が適宜他のルーティングテーブルにコピーされる。また、図１２の従来例では他のインタフェースのルーティングテーブル相互では通信は行われない。
【０００７】
ルータ全体に一つのみ備わる共有ルーティングテーブル（図１２の１２０２）で経路情報を集中管理すると、ルータの処理速度の増大に伴うテーブル参照要求の集中に対応することが出来ないため、インタフェース毎にキャッシュを設けて（ルーティングテーブルキャッシュ）対応する構成が提案されている。
ルーティングテーブルキャッシュと同等の分散したルーティングテーブルを備えるものとして、特開２００２−１６４８９９、特開２０００−８３０５５、特開平１１−２８９４３５、特開平１１−１２２２８９などが提案されている。これらは、全体で一つの共有ルーティングテーブルとインタフェース毎に設けられたルーティングテーブルキャッシュの両方を備える。共有メモリへは共通のバスもしくはスイッチを介した結合となっており、この様なメモリ管理アーキテクチャをＵＭＡ（ＵｎｉｆｏｒｍＭｅｍｏｒｙＡｃｃｅｓｓｍｏｄｅｌ）と呼ぶ。
【０００８】
【発明が解決しようとする課題】
本発明の目的は、ルータにおけるルーティングテーブルおよびその参照機構を改善し、ルータの大容量化、低遅延化、低コスト化を図ることである。これにはルーティングテーブル用に用いられるメモリの単純化と小型化が必要である。メモリの小型化の結果、チップ点数や基板面積を削減することで低コスト化を図ることができ、動作速度が向上するためスループットが改善する。
【０００９】
従来のＵＭＡ型ルーティングテーブル管理構造において、ルーティングテーブル（図１２の１２０１）と共有ルーティングテーブル（１２０２）は実装されている場所は異なるが内容は同一である。この構造は、共有ルーティングテーブルに負荷が集中しないように、インタフェース毎に同一内容を持たせて共有ルーティングテーブルのアクセス集中を分散させるために考案された。
【００１０】
本願発明では従来手法における次のような課題に着目した。
１．共有ルーティングテーブルは各インタフェースにあるルーティングテーブルの和集合となる。従って、共有ルーティングテーブルのサイズは大きなものとなり、内容を参照する際の検索コスト、実装コストが必要となり、実効スループットの向上や低価格化に貢献できない。この共有ルーティングテーブルと同一の内容を各インタフェースで備えると、さらにコストがかかるだけでなく、ネットワークの発展に伴うルーティングテーブルの巨大化への対応が困難となる。
【００１１】
共有ルーティングテーブルとルーティングテーブルキャッシュを両方備える場合、各インタフェースにおけるルーティング情報の差異が大きいと、共有ルーティングテーブルで覚えるべき情報量が膨大となる。一般に、ルーティングテーブルの情報量が増えるとアクセスコストも増大するため、大規模かつ高速なルータでは共有ルーティングテーブルのアクセスコストが問題となり、実行スループットの向上が困難となる。
２．ルーティングテーブルの参照に強いインタフェース依存ローカリティ（位置ローカリティ）がある場合は共有ルーティングテーブルへの参照が少なく、巨大な共有ルーティングテーブルを備えるために必要なハードウエアコストに見合う性能向上を得ることが難しくなる。
３．ルーティングテーブルのキャッシュには時間ローカリティおよび空間ローカリティだけではなく、前述の位置ローカリティ、さらにアドレスローカリティについても考慮が必要である。アドレスローカリティは、有効もしくは頻繁に利用されるＩＰアドレスおよびＭＡＣアドレスが、アドレス空間全体に均一に分布しているのではなく偏在していることを表す。従ってルーティングテーブルのキャッシュにおけるハッシュ関数としては、与えられたＩＰアドレスの部分ビット情報を集めた単純なものではなく、アドレスローカリティに対応したキャッシュのハッシュ関数が必要である。Ｍｕｌｔｉ−Ｚｏｎｅキャッシュは空間ローカリティを有効に活かすが、位置ローカリティ、アドレスローカリティに対する考慮がなされていない。
４．ハッシュの実現に実装コストや処理コストが必要である場合、Ｚｏｎｅ毎に異なるハッシュを持たせることが困難となる。特開２００１−３２０４２０はハッシュにより検索した結果はトラフィックの調整に用い、ＩＰアドレス全体のＣＲＣを求めるというハッシュでＺｏｎｅに対応していない。
５．特開２００１−３２６６７９はルーティングテーブル検索手法に対する発明で、ＵＮＩＸ（登録商標）等のファイルシステムに標準で用いられる複数回間接参照方式のｉノード方式を、ルーティングテーブル参照に適用した発明である。記憶階層下位であるファイルシステムに用いられる様に、対象が大容量（ハードディスクではギガやテラといった容量を対象としている）である場合に有効であるが反面、複数回の参照が必要であるため速度が遅いという問題がある。
【００１２】
【課題を解決するための手段】
上述した課題を解決するために、発明では次のような手段を用いる。
１．共有ルーティングテーブルを備えず、各ポートにルーティングテーブルを分散配置した構造とする。すなわち、図１２の例とは異なり、それぞれのルーティングテーブルが異なる情報の組み合わせを持つこととする。ルーティングテーブルが独自の内容を持つことができるということはローカリティを利用して各テーブルが自分の思うように最適化できるということになる。この構造をＮＵＭＡ（Ｎｏｎ−ＵｎｉｆｏｒｍＭｅｍｏｒｙＡｃｃｅｓｓｍｏｄｅｌ）と呼ぶ。さらに、ＮＵＭＡにキャッシュを備えたＣＣ−ＮＵＭＡ（ＣａｃｈｅＣｏｈｅｒｅｎｔ−ＮＵＭＡ）の構造とする。もしくは、各ポートにルーティングテーブルキャッシュを分散配置した構造とする。この構造をＣＯＭＡ（ＣａｃｈｅＯｎｌｙＭｅｍｏｒｙＡｒｃｈｉｔｅｃｔｕｒｅ）と呼ぶ。具体的なルータ構成としては、複数のインタフェースを有するルータであって、複数の各インタフェースは、入力されたパケットを処理するパケット処理部と、パケットに依存する情報とパケットの宛先の組を情報として格納し、パケットに依存する情報をキーとして参照され、パケットの宛先を検索するルーティングテーブルと、を有し、第１のインタフェースのルーティングテーブルにおいて検索がヒットしなかった場合、第２のインタフェースのルーティングテーブルを参照して、宛先を検索する。この構成により、ネットワークの発展に伴うルーティングテーブルの巨大化の問題が回避できる。パケットに依存する情報とは、たとえば、ＩＰアドレスや、ＭＡＣアドレスである。
【００１３】
各ルーティングテーブルは、ルーティングテーブルの持つ情報の一部の複製を格納するルーティングテーブルキャッシュを有し、あるルーティングテーブルキャッシュにおいて検索がヒットしなかった場合、そのルーティングテーブルキャッシュに対応するルーティングテーブルを参照して、宛先を検索するＣＣ−ＮＵＭＡとしてもよい。
【００１４】
また、各ルーティングテーブルは、ルーティングテーブルキャッシュとして構成されており、パケットが到着したインタフェースにあるルーティングテーブルキャッシュにおいて検索がヒットしなかった場合、それ以外のインタフェースにあるルーティングテーブルキャッシュを参照して宛先を検索し、ヒットした場合には当該ヒットしたデータを第１のルーティングテーブルキャッシュにコピーするＣＯＭＡとしてもよい。
【００１５】
第２のルーティングテーブルが参照結果を返した場合は、参照結果を上記第１のルーティングテーブルに格納することにして、キャッシュの効果を得ることもできる。
【００１６】
また、インタフェース間で入力されたパケットを転送する第１のパスと、インタフェース間でルーティングテーブルの格納情報を転送する第２のパスを独立に有することにすれば、双方の転送速度を向上することができる。
【００１７】
さらに、ルーティングテーブルキャッシュのハッシュ関数に撹拌手段を付加し、メモリ利用効率を向上させることができる。
【００１８】
上記ルーティングテーブルの内容変更時に変更されたルーティングテーブルと同じエントリをもつ他のルーティングテーブルのエントリを無効化するか、もしくは更新メッセージにより他のインタフェースにあるルーティングテーブルに変更を通知することにしてもよい。
【００１９】
本発明の他の観点では、複数のパケット処理部と、各パケット処理部に対応して存在する複数のルーティングテーブルを有し、第１のルーティングテーブルはそれに対応する第１のパケット処理部で処理されたパケットの宛先を検索するために先に用いられ、第１のルーティングテーブルで検索がヒットしなかった場合には、第２のルーティングテーブルで検索が行われ、第２のルーティングテーブルはそれに対応する第２のパケット処理部で処理されたパケットの宛先を検索するために先に用いられ、第２のルーティングテーブルで検索がヒットしなかった場合には、第１のルーティングテーブルで検索が行われるルータを提供する。
【００２０】
また他の観点の方法では、それぞれがパケットを入力される複数のインタフェースを備えるルータのルーティング方法であって、各インタフェースは、入力されたパケットを処理するパケット処理部、入力されたパケットのアドレスに基づいて宛先を検索するためのルーティングテーブル、該ルーティングテーブルが格納するテーブルの少なくとも一部を格納するルーティングテーブルキャッシュを有することとし、あるインタフェースに入力されたパケットのアドレスに基づいて、そのインタフェースの上記ルーティングテーブルキャッシュで宛先を検索する第１のステップ、第１のステップで宛先が検索できなかったとき、当該インタフェースのルーティングテーブルで宛先を検索する第２のステップ、第２のステップで宛先が検索できなかったとき、他のインタフェースのルーティングテーブルで宛先を検索する第３のステップ、を有する。
２．ルーティングテーブルの構造をＮＵＭＡとし、各インタフェースが独自に管理する分散ルーティングテーブルを備えると、位置ローカリティを活かすことができる。結果、共有ルーティングテーブルで必要となるメモリ量よりも少ないメモリ量で済む。また、ＣＣ−ＮＵＭＡとすれば、インタフェースの自他を問わず各インタフェースが管理する内容を参照した際にもその内容をキャッシュするため、同様に位置ローカリティを有効に活かすことができ、参照速度の向上が期待できる。ルーティングテーブルの管理構造をＣＯＭＡとすると、各ルーティング情報は決まった場所に留まるのではなく、よりアクセスが頻繁に発生するところに移動するため、理想的なルーティング情報の配置が行われる。従って、管理が複雑になるが、位置ローカリティを有効に活かすことができる。
３．アドレスローカリティを有効に活かすためには、通常用いられるあるビットを単純に抽出するハッシュ関数ではなく、ビット抽出ではなく攪拌を行うハッシュ関数が必要である。例えば、ＩＰアドレスで１０．１１３．５３．１５２というアドレスについて、２進数で表記すると００００１０１０．０１１１０００１．００１１０１０１．１００１１０００となる。これを全て用いてメモリを引くと３２ビットのアドレス空間が必要となるため実装が困難である。そこで、あるビットとして４の倍数である場所にあるビットを抽出するハッシュ関数を用いると、ハッシュ値は０１０１００１１となりこの値を元にメモリを引くというハッシュが考えられる。しかしながら、アドレスのローカリティを考えるとこの様にして得られたハッシュ値は同じような値が得られる確率が高くメモリを効率利用できない。そこで攪拌手段が必要となる。複雑な攪拌はハードウエアおよび処理コストが必要となるため、単純なＣＲＣ（ＣｙｃｌｉｃＲｅｄｕｎｄａｎｃｙＣｈｅｃｋ）の除余項を用いたハッシュ関数を用いる。また、同じキャッシュ構造を複数備えることなく、単一のキャッシュで空間ローカリティを有効に活かすために、Ｚｏｎｅ毎に異なるハッシュ構造を備えＤｕａｌ−Ｐｏｒｔメモリを利用するなどで対応する。
４．ＣＲＣは生成多項式により良い攪拌結果が得られ、変換元および変換先のｂｉｔ数を自由に取ることができ（変換後のｂｉｔ数の方が変換前のｂｉｔ数より少ない必要がある）、実装コストと処理コストが共に小さい。また、ＣＲＣは並列化が容易で高速に行うことができる。また、ＣＲＣはＩＰアドレスの攪拌ハッシュ関数として用い、ルーティングテーブルキャッシュに利用する。
５．ルーティングテーブルキャッシュを検索する際の参照は間接参照ではなく、通常のキャッシュ同様ハッシュの一回参照により求めることで高速化を図る。キャッシュの参照方式はＳｅｔＡｓｓｏｃｉａｔｉｖｅとする。
本発明の一形態によれば、通信を行うルータは帯域および実時間性を保障する前記アービタとそれに付随もしくは融合された交換器を備える。
【００２１】
【発明の実施の形態】
以下、本発明の実施の形態を、図面を参照しながら説明する。尚、以下の実施の形態ではルータを例に用いるが、本発明は特にルータに限らず同等のローカリティが存在する場合に適用することで大容量化、低遅延化、低コスト化の効果が期待できる。
【００２２】
【実施例１】
本発明によるＣＣ−ＮＵＭＡ型ルーティングテーブル構造および動作の例を述べる。
【００２３】
図１にＣＣ−ＮＵＭＡ型ルーティングテーブルの構成例を示す。ＣＣ−ＮＵＭＡ型では、各インタフェース１０１にパケットに関する様々な処理を行うパケット処理部１０２、およびルーティングテーブルキャッシュ１０３、分散ルーティングテーブル１０４を備える。また、各インタフェースの外にパケットの交換を行うスイッチングファブリクス１０５を備える。ルーティングテーブルキャッシュ１０３を備えない場合は、ＮＵＭＡ型ルーティングテーブル構造と呼ぶ。
【００２４】
あるインタフェースが他のインタフェースにある分散ルーティングテーブル１０４（リモート分散ルーティングテーブル）を参照する場合、パケットを授受するために利用する信号線を共用する方法と、独自の信号線を別途備える方法が存在する。共用する場合はルーティングテーブル情報を交換する別のファブリクス１０６が必要とならず、新たなハードウエアコストや基板上の配線コストが削減できる。しかしながらパケットの通信とルーティング情報の通信の衝突によるバンド幅の低下が発生する可能性がある。共用しない場合は、図１に示すように別途ルーティングテーブル情報を交換するファブリクス１０６を備える。ファブリクスの例としてはクロスバやリング（双方向リング、単方向リング）などがある。
【００２５】
各インタフェースは、独自に分散ルーティングテーブルとルーティングテーブルキャッシュを備えており、キャッシュを備えることで時間ローカリティに対応する。キャッシュにより短時間に起こる複数回の同一アドレスへのアクセスはほぼ全てキャッシュにヒットし、アドレス参照の高速化が予想できるためである。また、同時に位置ローカリティに対応する。インタフェース毎に特徴あるアクセスパターンが存在する場合も、それぞれのインタフェースが独自に備える分散ルーティングテーブルやルーティングテーブルキャッシュが、該当するインタフェースのアクセスパターンに従ってエントリを最適化できると予想できるためである。
【００２６】
ルーティングテーブルキャッシュの動作は、扱うアドレスがレイヤ３のＩＰアドレスか、レイヤ２のＭＡＣアドレスかにより若干異なる。
１．レイヤ３スイッチング（レイヤ３フォワーディング）を行う場合
レイヤ３スイッチングではＩＰアドレスを用いてルーティングを行う。ＩＰアドレスは、時間ローカリティ、位置ローカリティ、空間ローカリティ、アドレスローカリティの全てが有効である。時間ローカリティ、位置ローカリティについては分散ルーティングテーブルとルーティングテーブルキャッシュにより対応する。空間ローカリティを活かすために、Ｚｏｎｅ毎に異なるＣＲＣ攪拌ハッシュを用いる。Ｚｏｎｅ毎とは、あるｐｒｅｆｉｘ長で区切ったときその区切りよりも長いか短いかで区別して、それぞれを別のＺｏｎｅとし、異なるＣＲＣを用いた攪拌を行うというものである。ＣＲＣは実装コストが小さく高速に動作する。ＣＲＣで攪拌を行うとアドレスがランダムに分布するようになり、ＩＰアドレス空間の粗密が攪拌の結果まばらになる。従ってキャッシュの特定アドレスに登録が集中することでキャッシュ能力が低下することを防ぎ、キャッシュのアドレスに均一にアクセスするようになるので、キャッシュのメモリを有効に利用できる。
【００２７】
図２に、ルーティングテーブルキャッシュ１０３および分散ルーティングテーブル１０４のアクセスにおけるフローチャートを示す。ここではＩＰｖ４を例に示すが、ＩＰｖ６においても同様である。また、ＩＰｖ４アドレスのマスク長について、２２を境界にＺｏｎｅを区別し、２２よりも少ない場合はハッシュＡを用いてＺｏｎｅ−Ａキャッシュに，多い場合はハッシュＢを用いてＺｏｎｅ−Ｂキャッシュに登録される場合について説明する。これらの処理は、制御部１０７によりソフトウエアまたはハードウエアで制御される。
【００２８】
ＩＰアドレス１９２．１６８．３４．５６に対するアクセスがあるとすると、まずＳ２０１でＺｏｎｅ別のハッシュ値を求める。キャッシュがＤｕａｌ−Ｐｏｒｔ構成となっている場合は、２つのＺｏｎｅ別ハッシュ値を同時に検索できる。すなわち、Ｐｒｅｆｉｘの区切りにより２つのＺｏｎｅに分ける事が出来るので、それぞれに別のハッシュ生成と検索構造を持たせる。Ｓ２０２においてルーティングテーブルキャッシュを参照する。ヒットした場合はＳ２０６において、パケットのフォワーディングを行う。ミスヒットした場合は、Ｓ２０３において同じインタフェースに存在する分散ルーティングテーブル（ローカル分散ルーティングテーブル）を参照する。ミスヒットした場合はＳ２０４において他のインタフェースの分散ルーティングテーブル（リモート分散ルーティングテーブル）を参照する。それでもミスヒットした場合は、Ｓ２０５において宛先不明時の処理を行う。具体的にはデフォルトルートにパケットをフォワーディングするか破棄する等の処理が行われる。
【００２９】
ローカル分散ルーティングテーブル、もしくはリモート分散ルーティングテーブルにヒットした場合は、Ｓ２０７においてパケットのフォワーディングが行われる。同時に、キャッシュへの登録も行われる。登録の際、宛先だけでなくマスク値も参照する。
【００３０】
インターネットではある同じＩＰアドレスのプレフィックスを持ったものを集まりとして扱う。この同じｐｒｅｆｉｘがどこまでかを指定するのがマスク値である。例えば、ＷＩＮＤＯＷＳＸＰ（登録商標）やＮＴ系だと、コマンドラインからｉｐｃｏｎｆｉｇ命令を実行するとＩＰアドレスと一緒にサブネットマスクというマスク値が出てくる。また、以下でＺＯＮＥとは、このマスク値がどれだけの長さであるか、すなわちｐｒｅｆｉｘをどこまでと見るのかにより分類した概念である。
【００３１】
ヒットしたエントリが１９２．１６８．０．０／１６であった場合は、マスク長が１６で２２よりも少ないことからハッシュＡを用いてＺｏｎｅ−Ａキャッシュに１９２．１６８．３４．５６のアドレスと宛先情報が登録される。ヒットしたエントリが１９２．１６８．３４．０／２４であった場合は、マスク長が２４で２２よりも多いことからハッシュＢを用いてＺｏｎｅ−Ｂキャッシュに登録される。
２．レイヤ２スイッチングを行う場合
レイヤ２スイッチングではＭＡＣアドレスを用いてルーティングを行う。ＭＡＣアドレスは、時間ローカリティ、空間ローカリティ、アドレスローカリティが有効である。この場合の動作も図２のフローチャートを用いて説明が可能である。
【００３２】
ＭＡＣアドレスにはマスク値が存在しないためＺｏｎｅを区別しない。従って、単純にＣＲＣにより攪拌を行う。ＭＡＣアドレス００：１１：ＡＡ：２２：ＢＢ：３３に対するアクセスがある場合、まずＳ２０１においてハッシュ値を求める。Ｓ２０２においてルーティングテーブルキャッシュを参照する。ヒットした場合はＳ２０６において、パケットのフォワーディングを行う。ミスヒットすると、Ｓ２０３において同じインタフェースに存在する分散ルーティングテーブル（ローカル分散ルーティングテーブル）を参照する。ミスヒットした場合はＳ２０４において他のインタフェースの分散ルーティングテーブル（リモート分散ルーティングテーブル）を参照する。それでもミスヒットした場合は、Ｓ２０５において宛先不明時の処理を行う。具体的にはデフォルトルートにパケットをフォワーディングするか、破棄するか等の処理が行われる。
【００３３】
ローカル分散ルーティングテーブル、もしくはリモート分散ルーティングテーブルにヒットした場合は、Ｓ２０７においてパケットのフォワーディングと同時にハッシュ値に従い、ルーティングテーブルキャッシュへの登録を行う。
【００３４】
【実施例２】
本発明によるＣＣ−ＮＵＭＡ型ルーティングテーブル上で経路情報処理を行う際の手順について述べる。ＢＧＰ（ＢｏｒｄｅｒＧａｔｅｗａｙＰｒｏｔｏｃｏｌ）やＯＳＰＦ（ＯｐｅｎＳｈｏｒｔｅｓｔＰａｔｈＦｉｒｓｔ）等の経路情報処理を行う際、相手に自分が持っている経路情報を伝える必要があるためにルーティングテーブルを読み出す必要がある。共有ルーティングテーブル方式では該当する共有ルーティングテーブルの内容を読み出すのみでよいが、分散ルーティングテーブル方式では情報が分散しているためこれをまとめる手順が別途必要となる。そこで、次の方針を採る。経路制御処理自体は各分散ルーティングテーブルにアクセス可能な専用ハードウエアやプロセッサで処理する。
１．あるインタフェースが経路制御処理で受け取った情報は、ローカル分散ルーティングテーブルのみで管理する。従ってリモート分散ルーティングテーブルにはなんら通知を行わない。
２．あるインタフェースが経路制御処理に伴う情報を送る場合でリモート分散ルーティングテーブルの内容を必要とする際は、該当するリモート分散ルーティングテーブルの内容を取得し通知する。
【００３５】
この手順に従えば、全分散ルーティングテーブルの内容を包括した情報を入手できる。ＣＯＭＡ型ルーティングテーブルにおいても同様で、基本的にはローカルに存在するホームのみで経路情報を扱い、リモート分散ルーティングテーブルのホームを参照する必要があるときのみ、この内容を取得し通知する。
【００３６】
これらの授受の手順に合わせてキャッシュインジェクション（キャッシュの投機的登録）も可能である。キャッシュに登録される情報は完全一致であるため通常経路制御処理で交換されるマスクが付加されたアドレスを直接登録することができない。そこで、マスクされている部位に適当な値を埋め込んだもの、もしくはＩＰｖ４やＩＰｖ６においてマスクなしで指定されているアドレスをキャッシュする。
【００３７】
【実施例３】
実施例２による経路制御処理により分散ルーティングテーブルに新たなデータが加わるなどして既存データのアップデートが行われた場合には、その情報をシェアしている他のインタフェースに存在するルーティングテーブルキャッシュの内容を更新する必要がある。この手順について、各共有ルーティングテーブルのエントリにビットマップ情報を加えて管理する場合を述べる。ビットマップ情報を加えて管理する場合は管理情報が増大するが、きめ細かい管理を行うため更新時の通信量が削減できる。
【００３８】
図３は本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、ビットマップで共有状況を管理する場合の共有ルーティングテーブルおよび分散ルーティングテーブルのエントリ構造を示した構造図である。
【００３９】
図３に該当する分散ルーティングテーブルのエントリの構造例を示す。エントリにはＩＰアドレス、宛先インタフェース番号、ＱｏＳ情報、Ｓｈａｒｅｄビットマップ情報、その他パトリシア木構造等で管理するため、その管理情報やマスク値の情報といった付加情報が含まれている。ＩＰアドレスは宛先および送り先を融合した形で管理している。ＱｏＳ情報は例えば占有可能な帯域幅や、ＱｏＳのクラス（どの様な帯域保証を行うか）の情報を記載する。Ｓｈａｒｅｄビットマップ情報は、当該エントリの情報が他のインタフェースと共有されているかどうかを示しており、例えば８個のインタフェースが有り、２番目と７番目のインタフェースが同じエントリを持つ場合はＳｈａｒｅｄビットマップは０１００００１０となる。
【００４０】
図４にこの手法におけるプロトコルを示す。ルーティングテーブルキャッシュにミスヒットすると（図４の４０１）、同じインタフェースにある分散ルーティングテーブルに参照要求を出す（図４の４０２）。ヒットした場合はそのエントリのＳｈａｒｅｄビットマップについて、該当部分（同じインタフェース番号に相当する部分）をマークする。ここで、Ｓｈａｒｅｄとは、複数のインタフェースで同じエントリをシェアしているということを表す。
【００４１】
ミスヒットした場合は別のインタフェースに存在する分散ルーティングテーブルに対し、参照要求を出す（図４の４０３）。ある共有ルーティングテーブルにエントリが存在した場合は、そのエントリのＳｈａｒｅｄビットマップについて、該当部分（その参照要求の発行元インタフェース番号に相当する部分）をマークする（図４の４０４）。ビットマップとなっているため、どのインタフェースがキャッシュしているかを把握できる。例えば、図４においては、インタフェース１の参照要求であるため、ビットマップの１ｂｉｔ目をマークする。エントリを更新した場合は（図４の４０５）、Ｓｈａｒｅｄビットマップに従い、該当するインタフェースのルーティングテーブルキャッシュに対してＩｎｖａｌｉｄａｔｅメッセージを発行しキャッシュを無効化する（図４の４０６）。同時に、Ｓｈａｒｅｄビットマップのマークを消去する。
【００４２】
図５は本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、ホームの位置を示す情報を有するルーティングテーブルキャッシュのエントリ構造を示した構造図である。
【００４３】
図５に示すようにどのインタフェースに存在するデータをキャッシュしているのかを表すホームＩＤを管理する。キャッシュエントリを消去する場合（キャッシュエントリの追い出し等）はホームＩＤに従い、該当するインタフェースの分散ルーティングテーブルのＳｈａｒｅｄビットマップを変更し、エントリを更新しても無駄なＩｎｖａｌｉｄａｔｅメッセージを生成しないようにする。
【００４４】
ここではＩｎｖａｌｉｄａｔｅメッセージによるキャッシュの無効化について述べたが、Ｕｐｄａｔｅメッセージによるキャッシュの更新を行ってもよい。
【００４５】
【実施例４】
実施例３とは異なり、各共有ルーティングテーブルのエントリに識別子を加えて管理する場合について述べる。この手法は更新時の通信量が増大するが管理情報が少なくなる。
【００４６】
図６は本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、単一ビットで共有状況を管理する場合の分散ルーティングテーブルのエントリ構造を示した構造図である。
【００４７】
この場合の共有ルーティングテーブルのエントリの構造例を図６に示す。Ｓｈａｒｅｄビットマップではなく、Ｓｈａｒｅｄフラグを持つ。
【００４８】
図７にこの手法におけるプロトコルを示す。ルーティングテーブルキャッシュにミスヒットすると（図７の７０１）、同じインタフェースにある分散ルーティングテーブルに参照要求を出す（図７の７０２）。ヒットした場合はそのエントリのＳｈａｒｅｄフラグをマークする。ミスヒットした場合は別のインタフェースに存在する分散ルーティングテーブルに対し、参照要求を出す（図７の７０３）。ある共有ルーティングテーブルにエントリが存在した場合は、そのエントリのＳｈａｒｅｄフラグをマークする（図７の７０４）。フラグであるため、どのインタフェースがキャッシュしているかは把握できない。エントリを更新した場合は（図７の７０５）、Ｓｈａｒｅｄフラグがマークされている場合はＩｎｖａｌｉｄａｔｅメッセージをブロードキャストしてキャッシュを無効化する（図７の７０６）。Ｉｎｖａｌｉｄａｔｅメッセージを受け取ったインタフェースで該当するエントリを持つ場合はキャッシュのエントリをＩｎｖａｌｉｄａｔｅする。持たない場合は無視する。同時に、Ｓｈａｒｅｄフラグのマークを消去する。Ｓｈａｒｅｄフラグをローカルインタフェース用とリモートインタフェース用の２つ備えることで、ローカルで解決するエントリの更新についてはブロードキャストを避けることができる。
【００４９】
またルーティングテーブルキャッシュ側については、実施例３同様にどのインタフェースに存在するデータをキャッシュしているのかを表すホームＩＤを管理する。キャッシュエントリを消去する場合（キャッシュエントリの追い出し等）はホームＩＤに従い、該当するインタフェースの分散ルーティングテーブルのＳｈａｒｅｄビットマップを変更し、エントリを更新しても無駄なＩｎｖａｌｉｄａｔｅメッセージのブロードキャストを生成しないようにする。
【００５０】
ここではＩｎｖａｌｉｄａｔｅメッセージによるキャッシュの無効化について述べたが、Ｕｐｄａｔｅメッセージによるキャッシュの更新を行ってもよい。
【００５１】
【実施例５】
図８は本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、特に管理情報を持たないルーティングテーブルキャッシュのエントリ構造を示した構造図である。
【００５２】
実施例３および実施例４において、ルーティングテーブルキャッシュ側についても、図８に示すように管理情報を付加しない手法について述べる。
【００５３】
キャッシュエントリを消去する場合で実施例３と共に用いる場合は（キャッシュエントリの追い出し等）は共有ルーティングテーブルのＳｈａｒｅｄ情報の削除要求がブロードキャストされる。実施例４と共に用いる場合は、キャッシュエントリの追い出し時、特に何もしない。
【００５４】
その他、各共有ルーティングテーブル、ルーティングテーブルキャッシュに管理情報を付加しない手法も考えられる。この手法は管理情報が必要ないが、エントリを更新する場合は必ず他のインタフェースに更新内容をブロードキャストする。
【００５５】
図９に、ＣＣ−ＮＵＭＡのキャッシュレス構造としてＮＵＭＡ型ルーティングテーブルの構成例を示す。ＮＵＭＡ型のルーティングテーブルメモリは各メモリの容量削減に貢献する一方で、キャッシュを持たないため低遅延化は難しい。
【００５６】
【実施例６】
本発明によるＣＯＭＡ型ルーティングテーブル構造の例を述べる。
【００５７】
図１０にＣＯＭＡ型ルーティングテーブルの構成例を示す。ＣＯＭＡ型では、各インタフェース１０１にパケットに関する様々な処理を行うパケット処理部１０２、およびルーティングテーブルキャッシュ１０３を備える。また、各インタフェースの外にパケットの交換を行うスイッチングファブリクス１０５を備える。このスイッチングファブリクスをルーティング情報の交換に共用する場合は、ルーティングテーブル情報を交換する別のファブリクス１０６が必要とならない。共用しない場合は別途ルーティングテーブル情報を交換するファブリクス１０６を備える。
【００５８】
各インタフェースは、独自にルーティングテーブルキャッシュを備えており、このため時間ローカリティおよび位置ローカリティに対応可能である。特に、ＣＯＭＡ型の場合はＣＣ−ＭＵＭＡ型と異なり各データのホーム（あるアドレスが格納されている場所）の移動を許す。ここでデータとは即ちエントリであり、図３でいう横一列の情報の組である。ＣＣ−ＮＵＭＡ型では、各インタフェースにおける経路制御プロトコルにより分散ルーティングテーブルのエントリが設定され、このエントリ（キャッシュではなくオリジナルのエントリを特にホームと呼ぶ）が他のインタフェースの分散ルーティングテーブルに移動しない。ＣＯＭＡ型では、ホームの参照を頻繁に行うリモートルーティングテーブルキャッシュはホームをキャッシュし（これをコピーと呼ぶ）さらに同じエントリ（ホームのコピー）のアクセスを続け、逆にローカルルーティングテーブルキャッシュ（ホーム）のアクセスが滞ると、これまでホームであったエントリが追い出され、コピーがホームとなる。この様に、各エントリがより頻繁に参照されるインタフェースの近くに配置されるようになるため、エントリの配置が利用と共に最適化され結果としてアクセスコストが少なくなる。
【００５９】
この特徴を活かすため、ＣＯＭＡ型でインタフェースの数が多い場合はファブリクス１０６を階層構造とする。
【００６０】
図１１は階層構造をもつファブリクスの例であり２進木構造となっている。各枝にはディレクトリ１１０１が存在する。この２進木構造は、より大きな整数ｎを用いたｎ進木構造でも構わない。以下に、ＣＯＭＡを用いたルーティングテーブルの管理プロトコルを示す。各ディレクトリは、エントリの次の状態を保持する。
【００６１】
ａ）Ｉｎｖａｌｉｄ：該当するエントリにはデータが存在しないことを示す。略称Ｉ状態
ｂ）Ｅｘｃｌｕｓｉｖｅ：他にコピーが存在しない。略称Ｅ状態
ｃ）Ｓｈａｒｅｄ：他にコピーが存在する可能性がある。略称Ｓ状態
ｄ）Ｒｅａｄｉｎｇ：上位階層に読み出し要求を出して待ち状態である。略称Ｒ状態
ｅ）Ａｎｓｗｅｒｉｎｇ：下位階層に読み出しもしくは書き込み要求を出して待ち状態である。略称Ａ状態
ｆ）Ｗｒｉｔｉｎｇ：上位階層に書き込み要求を出して待ち状態である。略称Ｗ状態。
【００６２】
まず、読み出し時の動作について述べる。ＥおよびＳ状態のエントリは、自由に読み出しできる。読み出しにミスした場合は、該当するエントリを加えてＲ状態とし、上位ディレクトリに問い合わせる。読み出しにヒットした場合はＡ状態とし下位階層に問い合わせる。存在しない場合はＲ状態とし、さらに上位階層に問い合わせる。この動作を再帰的に繰り返すことで、やがてホームもしくはコピーにたどり着く。このエントリの情報をＲおよびＡのエントリを元に階層を戻り、各エントリをＳｈａｒｅｄ状態に変えながらアクセスがあったインタフェースまで送る。
【００６３】
次に、書き込み時、すなわち更新時の動作について述べる。Ｅ状態のエントリは自由に書き込み可能である。Ｓ状態もしくは書き込みミスの場合はホームもしくはコピーを検索しデータを更新する必要がある。Ｓ状態の場合はエントリを更新した後Ｗ状態とし、上位ディレクトリに問い合わせる。書き込みミスの場合は直接上位ディレクトリへの問い合わせを行う。その階層にエントリが存在した場合はＡ状態とし下位階層に問い合わせる。存在しない場合はＷ状態とし、さらに上位階層に問い合わせる。この動作を再帰的に繰り返すことで、ホーム、もしくはコピーにたどり着く。このエントリの情報をＷおよびＡのエントリを元に階層を戻り、Ａ状態のエントリのみＳｈａｒｅｄ状態に変えながらアクセスがあったインタフェースまで送る。ホームもしくはコピーにたどり着かず、最上位階層に達したときは自分がＥｘｃｌｕｓｉｖｅとして保持すべき情報であるため、Ｗ状態をＳ状態に変えながら階層を下り、Ｅ状態として新たにエントリを加える。
【００６４】
【実施例７】
あるインタフェースのルーティングテーブルキャッシュのエントリが一杯になり、それ以上登録できなくなった場合、次の破棄を回避する手法がある。
１．他のあいているエントリをブロードキャストで探す
ブロードキャストメッセージを用いて、他のルーティングテーブルキャッシュの空きエントリを探し移管する。空きのあるルーティングテーブルキャッシュはエントリを予約確保し、応答メッセージを返す。追加して登録できるエントリがないインタフェースは、最も早く帰ってきた応答メッセージに従ってエントリを渡す。このメッセージもブロードキャストされ、移管対象となったルーティングテーブルキャッシュはエントリを予約領域に登録し、移管対象とならなかったルーティングテーブルキャッシュは予約を解除する。
２．各インタフェースに順に移管要求を出す
ブロードキャストメッセージを用いる手法は移管作業が迅速に行える半面、ブロードキャストを用いるため通信が混雑する可能性がある。そこで、各インタフェースに順に移管要求を出し逐次確認をとる手法がある。移管作業に時間が掛かる可能性があるが、ブロードキャストメッセージを用いる必要がない。
３．メモリを別途用意し、そこに保存する。
【００６５】
ＵＭＡやＮＵＭＡ，ＣＣ−ＮＵＭＡと構造上の差異が小さいが、このメモリはあくまでもＣＯＭＡのエントリ不足を補うためのもので、スワップスペースとして利用される。
【００６６】
【発明の効果】
本発明によるＮＵＭＡ，ＣＣ−ＮＵＭＡ，ＣＯＭＡ型ルーティングメモリ管理方式を用いることにより、ルーティングテーブルメモリ量の削減が可能となる。また、ＣＣ−ＮＵＭＡ、ＣＯＭＡ型といったキャッシュを用いる方式では、各種ローカリティに対応した高速なルーティングテーブル参照を可能とする。
【００６７】
本発明によるルーティングメモリ管理方式はスケーラブルであり、特に大規模かつスループットの高いルータを構築する際に重要なものである。
【００６８】
高度情報化社会においては、通信速度においてもルーティングテーブル容量においてもより大容量のルータが必要である。この様なルータの実効スループットを制限する要因の１つである、ルーティングテーブル参照時間の増大を緩和することで性能向上を図ることが可能となる。
【図面の簡単な説明】
【図１】本発明におけるルーティングテーブルメモリの管理構造の例として、ＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造を表したブロック図である。
【図２】本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造上でアドレス検索を行う手順を示した状態遷移図である。
【図３】本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、ビットマップで共有状況を管理する場合の分散ルーティングテーブルのエントリ構造を示した構造図である。
【図４】本発明におけるビットマップを用いたＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造上で更新作業が行われた場合の処理の流れをしめしたフロー図である。
【図５】本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、ホームの位置を示す情報を有するルーティングテーブルキャッシュのエントリ構造を示した構造図である。
【図６】本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、単一ビットで共有状況を管理する場合の分散ルーティングテーブルのエントリ構造を示した構造図である。
【図７】本発明における単一ビットを用いたＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造上で更新作業が行われた場合の処理の流れをしめしたフロー図である。
【図８】本発明におけるＣＣ−ＮＵＭＡ型ルーティングテーブル管理構造について、特に管理情報を持たないルーティングテーブルキャッシュのエントリ構造を示した構造図である。
【図９】本発明におけるルーティングテーブルメモリの管理構造の例として、ＮＵＭＡ型ルーティングテーブル管理構造を表したブロック図である。
【図１０】本発明におけるルーティングテーブルメモリの管理構造の例として、ＣＯＭＡ型ルーティングテーブル管理構造を表したブロック図である。
【図１１】本発明におけるＣＯＭＡ型ルーティングテーブル管理機構における階層構造を持つファブリクスの例として、２進木構造の例を示した構造図である。
【図１２】従来のＵＭＡ型ルーティングテーブル管理機構を表したブロック図である。
【符号の説明】
１０１インタフェース
１０２パケット処理部
１０３ルーティングテーブルキャッシュ
１０４分散ルーティングテーブル
１０５パケット交換用スイッチングファブリクス
１０６管理情報交換用スイッチングファブリクス
Ｓ２０１ハッシュの演算状態
Ｓ２０２ルーティングテーブルキャッシュの参照状態
Ｓ２０３ローカル分散ルーティングテーブルの参照状態
Ｓ２０４リモート分散ルーティングテーブルの参照状態
Ｓ２０５宛先不明時の処理状態
Ｓ２０６パケットのフォワーディング状態
Ｓ２０７パケットのフォワーディングとキャッシュ登録を行う状態
４０１ルーティングテーブルキャッシュにミスヒットしたアクション
４０２ローカル分散ルーティングテーブルに参照要求を出すアクション
４０３リモート分散ルーティングテーブルに参照要求を出すアクション
４０４Ｓｈａｒｅｄビットマップにマーキングするアクション
４０５エントリの更新アクション
４０６Ｉｎｖａｌｉｄａｔｅメッセージの発行アクション
７０１ルーティングテーブルキャッシュにミスヒットしたアクション
７０２ローカル分散ルーティングテーブルに参照要求を出すアクション
７０３リモート分散ルーティングテーブルに参照要求を出すアクション
７０４Ｓｈａｒｅｄフラグにマーキングするアクション
７０５エントリの更新アクション
７０６Ｉｎｖａｌｉｄａｔｅメッセージの発行アクション
１１０１ディレクトリ
１２０１ルーティングテーブル
１２０２共有ルーティングテーブル。
[0001]
[Technical Field of the Invention]
The present invention relates to low-latency, high-capacity packet switching systems in communications, and in particular to the routing table needed to determine packet routes in such a system and to a scheme for managing and operating that routing table. The packets handled are, in particular, IP packets and Ethernet (registered trademark) frames.
[0002]
[Prior art]
[Non-patent document 1]
"Multi-zone Caches for Accelerating IP Routing Table Lookups", I. L. Chvets and M. H. MacGregor, pp. 121-126, in Proceedings of HPSR 2002, May 2002.
In the communications field, various router and switch architectures have been proposed to meet demands for wider bandwidth, higher quality, and lower latency. The routing table in particular is central to making future networks more capable, faster, higher in quality, and lower in latency, and its performance determines the performance of the router and of the network.
[0003]
FIG. 12 is a configuration diagram of a conventional basic router, and the basic packet processing procedure is described with reference to it. A packet arriving at one of the router's several interfaces 101 has its destination information examined by the packet processing unit 102. The routing table 1201 is looked up using this destination information to obtain the output interface and the information needed for QoS. This information is passed to the switching fabric 105, which transfers the packet to its destination, the output interface. A crossbar is often used as the switching fabric 105. An arbiter attached to the crossbar accepts requests and arbitrates among them, taking QoS information, priorities, and the like into account. The crossbar is configured according to the arbitration result, connecting input interfaces to output interfaces.
[0004]
A router may have only a single shared routing table 1202 (sometimes called a primary routing table or default routing table), or it may additionally have distributed routing tables 1201 (sometimes called secondary routing tables or simple routing tables) spread across the interfaces.
[0005]
The contents of a distributed routing table 1201 are identical to those of the shared routing table 1202; its purpose is to spread out accesses to the shared routing table. When it is implemented as a cache, it is called a routing table cache. For prior art on routing table caching, see "Multi-zone Caches for Accelerating IP Routing Table Lookups", I. L. Chvets and M. H. MacGregor, pp. 121-126, in Proceedings of HPSR 2002, May 2002. This paper was the first to examine caches specialized for IP. It proposes a Multi-zone Cache that exploits the temporal locality and spatial locality present in IP traffic. Temporal locality means that once a router sees an access for a given IP address, further accesses for the same IP address are likely to occur many times within a short period. Spatial locality means that when a router forwards a packet with a given IP address to a certain destination (for example, an interface number), packets with nearby IP addresses (addresses that differ only in the lowest few bits) are likely to be forwarded to the same destination.
[0006]
In the technique of FIG. 12, the shared routing table and the other routing tables hold the same contents, and the contents of the shared routing table are copied to the other routing tables as needed. In this conventional example, the routing tables of different interfaces do not communicate with one another.
[0007]
If route information is managed centrally in a single shared routing table for the whole router (1202 in FIG. 12), the table cannot cope with the concentration of lookup requests that accompanies increases in router processing speed. Configurations that provide a cache at each interface (a routing table cache) have therefore been proposed.
JP-A-2002-164899, JP-A-2000-83055, JP-A-11-289435, and JP-A-11-122289, among others, propose distributed routing tables equivalent to routing table caches. These have both a single shared routing table for the whole router and a routing table cache at each interface. The shared memory is reached through a common bus or switch, and this kind of memory management architecture is called UMA (Uniform Memory Access model).
[0008]
[Problems to be solved by the invention]
An object of the present invention is to improve a routing table in a router and its reference mechanism, and to increase the capacity, reduce the delay, and reduce the cost of the router. This requires simplification and miniaturization of the memory used for the routing table. As a result of miniaturization of the memory, cost reduction can be achieved by reducing the number of chips and the board area, and the operation speed is improved, so that the throughput is improved.
[0009]
In the conventional UMA-type routing table management structure, the routing table (1201 in FIG. 12) and the shared routing table (1202) are implemented in different locations but have the same contents. This structure was devised so that load does not concentrate on the shared routing table: each interface holds the same contents, which spreads out accesses that would otherwise go to the shared table.
[0010]
The present invention focused on the following problems in the conventional method.
1. The shared routing table is the union of the routing tables at the individual interfaces. It is therefore large, incurs search cost on every lookup as well as implementation cost, and does not help improve effective throughput or lower the price. Giving each interface the same contents as the shared routing table adds further cost and makes it difficult to keep up with the growth of routing tables as networks expand.
[0011]
When both a shared routing table and routing table caches are provided, the amount of information the shared routing table must hold becomes enormous if the routing information differs greatly from interface to interface. In general, access cost rises as the amount of information in a routing table grows, so in a large, high-speed router the access cost of the shared routing table becomes a problem and it is difficult to improve effective throughput.
2. When routing table lookups show strong interface-dependent locality (position locality), the shared routing table is referenced only rarely, so it is difficult to obtain a performance gain that justifies the hardware cost of a huge shared routing table.
3. A routing table cache must take into account not only temporal locality and spatial locality but also the position locality described above and, in addition, address locality. Address locality means that valid or frequently used IP addresses and MAC addresses are not distributed uniformly over the address space but are clustered. The hash function of a routing table cache should therefore not be a simple one that merely gathers some of the bits of the given IP address; a hash function that accounts for address locality is needed. The Multi-Zone cache makes effective use of spatial locality but does not consider position locality or address locality.
4. If a hash is costly to implement or to compute, it is difficult to give each Zone a different hash. JP-A-2001-320420 uses the result of a hash lookup for traffic adjustment, and its hash computes the CRC of the entire IP address, so it does not support Zones.
5. JP-A-2001-326679 concerns a routing table search technique that applies to routing table lookup the inode scheme, a multi-level indirect reference scheme used as standard in the file systems of UNIX (registered trademark) and similar systems. The scheme is effective when the target is very large, as in file systems at the lower levels of the storage hierarchy (hard disks target capacities on the order of gigabytes or terabytes), but because it requires multiple references it is slow.
[0012]
[Means for Solving the Problems]
In order to solve the above-mentioned problems, the present invention uses the following means.
1. No shared routing table is provided; instead, routing tables are distributed across the ports. That is, unlike the example of FIG. 12, each routing table holds a different combination of information. Because each routing table can have its own contents, each can be optimized independently by exploiting locality. This structure is called NUMA (Non-Uniform Memory Access model). A further option is CC-NUMA (Cache Coherent NUMA), which adds a cache to NUMA. Alternatively, routing table caches alone are distributed across the ports; this structure is called COMA (Cache Only Memory Architecture). Concretely, the router has a plurality of interfaces, each of which has a packet processing unit that processes input packets and a routing table that stores pairs of packet-dependent information and packet destination and is looked up with the packet-dependent information as the key to find the destination. When a lookup misses in the routing table of a first interface, the routing table of a second interface is consulted to find the destination. This configuration avoids the problem of routing tables growing ever larger as networks expand. The packet-dependent information is, for example, an IP address or a MAC address.
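The miss-then-ask-another-interface behaviour described above can be sketched as follows. This is a minimal illustration only: the `Interface` class, the `numa_lookup` function, and the use of exact-match dictionary keys are assumptions made for the example (a real router would use longest-prefix matching), not structures taken from the patent.

```python
from typing import Dict, List, Optional


class Interface:
    """Hypothetical interface that owns its own distributed routing table."""

    def __init__(self, ident: int) -> None:
        self.ident = ident
        # key: packet-dependent information (e.g. an IP address string)
        # value: destination (e.g. an output interface number)
        self.table: Dict[str, int] = {}


def numa_lookup(interfaces: List[Interface], arrival: int, key: str) -> Optional[int]:
    """Search the arrival interface's table first, then the other interfaces."""
    local = interfaces[arrival]
    if key in local.table:
        return local.table[key]
    for other in interfaces:
        if other is not local and key in other.table:
            return other.table[key]
    return None  # destination unknown


ifaces = [Interface(i) for i in range(3)]
ifaces[0].table["10.0.0.1"] = 2   # held only by interface 0
ifaces[1].table["10.0.1.7"] = 0   # held only by interface 1
```

With this setup, a packet arriving at interface 0 for `10.0.1.7` misses locally and is resolved from interface 1's table, while no table has to hold the union of all entries.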
[0013]
Each routing table may have a routing table cache that stores a copy of part of the information the routing table holds. When a lookup misses in a routing table cache, the routing table corresponding to that cache is consulted to find the destination. This is the CC-NUMA configuration.
[0014]
Alternatively, each routing table may be configured as a routing table cache. When a lookup misses in the routing table cache of the interface at which the packet arrived, the routing table caches of the other interfaces are consulted to find the destination, and on a hit the matching data is copied into the first routing table cache. This is the COMA configuration.
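A rough sketch of this copy-on-remote-hit behaviour is given below. The function name `coma_lookup` and the representation of each cache as a plain dictionary are assumptions for illustration; eviction and the migration of an entry's home are deliberately left out.

```python
from typing import Dict, List, Optional


def coma_lookup(caches: List[Dict[str, int]], arrival: int, key: str) -> Optional[int]:
    """Look up `key` in the arrival interface's cache, then in the other caches.

    On a remote hit, the entry is copied into the arrival interface's cache
    so that the next lookup for the same key is answered locally.
    """
    local = caches[arrival]
    if key in local:
        return local[key]
    for idx, cache in enumerate(caches):
        if idx != arrival and key in cache:
            local[key] = cache[key]  # copy the hit entry into the first cache
            return local[key]
    return None  # no cache holds the entry


caches: List[Dict[str, int]] = [{}, {"10.0.1.7": 3}]
first = coma_lookup(caches, 0, "10.0.1.7")   # remote hit, entry copied to cache 0
second = coma_lookup(caches, 0, "10.0.1.7")  # now a local hit
```

After the first call the entry exists in both caches, which is why a coherence mechanism (invalidation or update messages) becomes necessary when an entry changes.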
[0015]
When the second routing table returns a lookup result, the result may be stored in the first routing table, which gives a caching effect.
[0016]
In addition, if a first path that transfers input packets between interfaces and a second path that transfers routing table contents between interfaces are provided independently, the transfer speed of both can be improved.
[0017]
Further, a mixing means can be added to the hash function of the routing table cache to improve memory utilization.
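To illustrate why mixing helps, the sketch below compares a hash that merely selects some address bits with one that mixes all bits through a CRC; the description later names a CRC remainder as the mixing function. The specific choices here are assumptions for the example: Python's `zlib.crc32` (CRC-32) stands in for whatever generator polynomial an implementation would pick, the index is 8 bits wide, and the function names are hypothetical.

```python
import ipaddress
import zlib


def bit_select_hash(addr: str) -> int:
    """Collect every 4th bit (bit indexes 0, 4, ..., 28 from the MSB) into 8 bits."""
    value = int(ipaddress.IPv4Address(addr))
    out = 0
    for pos in range(0, 32, 4):
        out = (out << 1) | ((value >> (31 - pos)) & 1)
    return out


def crc_mix_hash(addr: str, bits: int = 8) -> int:
    """Mix all 32 address bits with CRC-32 and keep the low `bits` bits."""
    packed = ipaddress.IPv4Address(addr).packed  # 4 bytes, network byte order
    return zlib.crc32(packed) & ((1 << bits) - 1)


# 256 neighbouring addresses that differ only in the last octet.
neighbours = [f"192.168.0.{x}" for x in range(256)]
bit_buckets = len({bit_select_hash(a) for a in neighbours})
crc_buckets = len({crc_mix_hash(a) for a in neighbours})
```

For these clustered addresses the bit-selection hash sees only two varying bits, so `bit_buckets` is 4, whereas the CRC-based index spreads the same addresses over many more cache slots.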
[0018]
When the contents of a routing table are changed, the entries in the other routing tables that hold the same entry may be invalidated, or the change may be notified to the routing tables in the other interfaces by an update message.
[0019]
According to another aspect of the present invention, there is provided a router having a plurality of packet processing units and a plurality of routing tables, one corresponding to each packet processing unit. The first routing table is used first to search for the destination of a packet processed by the corresponding first packet processing unit, and if that search misses in the first routing table, the search is made in the second routing table. Likewise, the second routing table is used first to search for the destination of a packet processed by the second packet processing unit, and if that search misses in the second routing table, the search is made in the first routing table.
[0020]
According to another aspect of the present invention, there is provided a routing method for a router including a plurality of interfaces each receiving packets, wherein each of the interfaces includes a packet processing unit that processes an input packet, a routing table for searching for a destination based on the address of the input packet, and a routing table cache that stores at least part of the table stored in the routing table. The method comprises a first step of searching for a destination in the routing table cache based on the address of a packet input to an interface; a second step of searching for the destination in the routing table of that interface when the destination cannot be found in the first step; and a third step of searching for the destination in the routing tables of the other interfaces when the destination cannot be found in the second step.
2. If the routing table structure is NUMA, with a distributed routing table managed independently by each interface, position locality can be exploited. As a result, less memory is needed than for a shared routing table. If CC-NUMA is used, the entries referred to by an interface are cached at that interface regardless of which interface manages them, so position locality can be exploited effectively and an improvement in lookup speed can also be expected. If the routing table management structure is COMA, each piece of routing information does not stay in a fixed place but moves to the place where it is accessed frequently, so the routing information ends up ideally placed. Management therefore becomes more complicated, but position locality can be exploited effectively.
3. To make effective use of address locality, the hash function must stir (mix) the address bits rather than simply extract certain bits, as commonly used hash functions do. For example, the IP address 10.113.53.152 is written in binary as 00001010.01110001.00110101.10011000. Indexing a memory with all of these bits would require a 32-bit address space, which is difficult to implement. Consider instead a hash function that extracts every fourth bit (bit positions 0, 4, 8, ..., 28, counted from the most significant bit): the hash value becomes 01000011, and the memory is looked up with this value. However, given the locality of addresses, hash values obtained in this way are very likely to be similar to one another, so the memory cannot be used efficiently. A stirring means is therefore required. Because complicated stirring is costly in hardware and processing, a hash function that simply takes the remainder of a CRC (Cyclic Redundancy Check) is used. Further, to exploit spatial locality effectively in a single cache without providing several identical cache structures, a different hash structure is provided for each zone and a dual-port memory is used.
4. With a CRC, good stirring is obtained through the choice of generator polynomial, the numbers of bits before and after conversion can be chosen freely (provided the number of bits after conversion is smaller than the number before conversion), and both the implementation cost and the processing cost are small. A CRC is also easy to parallelize and can be computed at high speed. The CRC is used as the hash function of the IP address for the routing table cache.
5. Speed is gained by looking up the routing table cache not through indirect references but with a single hash-indexed reference, as in an ordinary cache. The cache is set-associative.
According to one aspect of the present invention, a router for communication includes the arbiter for ensuring bandwidth and real-time performance and an associated or integrated switch.
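The contrast drawn in item 3 between bit extraction and CRC stirring can be illustrated with a minimal sketch. The function names, the choice of the leading bit of each 4-bit group, and the 8-bit generator polynomial 0x07 are illustrative assumptions; the description does not fix a polynomial or a hash width.

```python
import ipaddress

def bit_extract_hash(addr: int) -> int:
    """Bit extraction: keep the leading bit of each 4-bit group (8-bit result)."""
    h = 0
    for pos in range(0, 32, 4):            # bit positions 0, 4, ..., 28 from the MSB
        h = (h << 1) | ((addr >> (31 - pos)) & 1)
    return h

def crc8_hash(addr: int, poly: int = 0x07) -> int:
    """Stirring: remainder of the address under an 8-bit CRC (polynomial assumed)."""
    crc = 0
    for byte in addr.to_bytes(4, "big"):
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

# Eight neighbouring addresses, 10.113.53.152 to 10.113.53.159
base = int(ipaddress.IPv4Address("10.113.53.152"))
neighbours = [base + i for i in range(8)]
extract_slots = {bit_extract_hash(a) for a in neighbours}
crc_slots = {crc8_hash(a) for a in neighbours}
print(len(extract_slots), len(crc_slots))
```

In this sketch the eight neighbouring addresses all fall into a single slot under bit extraction, whereas the CRC remainder places each of them in a different slot, which is the effect the stirring is meant to achieve.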
[0021]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings. The following embodiments use a router as an example. However, the present invention is not limited to routers; where equivalent localities exist, applying the present invention can be expected to increase capacity, reduce delay, and reduce cost.
[0022]
Embodiment 1
An example of the CC-NUMA type routing table structure and operation according to the present invention will be described.
[0023]
FIG. 1 shows a configuration example of a CC-NUMA type routing table. In the CC-NUMA type, each interface 101 includes a packet processing unit 102 that performs various processes on a packet, a routing table cache 103, and a distributed routing table 104. Further, a switching fabric 105 for exchanging packets is provided outside each interface. When the routing table cache 103 is not provided, it is called a NUMA type routing table structure.
[0024]
When an interface refers to the distributed routing table 104 in another interface (a remote distributed routing table), the signal lines used for transmitting and receiving packets can be shared, or dedicated signal lines can be provided separately. When the lines are shared, a separate fabric 106 for exchanging routing table information is not required, so additional hardware cost and board wiring cost are avoided. However, bandwidth may be reduced by contention between packet traffic and routing information traffic. When the lines are not shared, a separate fabric 106 for exchanging routing table information is provided as shown in FIG. Examples of fabrics include crossbars and rings (bidirectional rings, unidirectional rings).
[0025]
Each interface has its own distributed routing table and routing table cache. Providing a cache addresses temporal locality: when the same address is accessed repeatedly within a short time, almost all of those accesses hit the cache, so fast address lookup can be expected. The arrangement also addresses position locality: even when each interface has its own characteristic access pattern, the distributed routing table and routing table cache provided independently for each interface can be expected to optimize their entries for the access pattern of that interface.
[0026]
The operation of the routing table cache differs slightly depending on whether the address to be handled is a layer 3 IP address or a layer 2 MAC address.
1. When performing layer 3 switching (layer 3 forwarding)
In layer 3 switching, routing is performed using an IP address. For IP addresses, temporal locality, position locality, spatial locality, and address locality all apply. Temporal locality and position locality are handled by the distributed routing table and the routing table cache. To take advantage of spatial locality, a different CRC stirring hash is used for each zone. Here, zones are formed by choosing a boundary prefix length and classifying each entry according to whether its prefix is longer or shorter than that boundary; each class is treated as a separate zone and is stirred with a different CRC. A CRC operates at high speed with a small implementation cost. When stirring is performed by the CRC, the addresses are distributed randomly, so that densely used regions of the IP address space are spread out. This prevents cache performance from deteriorating because registrations concentrate at particular cache addresses, and the cache addresses are accessed uniformly, so the cache memory can be used effectively.
[0027]
FIG. 2 shows a flowchart of access to the routing table cache 103 and the distributed routing table 104. IPv4 is used as the example here, but the same applies to IPv6. The description assumes that zones are distinguished at a mask-length boundary of 22 for IPv4 addresses: if the mask length is smaller than 22, the entry is registered in the Zone-A cache using hash A, and if it is larger, the entry is registered in the Zone-B cache using hash B. These processes are controlled by the control unit 107 in software or hardware.
[0028]
Suppose that the IP address 192.168.34.56 is accessed. First, in step S201, a hash value is obtained for each zone. Because the entries are divided into two zones at the prefix boundary, each zone has its own hash generation and search structure, and if the cache has a dual-port configuration, the two zone-specific hash values can be looked up simultaneously. In S202, the routing table cache is referred to. On a hit, the packet is forwarded in S206. On a miss, the distributed routing table on the same interface (the local distributed routing table) is referred to in S203. On a miss there, the distributed routing table of another interface (a remote distributed routing table) is referred to in S204. If that also misses, the processing for an unknown destination is performed in S205; specifically, the packet is, for example, forwarded to the default route or discarded.
[0029]
If the local distributed routing table or the remote distributed routing table is hit, the packet is forwarded in S207. At the same time, registration in the cache is also performed. At the time of registration, reference is made not only to the destination but also to the mask value.
[0030]
On the Internet, addresses that share the same IP address prefix are treated as a group. The mask value specifies how many leading bits make up that common prefix. For example, on WINDOWS XP (registered trademark) or NT-family systems, running the ipconfig command from a command line displays a mask value, called the subnet mask, together with the IP address. In the following, a Zone is a class defined by the length of the mask value, that is, by how much of the prefix is examined.
[0031]
When the hit entry is 192.168.0.0/16, the mask length is 16, which is less than 22, so the destination information for the address 192.168.34.56 is registered in the Zone-A cache using hash A. When the hit entry is 192.168.34.0/24, the mask length is 24, which is larger than 22, so the entry is registered in the Zone-B cache using hash B.
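The zone selection at registration time can be sketched as follows. This is only an illustration under stated assumptions: the names `register`, `hash_a`, and `hash_b` are hypothetical, the two "different CRCs" are imitated by prefixing a zone-specific seed byte before `zlib.crc32` rather than by using two generator polynomials, the 10-bit index width is arbitrary, and a mask length of exactly 22 is placed in Zone B because the description leaves that case open.

```python
import ipaddress
import zlib

ZONE_BOUNDARY = 22  # mask-length boundary used in the description

def crc_hash(addr: int, seed: bytes, bits: int = 10) -> int:
    """Stand-in for a per-zone CRC: CRC-32 over a zone seed plus the address."""
    return zlib.crc32(seed + addr.to_bytes(4, "big")) & ((1 << bits) - 1)

def hash_a(addr: int) -> int:
    return crc_hash(addr, b"A")

def hash_b(addr: int) -> int:
    return crc_hash(addr, b"B")

zone_a = {}  # Zone-A cache: index -> (address, output interface)
zone_b = {}  # Zone-B cache: index -> (address, output interface)

def register(dest_ip: str, hit_prefix: str, out_if: int) -> str:
    """Register the exact destination address in the zone chosen by the mask length."""
    addr = int(ipaddress.IPv4Address(dest_ip))
    mask_len = ipaddress.IPv4Network(hit_prefix).prefixlen
    if mask_len < ZONE_BOUNDARY:
        zone_a[hash_a(addr)] = (addr, out_if)
        return "A"
    zone_b[hash_b(addr)] = (addr, out_if)   # assumption: length 22 also goes here
    return "B"

zone_for_16 = register("192.168.34.56", "192.168.0.0/16", 3)
zone_for_24 = register("192.168.34.56", "192.168.34.0/24", 5)
```

The output interface numbers 3 and 5 are made-up values used only to distinguish the two registrations.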
2. When performing layer 2 switching
In layer 2 switching, routing is performed using a MAC address. As for the MAC address, temporal locality, spatial locality, and address locality are valid. The operation in this case can also be described using the flowchart in FIG.
[0032]
Since a MAC address has no mask value, zones are not distinguished, and stirring is performed simply by a CRC. Suppose that the MAC address 00:11:AA:22:BB:33 is accessed. A hash value is obtained in step S201. In S202, the routing table cache is referred to. On a hit, the packet is forwarded in S206. On a miss, the distributed routing table on the same interface (the local distributed routing table) is referred to in S203. On a miss there, the distributed routing table of another interface (a remote distributed routing table) is referred to in S204. If that also misses, the processing for an unknown destination is performed in S205; specifically, the packet is, for example, forwarded to the default route or discarded.
[0033]
If the local distributed routing table or the remote distributed routing table is hit, in step S207, the packet is forwarded and registered in the routing table cache according to the hash value.
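The lookup order of FIG. 2 (S201 to S207) can be sketched for the layer 2 case as follows. The class name `Interface`, the 8-bit cache index, and the use of `zlib.crc32` as the CRC are assumptions made for illustration; the cache here is direct-mapped for brevity, whereas the description specifies a set-associative cache.

```python
import zlib

class Interface:
    def __init__(self, table):
        self.table = dict(table)   # local distributed routing table
        self.cache = {}            # routing table cache: slot -> (key, destination)
        self.peers = []            # other interfaces (remote distributed tables)

    def lookup(self, key: bytes):
        slot = zlib.crc32(key) & 0xFF                 # S201: hash operation
        cached = self.cache.get(slot)
        if cached and cached[0] == key:               # S202: cache hit
            return cached[1], "S206"
        dest = self.table.get(key)                    # S203: local table
        if dest is None:
            for peer in self.peers:                   # S204: remote tables
                dest = peer.table.get(key)
                if dest is not None:
                    break
        if dest is None:
            return None, "S205"                       # destination unknown
        self.cache[slot] = (key, dest)                # S207: forward and register
        return dest, "S207"

mac = bytes.fromhex("0011AA22BB33")
if0, if1 = Interface({}), Interface({mac: 4})
if0.peers = [if1]
first = if0.lookup(mac)    # misses the cache and local table, hits the remote table
second = if0.lookup(mac)   # served from the routing table cache
```

The second lookup ends in S206 because the first one registered the result in the cache of the requesting interface.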
[0034]
Embodiment 2
A procedure for performing route information processing on the CC-NUMA type routing table according to the present invention will be described. Route information processing such as BGP (Border Gateway Protocol) or OSPF (Open Shortest Path First) must convey the router's own route information to its peers, and therefore must read the routing table. In the shared routing table method, it is only necessary to read the contents of the shared routing table. In the distributed routing table method, however, the information is dispersed, so a separate procedure for collecting it is required. The following policy is therefore adopted. The route control processing itself is performed by dedicated hardware or by a processor capable of accessing each distributed routing table.
1. Information received by a certain interface in the routing process is managed only in the local distributed routing table. Therefore, no notification is made to the remote distributed routing table.
2. When an interface sends information associated with the route control processing and needs the contents of the remote distributed routing table, the contents of the corresponding remote distributed routing table are acquired and notified.
[0035]
According to this procedure, information comprehensively including the contents of the entire distributed routing table can be obtained. The same applies to the COMA type routing table. Basically, the routing information is handled only by the locally existing home, and the content is acquired and notified only when it is necessary to refer to the home of the remote distributed routing table.
[0036]
Cache injection (speculative registration in the cache) is also possible in accordance with these transfer procedures. Because the cache is searched by exact match, an address with a mask attached, as exchanged in normal route control processing, cannot be registered directly. Therefore, either a value obtained by filling the masked portion with an appropriate value, or an address that is specified without a mask in IPv4 or IPv6, is cached.
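Filling the masked portion to obtain an exact-match key can be sketched as follows. The function name `injection_key` and the way the filler value is chosen are hypothetical; the description does not say which value is "appropriate".

```python
import ipaddress

def injection_key(prefix: str, host_bits: int) -> str:
    """Turn a masked route into an exact-match address by filling the host part."""
    net = ipaddress.IPv4Network(prefix)
    host_mask = int(net.hostmask)                      # bits not covered by the mask
    value = int(net.network_address) | (host_bits & host_mask)
    return str(ipaddress.IPv4Address(value))

key = injection_key("192.168.34.0/24", 56)
```

For a route that carries a full-length mask, the host mask is zero and the address is returned unchanged, which corresponds to the case of an address specified without a mask.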
[0037]
Embodiment 3
When existing data is updated as new data is added to a distributed routing table by the route control processing of the second embodiment, the contents of the routing table caches in the other interfaces that share that information must also be updated. This procedure is described here for the case where bitmap information is added to each entry of each distributed routing table. Managing the sharing state with bitmap information increases the amount of management information, but the finer-grained management reduces the amount of communication at update time.
[0038]
FIG. 3 is a structural diagram showing the entry structure of a distributed routing table in a case where the sharing status is managed by a bitmap, in the CC-NUMA type routing table management structure of the present invention.
[0039]
FIG. 3 shows an example of the structure of an entry of the distributed routing table corresponding to FIG. The entry includes the IP address, the destination interface number, QoS information, Shared bitmap information, and other additional information such as management information and mask value information for management using a Patricia tree structure or the like. The IP address is managed by integrating the destination and the destination. The QoS information describes, for example, the occupied bandwidth and the QoS class (how the bandwidth is guaranteed). The Shared bitmap information indicates whether the entry is shared with other interfaces. For example, if there are eight interfaces and the second and seventh interfaces hold the same entry, the Shared bitmap information becomes 01000010.
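The Shared bitmap of such an entry can be sketched as follows. The class `TableEntry`, its field names, and the QoS value are hypothetical; the sketch assumes 1-based interface numbers with interface 1 as the leftmost bit, and omits the additional management fields.

```python
from dataclasses import dataclass

@dataclass
class TableEntry:
    ip_address: str
    dest_interface: int
    qos: str
    shared: int = 0          # Shared bitmap held as an integer

    def mark(self, interface_no: int, num_interfaces: int = 8) -> None:
        """Set the bit of a sharing interface (interface 1 is the leftmost bit)."""
        self.shared |= 1 << (num_interfaces - interface_no)

    def bitmap(self, num_interfaces: int = 8) -> str:
        return format(self.shared, f"0{num_interfaces}b")

entry = TableEntry("192.168.34.56", 3, "best-effort")
entry.mark(2)
entry.mark(7)
bitmap_text = entry.bitmap()
```

With eight interfaces and the second and seventh marked, `bitmap_text` is the string 01000010 given in the example above.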
[0040]
FIG. 4 shows the protocol of this method. On a miss in the routing table cache (401 in FIG. 4), a reference request is issued to the distributed routing table on the same interface (402 in FIG. 4). On a hit, the corresponding bit (the bit for that same interface number) is marked in the Shared bitmap of the entry. Here, "Shared" indicates that the same entry is shared by a plurality of interfaces.
[0041]
On a miss, a reference request is issued to the distributed routing tables on the other interfaces (403 in FIG. 4). When the entry exists in one of those distributed routing tables, the corresponding bit (the bit for the interface number of the issuer of the reference request) is marked in the Shared bitmap of the entry (404 in FIG. 4). Because a bitmap is used, it is possible to know which interfaces are caching the entry. For example, in FIG. 4 the reference request comes from interface 1, so the first bit of the bitmap is marked. When the entry is updated (405 in FIG. 4), an Invalidate message is issued to the routing table cache of each interface indicated by the Shared bitmap to invalidate the cached copy (406 in FIG. 4). At the same time, the marks in the Shared bitmap are cleared.
[0042]
FIG. 5 is a structural diagram showing an entry structure of a routing table cache having information indicating a home position in the CC-NUMA type routing table management structure according to the present invention.
[0043]
As shown in FIG. 5, the routing table cache manages, for each entry, a home ID indicating the interface whose data is cached. When a cache entry is erased (for example, evicted), the Shared bitmap of the distributed routing table of the interface given by the home ID is changed, so that no useless Invalidate message is generated when the entry is later updated.
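The bitmap-based protocol (403 to 406 in FIG. 4) together with the home ID handling can be sketched as follows. All names are hypothetical, a Python set of interface numbers stands in for the Shared bitmap, and message passing over the fabric is replaced by direct method access.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.table = {}   # addr -> [destination, set of sharer ids]
        self.cache = {}   # addr -> (destination, home id)

def remote_lookup(requester, home, addr):
    """403/404: refer to the remote table and mark the requester as a sharer."""
    dest, sharers = home.table[addr]
    sharers.add(requester.id)
    requester.cache[addr] = (dest, home.id)
    return dest

def update_entry(home, nodes, addr, new_dest):
    """405/406: update the entry and invalidate only the marked caches."""
    entry = home.table[addr]
    entry[0] = new_dest
    notified = sorted(entry[1])
    for ident in notified:
        nodes[ident].cache.pop(addr, None)   # Invalidate message
    entry[1].clear()                         # clear the marks
    return notified

def evict(node, nodes, addr):
    """On eviction, use the home ID to clear the mark at the home table."""
    _, home_id = node.cache.pop(addr)
    nodes[home_id].table[addr][1].discard(node.id)

nodes = {i: Node(i) for i in range(3)}
addr = "192.168.34.56"
nodes[1].table[addr] = [4, set()]
remote_lookup(nodes[0], nodes[1], addr)
remote_lookup(nodes[2], nodes[1], addr)
evict(nodes[2], nodes, addr)                  # interface 2 drops its copy first
notified = update_entry(nodes[1], nodes, addr, 6)
```

Because interface 2 cleared its mark when it evicted the entry, the later update sends an Invalidate message to interface 0 only.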
[0044]
Here, the invalidation of the cache by the Invalidate message has been described, but the cache may be updated by the Update message.
[0045]
Embodiment 4
Unlike the third embodiment, this embodiment describes a case in which an identifier (a flag) is added to each entry of each distributed routing table. This method increases the amount of communication at update time but reduces the amount of management information.
[0046]
FIG. 6 is a structural diagram showing an entry structure of a distributed routing table in a case where a shared state is managed by a single bit in a CC-NUMA type routing table management structure according to the present invention.
[0047]
FIG. 6 shows an example of the structure of an entry of the distributed routing table in this case. It has a Shared flag instead of a Shared bitmap.
[0048]
FIG. 7 shows the protocol of this method. On a miss in the routing table cache (701 in FIG. 7), a reference request is issued to the distributed routing table on the same interface (702 in FIG. 7). On a hit, the Shared flag of the entry is marked. On a miss, a reference request is issued to the distributed routing tables on the other interfaces (703 in FIG. 7). If the entry exists in one of those distributed routing tables, the Shared flag of the entry is marked (704 in FIG. 7). Because only a flag is used, it is not possible to know which interfaces are caching the entry. When the entry is updated (705 in FIG. 7), if the Shared flag is marked, an Invalidate message is broadcast to invalidate the cached copies (706 in FIG. 7). An interface that receives the Invalidate message invalidates its cache entry if it has a corresponding entry, and ignores the message otherwise. At the same time, the mark of the Shared flag is cleared. By providing two Shared flags, one for the local interface and one for remote interfaces, broadcasting can be avoided for updates to entries that are used only locally.
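The flag-based update (705 and 706 in FIG. 7) can be sketched as follows. The function name and the dictionary layout are hypothetical, and the list `caches` is assumed to hold the routing table caches of all interfaces that receive the broadcast.

```python
def update_with_flag(home_table, caches, addr, new_dest):
    """Update an entry; broadcast Invalidate only when the Shared flag is set."""
    entry = home_table[addr]
    entry["dest"] = new_dest
    messages = 0
    if entry["shared"]:
        for cache in caches:          # broadcast: every interface gets the message
            messages += 1
            cache.pop(addr, None)     # interfaces without the entry ignore it
        entry["shared"] = False       # clear the mark
    return messages

home_table = {"192.168.34.56": {"dest": 4, "shared": True}}
caches = [{"192.168.34.56": 4}, {}, {}, {}]
sent_first = update_with_flag(home_table, caches, "192.168.34.56", 6)
sent_second = update_with_flag(home_table, caches, "192.168.34.56", 7)
```

Compared with the bitmap of the third embodiment, the single flag costs one message per interface on the first update even though only one cache actually held the entry, which is the trade-off between management information and communication described above.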
[0049]
On the routing table cache side, a home ID indicating the interface whose data is cached is managed, as in the third embodiment. When a cache entry is erased (for example, flushed), the shared bitmap of the distributed routing table of the corresponding interface is changed according to the home ID, so that no useless broadcast of Invalidate messages is generated even if the entry is later updated.
[0050]
Here, the invalidation of the cache by the Invalidate message has been described, but the cache may be updated by the Update message.
[0051]
Embodiment 5
FIG. 8 is a structural diagram showing a CC-NUMA type routing table management structure according to the present invention, particularly showing an entry structure of a routing table cache having no management information.
[0052]
This embodiment describes a variation of the third and fourth embodiments in which no management information is added on the routing table cache side, as shown in FIG. 8.
[0053]
When this method is used together with the third embodiment and a cache entry is deleted (for example, evicted), a request to delete the Shared information in the distributed routing table is broadcast. When it is used together with the fourth embodiment, nothing in particular is done when a cache entry is evicted.
[0054]
In addition, a method in which no management information is added to either the distributed routing tables or the routing table caches is also conceivable. This method requires no management information, but the update must always be sent to the other interfaces whenever an entry is updated.
[0055]
FIG. 9 shows a configuration example of a NUMA type routing table as a CC-NUMA cacheless structure. While the NUMA type routing table memory contributes to the reduction in the size of each memory, it is difficult to reduce the delay because it does not have a cache.
[0056]
Embodiment 6
An example of a COMA type routing table structure according to the present invention will be described.
[0057]
FIG. 10 shows a configuration example of the COMA type routing table. In the COMA type, each interface 101 includes a packet processing unit 102 that performs various processes related to a packet, and a routing table cache 103. Further, a switching fabric 105 for exchanging packets is provided outside each interface. When this switching fabric is shared for exchanging routing information, a separate fabric 106 for exchanging routing table information is not required. If it is not shared, a fabric 106 for exchanging routing table information is provided separately.
[0058]
Each interface has its own routing table cache, so that temporal and position locality can be accommodated. In particular, unlike the CC-NUMA type, the COMA type permits the home of each piece of data (the location where the entry for a given address is stored) to move. Here, the data is an entry, that is, a set of information in a horizontal row shown in FIG. In the CC-NUMA type, an entry in the distributed routing table is set by the routing protocol in each interface, and this entry (the original entry rather than a cached one, that is, the home) is not moved to the distributed routing table in another interface. In the COMA type, a remote routing table cache that frequently refers to the home caches it (this is called a copy) and continues to access that same entry (the copy of the home). If accesses to the original home then cease, the entry that was previously the home is evicted and the copy becomes the home. In this way, each entry comes to be located closer to the interface that refers to it more frequently, so the placement of entries is optimized with use and the access cost falls.
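The movement of a home described above can be sketched as follows. The names are hypothetical, and the policy that decides when the old home is evicted (for example, a lack of recent accesses) is left out; the sketch only shows a copy being promoted once the old home has gone.

```python
class ComaCache:
    def __init__(self, name):
        self.name = name
        self.entries = {}   # addr -> {"dest": ..., "role": "home" or "copy"}

def remote_read(reader, owner, addr):
    """A remote interface reads the home and keeps a copy in its own cache."""
    entry = owner.entries[addr]
    reader.entries[addr] = {"dest": entry["dest"], "role": "copy"}
    return entry["dest"]

def evict_home(owner, holders, addr):
    """Evict the home entry and promote one remaining copy to home, if any."""
    owner.entries.pop(addr)
    for cache in holders:
        entry = cache.entries.get(addr)
        if entry and entry["role"] == "copy":
            entry["role"] = "home"
            return cache.name
    return None

if1, if2 = ComaCache("if1"), ComaCache("if2")
addr = "192.168.34.56"
if1.entries[addr] = {"dest": 4, "role": "home"}
remote_read(if2, if1, addr)
new_home = evict_home(if1, [if2], addr)
```

After these steps the entry lives only at the interface that has been referring to it, which is the placement effect the COMA type aims for.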
[0059]
In order to take advantage of this feature, when the number of interfaces is large in the COMA type, the fabric 106 has a hierarchical structure.
[0060]
FIG. 11 shows an example of a hierarchical fabric with a binary tree structure. Each branch has a directory 1101. The binary tree may instead be an n-ary tree for a larger integer n. A routing table management protocol using COMA is described below. Each directory holds one of the following states for each entry.
[0061]
a) Invalid: There is no data in the corresponding entry. Abbreviated as the I state.
b) Exclusive: No other copy exists. Abbreviated as the E state.
c) Shared: Other copies may exist. Abbreviated as the S state.
d) Reading: A read request has been issued to the upper hierarchy and a reply is awaited. Abbreviated as the R state.
e) Answering: A read or write request has been issued to the lower hierarchy and a reply is awaited. Abbreviated as the A state.
f) Writing: A write request has been issued to the upper hierarchy and a reply is awaited. Abbreviated as the W state.
[0062]
First, the read operation will be described. Entries in the E and S states can be read freely. On a read miss, an entry is added in the R state and the upper directory is queried. If the entry exists at that level, its state is set to A and the lower hierarchy is queried; if not, the state is set to R and the next upper level is queried. By repeating this operation recursively, the request eventually reaches the home or a copy. The information of that entry is then returned through the hierarchy along the R and A entries, each of which is changed to the Shared state on the way, and is delivered to the interface that made the access.
[0063]
Next, the write (update) operation will be described. An entry in the E state can be written freely. In the S state, or on a write miss, the home or a copy must be located and its data updated. In the S state, the entry is updated, its state is changed to W, and the upper directory is queried. On a write miss, the upper directory is queried directly. If the entry exists at that level, its state is set to A and the lower hierarchy is queried; if not, the state is set to W and the next upper level is queried. By repeating this operation recursively, the request reaches the home or a copy. The information of that entry is returned through the hierarchy along the W and A entries, and only the entries in the A state are changed to the Shared state as it is delivered to the interface that made the access. If the request reaches the top level without finding the home or a copy, the information is to be held as Exclusive; the request therefore descends the hierarchy, changing the W states to the S state, and a new entry is added in the E state.
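The per-directory decisions of the read operation can be written as a small transition table. This is only a sketch of the read path under assumptions: the function names and the action labels are hypothetical, the abbreviation I is assumed for the Invalid state, and the write path and the actual message exchange over the tree are omitted.

```python
from enum import Enum

class State(Enum):
    INVALID = "I"
    EXCLUSIVE = "E"
    SHARED = "S"
    READING = "R"
    ANSWERING = "A"
    WRITING = "W"

PRESENT = {State.EXCLUSIVE, State.SHARED}

def on_read_request(state: State, at_requester: bool):
    """Return (next state, action) for a read request arriving at one directory."""
    if state in PRESENT:
        if at_requester:
            return state, "serve"                  # E and S entries are read freely
        return State.ANSWERING, "ask_lower"        # entry exists below this level
    return State.READING, "ask_upper"              # miss: wait for the upper level

def on_read_reply(state: State) -> State:
    """On the way back, R and A entries along the path become Shared."""
    if state in (State.READING, State.ANSWERING):
        return State.SHARED
    return state
```

A request that misses at the requesting interface thus leaves a trail of R entries on the way up and A entries on the way down, and the reply turns that trail into S entries.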
[0064]
Embodiment 7
If the routing table cache of an interface becomes full and no further entries can be registered, the following methods are available to avoid discarding an entry.
1. Find other open entries by broadcast
A broadcast message is used to search for a free entry in the other routing table caches. Each routing table cache that has a free entry reserves it and returns a response message. The interface that has no room for additional registration hands the entry over according to the response message that arrived first. This hand-over message is also broadcast: the routing table cache that receives the entry registers it in the reserved area, and the routing table caches that were not chosen release their reservations.
2. Send transfer request to each interface in order
In the method using a broadcast message, the transfer operation can be performed quickly, but the communication may be congested because the broadcast is used. Therefore, there is a method in which a transfer request is issued to each interface in order and a sequential confirmation is taken. The transfer operation can be time-consuming, but does not require the use of broadcast messages.
3. Prepare a separate memory and save it there.
[0065]
Although the difference in structure from UMA, NUMA, and CC-NUMA is small, this memory is only for compensating for the lack of entries in COMA, and is used as swap space.
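Methods 2 and 3 above can be sketched together as follows. Combining them as a fallback chain is an assumption made for illustration, as are the function name, the fixed per-cache capacity, and the use of Python lists for the caches and the separate memory.

```python
def place_overflow(entry, caches, capacity, swap):
    """Offer an overflowing entry to each interface in order, then to swap memory."""
    for index, cache in enumerate(caches):     # method 2: sequential transfer requests
        if len(cache) < capacity:
            cache.append(entry)
            return ("cache", index)
    swap.append(entry)                         # method 3: separate memory as swap space
    return ("swap", None)

caches = [["a", "b"], ["c"]]
swap = []
first = place_overflow("d", caches, 2, swap)
second = place_overflow("e", caches, 2, swap)
```

The first entry is accepted by the second cache, which still has room; once every cache is full, the next entry is saved in the separate memory instead of being discarded.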
[0066]
[Effect of the Invention]
By using the NUMA, CC-NUMA, and COMA type routing memory management system according to the present invention, the amount of routing table memory can be reduced. Further, a method using a cache such as the CC-NUMA or COMA type enables a high-speed routing table reference corresponding to various localities.
[0067]
The routing memory management method according to the present invention is scalable, and is especially important when constructing a large-scale and high-throughput router.
[0068]
In the advanced information society, a router having a larger capacity is required in terms of both communication speed and routing table capacity. The performance can be improved by alleviating the increase in the routing table reference time, which is one of the factors that limit the effective throughput of the router.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a CC-NUMA type routing table management structure as an example of a management structure of a routing table memory in the present invention.
FIG. 2 is a state transition diagram showing a procedure for performing an address search on a CC-NUMA type routing table management structure according to the present invention.
FIG. 3 is a structural diagram showing an entry structure of a distributed routing table in a case where a sharing status is managed by a bitmap, with respect to a CC-NUMA type routing table management structure according to the present invention.
FIG. 4 is a flowchart showing a processing flow when an update operation is performed on a CC-NUMA type routing table management structure using a bitmap according to the present invention.
FIG. 5 is a structural diagram showing an entry structure of a routing table cache having information indicating a home position in the CC-NUMA type routing table management structure according to the present invention.
FIG. 6 is a structural diagram showing an entry structure of a distributed routing table in a case where a shared state is managed by a single bit in a CC-NUMA type routing table management structure in the present invention.
FIG. 7 is a flowchart showing a processing flow when an update operation is performed on a CC-NUMA type routing table management structure using a single bit in the present invention.
FIG. 8 is a structural diagram showing an entry structure of a routing table cache having no management information, particularly regarding a CC-NUMA type routing table management structure according to the present invention.
FIG. 9 is a block diagram showing a NUMA type routing table management structure as an example of a management structure of a routing table memory in the present invention.
FIG. 10 is a block diagram showing a COMA type routing table management structure as an example of a management structure of a routing table memory in the present invention.
FIG. 11 is a structural diagram showing an example of a binary tree structure as an example of a fabric having a hierarchical structure in the COMA type routing table management mechanism of the present invention.
FIG. 12 is a block diagram showing a conventional UMA type routing table management mechanism.
[Explanation of symbols]
101 Interface
102 Packet processing unit
103 Routing table cache
104 Distributed routing table
105 Switching fabric for packet switching
106 Switching fabric for management information exchange
S201 Hash operation state
S202 Reference state of routing table cache
S203 Reference state of local distributed routing table
S204 Reference state of remote distributed routing table
S205 Processing state when destination is unknown
S206 Packet forwarding state
S207 Packet forwarding and cache registration state
401 Action on a miss in the routing table cache
402 Action to issue reference request to local distributed routing table
403 Action to issue reference request to remote distributed routing table
404 Action to mark the Shared bitmap
405 Action to update entry
406 Action to issue invalidate message
701 Action on a miss in the routing table cache
702 Action to issue reference request to local distributed routing table
703 Action to issue reference request to remote distributed routing table
704 Action to mark the Shared flag
705 Action to update entry
706 Action to issue invalidate message
1101 Directory
1201 Routing table
1202 Shared routing table

Claims

A router having a plurality of interfaces,
wherein each of the plurality of interfaces comprises:
a packet processing unit that processes an input packet; and
a routing table that stores, as entries, pairs of information dependent on the packet and a destination of the packet, and that is referenced using the information dependent on the packet as a key to search for the destination of the packet,
and wherein, when a search misses in the routing table of a first interface, the router searches for the destination by referring to the routing table of a second interface.

The router according to claim 1, wherein each of the routing tables has a routing table cache that stores a copy of part of the information held by that routing table,
and wherein, when a search misses in a routing table cache, the destination is searched for by referring to the routing table corresponding to that routing table cache.

The router according to claim 1, wherein each of the routing tables is configured as a routing table cache,
and wherein, when a search misses in the routing table cache of the interface at which the packet arrived, the destination is searched for by referring to the routing table caches of the other interfaces, and the data is copied to a routing table cache of the router.

The router according to claim 1, wherein, when the second routing table returns a reference result, the reference result is stored in the first routing table.

The router according to claim 1, further comprising a first path for transferring input packets between the interfaces and a second path for transferring information stored in the routing tables between the interfaces.

The router according to claim 2, wherein a mixing (scrambling) operation is used in the hash function of the routing table cache.

The router according to claim 1, wherein, when the content of a routing table is changed, the corresponding entry of any other routing table holding the same entry is invalidated, or the change is notified to the routing tables of the other interfaces by an update message.

The router according to claim 1, wherein an entry in the routing table includes a bitmap or a flag indicating that another routing table holds the same content as that entry.
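
The invalidation and shared-bitmap behaviour recited in the two preceding claims (and the symbols 404 to 406 above) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `Entry`, `Node`, `remote_reference` and `update_entry`, the use of one bit per interface index, and the string-keyed dictionaries are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    dest: str
    # Assumed encoding: bit i set means interface i holds a copy of this entry.
    shared: int = 0


@dataclass
class Node:
    """Hypothetical model of one interface: home entries plus cached copies."""
    index: int
    table: Dict[str, Entry] = field(default_factory=dict)
    cache: Dict[str, str] = field(default_factory=dict)


def remote_reference(home: Node, requester: Node, key: str) -> str:
    """A remote interface reads a home entry; the home marks the Shared bitmap."""
    entry = home.table[key]
    entry.shared |= 1 << requester.index
    requester.cache[key] = entry.dest
    return entry.dest


def update_entry(home: Node, nodes: List[Node], key: str, new_dest: str) -> List[int]:
    """Update a home entry and invalidate every copy recorded in the bitmap.

    Returns the indices of the interfaces that were sent an invalidation.
    """
    entry = home.table[key]
    invalidated = []
    for node in nodes:
        if entry.shared & (1 << node.index):
            node.cache.pop(key, None)  # stands in for an invalidate message
            invalidated.append(node.index)
    entry.dest = new_dest
    entry.shared = 0  # no interface holds a copy any more
    return invalidated


nodes = [Node(0), Node(1), Node(2)]
nodes[0].table["10.0.0.1"] = Entry("if0")
remote_reference(nodes[0], nodes[2], "10.0.0.1")
invalidated = update_entry(nodes[0], nodes, "10.0.0.1", "if1")
```

With a bitmap, only the interfaces whose bits are set receive an invalidation. A single Shared flag, as in the symbols 704 to 706, would need less storage per entry but would require notifying every other interface.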

The router according to claim 1, wherein the information dependent on the packet is an IP address or a MAC address.

A router comprising a plurality of packet processing units and a plurality of routing tables, each routing table corresponding to one of the packet processing units,
wherein a first routing table is searched first for the destination of a packet processed by the corresponding first packet processing unit, and, when that search misses in the first routing table, the search is performed in a second routing table,
and wherein the second routing table is searched first for the destination of a packet processed by the corresponding second packet processing unit, and, when that search misses in the second routing table, the search is performed in the first routing table.

A routing method for a router having a plurality of interfaces that each receive packets,
each interface having a packet processing unit that processes an input packet, a routing table for searching for a destination based on an address of the input packet, and a routing table cache that stores at least part of the table stored in the routing table,
the routing method comprising:
a first step of searching for a destination in the routing table cache of an interface based on an address of a packet input to that interface;
a second step of searching for the destination in the routing table of that interface when the destination is not found in the first step; and
a third step of searching for the destination in a routing table of another interface when the destination is not found in the second step.

The routing method according to claim 11, wherein, when a destination is found in the second or third step, data of the destination is registered in the routing table cache searched in the first step.
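
The three-step search of the routing method, together with the cache registration of the final claim, can be sketched as follows. This is a minimal sketch under assumptions, not the claimed implementation: the `Interface` class and the `lookup` function are hypothetical names, and exact-match dictionary lookups on address strings stand in for whatever matching a real router would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Interface:
    """Hypothetical model of one interface with its own table and cache."""
    name: str
    table: Dict[str, str] = field(default_factory=dict)  # address -> destination
    cache: Dict[str, str] = field(default_factory=dict)  # partial copy of table data


def lookup(arrival: Interface, others: List[Interface], address: str) -> Optional[str]:
    # First step: search the routing table cache of the arrival interface.
    dest = arrival.cache.get(address)
    if dest is not None:
        return dest

    # Second step: search the routing table of the arrival interface.
    dest = arrival.table.get(address)

    # Third step: search the routing tables of the other interfaces.
    if dest is None:
        for other in others:
            dest = other.table.get(address)
            if dest is not None:
                break

    # Register a result found in the second or third step in the cache
    # that was searched in the first step.
    if dest is not None:
        arrival.cache[address] = dest
    return dest  # None means the destination is unknown


if0 = Interface("if0", table={"10.0.0.1": "if0"})
if1 = Interface("if1", table={"10.0.1.1": "if1"})
first = lookup(if0, [if1], "10.0.1.1")   # resolved by the remote table
second = lookup(if0, [if1], "10.0.1.1")  # resolved by the local cache
```

After the first call, the entry is held in the cache of `if0`, so the second call returns at the first step without consulting either routing table.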