JP2024518019A

JP2024518019A - Method and system for predictive analytics based buffer management

Info

Publication number: JP2024518019A
Application number: JP2023558338A
Authority: JP
Inventors: ベン－エズラ，ヨセフ; ベン－ハイム，ヤニヴ
Original assignee: ニューフォトニクスリミテッド
Priority date: 2021-05-12
Filing date: 2022-05-10
Publication date: 2024-04-24
Also published as: EP4282144A1; WO2022238998A1; CN117157958A

Abstract

A method and computer program product for managing traffic in a communication network. The method includes: receiving a plurality of traffic units to be transmitted by a switch through a port, the port having an associated queue; extracting features from the plurality of traffic units; providing the features to a first engine to obtain a class for the plurality of traffic units; using a second engine associated with a traffic model for the class, obtaining an indication of a predicted traffic volume for the class, for a future time and for a physical location of the switch transmitting the plurality of traffic units; allocating a queue of a size corresponding to the indication of the predicted traffic volume; and assigning at least one traffic unit to a buffer. [Selected drawing] FIG. 5

Description

Cross-reference to related applications
This application is a continuation of, and claims the benefit of, U.S. Provisional Patent Application No. 63/187,916, entitled "Finding Segments of Relevant Objects Based on Free-Form Text Description," filed May 12, 2021, which is hereby incorporated by reference in its entirety without giving rise to disavowment.

The present disclosure relates generally to buffer management, and more particularly to methods and systems for buffer management within and between data centers.

Data centers are used to handle the workload generated by the ever-growing number of available applications (and by the growing number of end users), as well as the overall migration of data to the cloud. The challenges of designing data center networking play a major role in the performance of various cloud applications. Data center operators face extreme difficulty in sharing the available bandwidth among multiple applications of various types. Each application has its own requirements, such as throughput, quality of service (QoS), and acceptable latency, which may vary over time.

Generally, if a communication channel or a transmitting or receiving end is loaded to its maximum capacity, traffic units such as packets may be stored in buffers, either before being transmitted or after being received, until the channel or destination can accommodate and handle them. However, when the buffer is full, packets may be lost, which can result in serious problems.

One exemplary embodiment of the disclosed subject matter is a method for managing traffic in a communication network, the method including: receiving a plurality of traffic units to be transmitted by a switch through a port, the port having an associated queue; extracting features from the plurality of traffic units; providing the features to a first engine to obtain a class for the plurality of traffic units; using a second engine associated with a traffic model for the class, obtaining an indication of a predicted traffic volume for the class, for a future time and for a physical location of the switch transmitting the plurality of traffic units; allocating a queue of a size corresponding to the indication of the predicted traffic volume; and assigning at least one traffic unit to a buffer. Within the method, the traffic unit is a packet. The method may further include: receiving a preliminary plurality of traffic units to be transmitted; extracting features from each of the preliminary plurality of traffic units to obtain a plurality of feature vectors; clustering the plurality of feature vectors into a plurality of classes; and training the first engine to receive the plurality of traffic units and output a class from the plurality of classes. The method may further include training the second engine on a subset of the plurality of feature vectors assigned to a particular class, such that the second engine is adapted to provide, in accordance with the traffic model, an indication of a predicted traffic volume for the class.

Within the method, the predicted traffic volume is optionally predicted for a future time t+τ, where t is the current time and τ is a time interval. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on the available buffer size at the current time and at the future time. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on the number of congested queues of the priority of the class at the future time. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on a normalized dequeue rate of the queue at the future time. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on a priority of an application or site associated with the plurality of traffic units. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on a coefficient associated with the class. Within the method, the predicted traffic volume is optionally predicted for the future time t+τ based on the physical location of the switch. Within the method, the predicted traffic volume is optionally predicted in accordance with the following formula:

$$T^{i}_{c}(t, t+\tau, \mathrm{location}) = \alpha_c \cdot \frac{1}{N_p'(t, t+\tau, \mathrm{location})} \cdot {\gamma^{i}_{c}}'(t, t+\tau, \mathrm{location}) \cdot \bigl(B - B_{oc}'(t, t+\tau, \mathrm{location})\bigr)$$

where $i$ is the index of the port; $c$ is the class of the plurality of traffic units; $t$ is the current time; $\tau$ is the time difference to the future time; $\alpha_c$ is a coefficient assigned to class $c$; location is the physical location of the switch within the data center; $N_p'(t, t+\tau, \mathrm{location})$ is a variation or combination of $N_p(t, \mathrm{location})$, the number of congested queues of the priority $p$ of the class at time $t$ for the switch, and $N_p(t+\tau, \mathrm{location})$, the number of congested queues of the priority $p$ of the class at time $t+\tau$ for the switch; $B - B_{oc}'(t, t+\tau, \mathrm{location})$ is a variation or combination of $B - B_{oc}(t, \mathrm{location})$, the remaining buffer at time $t$ for the switch, and $B - B_{oc}(t+\tau, \mathrm{location})$, the remaining buffer at time $t+\tau$ for the switch; and ${\gamma^{i}_{c}}'(t, t+\tau, \mathrm{location})$ is a variation or combination of $\gamma^{i}_{c}(t, \mathrm{location})$, the per-port normalized dequeue rate of the i-th queue of class $c$ at time $t$ for the switch, and $\gamma^{i}_{c}(t+\tau, \mathrm{location})$, the per-port normalized dequeue rate of the i-th queue of class $c$ at time $t+\tau$ for the switch.

Within the method, the queue is optionally not emptied in a first-in-first-out manner. Within the method, multiple traffic units are optionally dequeued from the queue simultaneously.
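As an illustration only, the formula above can be evaluated as in the following minimal Python sketch. The disclosure does not specify how the values at t and t+τ are combined into the primed quantities, so the weighted average in `combine`, the `SwitchState` container, and the guard against a zero queue count are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class SwitchState:
    """Combined (primed) quantities for one switch location over [t, t + tau]."""
    congested_queues: float  # N_p': congested queues of the class's priority
    dequeue_rate: float      # gamma': normalized per-port dequeue rate, in [0, 1]
    buffer_total: float      # B: total shared buffer (e.g. in bytes)
    buffer_occupied: float   # B_oc': occupied buffer, same unit as B


def combine(now: float, future: float, weight: float = 0.5) -> float:
    """Hypothetical 'combination' of a value at t and its prediction at t + tau."""
    return (1.0 - weight) * now + weight * future


def predicted_threshold(alpha_c: float, state: SwitchState) -> float:
    """Evaluate T = alpha_c * (1 / N_p') * gamma' * (B - B_oc')."""
    n_p = max(state.congested_queues, 1.0)  # assumption: avoid division by zero
    free = max(state.buffer_total - state.buffer_occupied, 0.0)
    return alpha_c * (1.0 / n_p) * state.dequeue_rate * free


# Example: 2 congested queues now, 4 predicted; 300 units occupied now, 500 predicted.
state = SwitchState(
    congested_queues=combine(2, 4),
    dequeue_rate=0.5,
    buffer_total=1000.0,
    buffer_occupied=combine(300, 500),
)
threshold = predicted_threshold(alpha_c=2.0, state=state)
```

In this sketch a larger class coefficient or a faster-draining queue raises the threshold, while more congested queues of the same priority or a fuller shared buffer lowers it, which matches the direction of each term in the formula.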

Another exemplary embodiment of the disclosed subject matter is a non-transitory computer-readable medium retaining program instructions which, when read by a processor, cause the processor to perform: receiving a plurality of traffic units to be transmitted by a switch through a port, the port having an associated queue; extracting features from the plurality of traffic units; providing the features to a first engine to obtain a class for the plurality of traffic units; using a second engine associated with a traffic model for the class, obtaining an indication of a predicted traffic volume for the class, for a future time and for a physical location of the switch transmitting the plurality of traffic units; allocating a queue of a size corresponding to the indication of the predicted traffic volume; and assigning at least one traffic unit to a buffer. Within the computer program product, the traffic unit is optionally a packet. Within the computer program product, the program instructions optionally further cause the processor to perform: receiving a preliminary plurality of traffic units to be transmitted; extracting features from each of the preliminary plurality of traffic units to obtain a plurality of feature vectors; clustering the plurality of feature vectors into a plurality of classes; and training the first engine to receive the plurality of traffic units and output a class from the plurality of classes. Within the computer program product, the program instructions optionally further cause the processor to train the second engine on a subset of the plurality of feature vectors assigned to a particular class, such that the second engine is adapted to provide, in accordance with the traffic model, an indication of a predicted traffic volume for the class. Within the computer program product, the predicted traffic volume is optionally predicted for a future time t+τ, where t is the current time and τ is a time interval. Within the computer program product, the predicted traffic volume is optionally predicted based on one or more items selected from the list consisting of: the available buffer size at the current time and at the future time; the number of congested queues of the priority of the class at the future time; a normalized dequeue rate of the queue at the future time; a priority of an application or site associated with the plurality of traffic units; a coefficient associated with the class; and the physical location of the switch.

The disclosed subject matter will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings, in which corresponding or like numerals or characters indicate corresponding or like components. Unless indicated otherwise, the drawings provide exemplary embodiments or aspects of the disclosure and do not limit the scope of the disclosure. In the drawings:

FIG. 1 is a generalized diagram of a spine-leaf type data center.
FIG. 2 is a schematic diagram of an Approximate Fair Dropping (AFD) scheme.
FIG. 3 is a schematic diagram of a conventional static switch architecture deployed in a data center.
FIG. 4 is a performance graph of a dynamic thresholding scheme.
FIG. 5 is a flowchart of steps in a method for training engines for classifying transmissions and determining buffer sizes, according to some exemplary embodiments of the present disclosure.
FIG. 6 is a flowchart of a method for determining a queue threshold, according to some embodiments of the present disclosure.
FIG. 7 is a block diagram of a system for determining queue thresholds, according to some embodiments of the present disclosure.

In any computerized network, and particularly in data centers, which may be heavily loaded by a large number of applications and users, packets may be lost when all buffers are full. Packet loss can cause significant problems. The severity of the problem may depend on several factors, including the type of sending or receiving application, the type of data being sent, and the service level agreement (SLA) of the application or service.

The discussion below refers to traffic units and packets interchangeably, but it will be appreciated that it applies equally to other traffic units.

Thus, one technical problem addressed by the present disclosure relates to the need to manage port queues and buffers in a data center so as to optimize transmission performance, such that the risk of serious damage caused by the loss of critical transmissions is reduced.

Some known techniques for avoiding packet loss allocate multiple buffers from the available buffer space. In some embodiments, one or more buffers may be allocated for large transmissions, such as transmissions containing more than a predetermined number of packets, also known as elephant flows, and one or more buffers may be allocated for smaller transmissions, referred to as mice flows.

Further known techniques allocate buffers statically from an available pool and may therefore fail to respond effectively and efficiently to transmission bursts. Other techniques use dynamic allocation, according to the load and requirements at the time of transmission. However, this still does not provide an adequate solution, because conditions may change drastically during the transmission, such that performance deteriorates severely.

One technical solution of the present disclosure relates to setting queue thresholds, i.e., the size of each queue, and thereby dynamically allocating, from the available pool, the buffers that store the queues. However, the queue thresholds are determined and the buffer allocation is performed not only according to the current situation and load, since these may change due to, for example, the current transmission and other transmissions that take place, or at least start, during that transmission. Rather, the queue thresholds are set and the buffers are allocated from the available pool according to the predicted load expected at a later time, during the period in which the transmission is expected to take place. Buffer allocation may further take into account the class or type of the packets to be transmitted, which may relate to the sending or receiving application, the expected volume of the transmission, the service level agreement (SLA), the physical location of the transmitting switch, or other factors.

Another technical solution of the present disclosure includes training a first engine, such as a classifier, to classify the plurality of packets in a transmission and obtain their class, such that the corresponding queue threshold and allocated buffer size are in accordance with this class. For example, a transmission comprising packets associated with an application classified into a class that transmits a large amount of data may be assigned a larger buffer than a transmission associated with an application classified into a class that transmits a small amount of data, and vice versa.
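The clustering and classification steps could be sketched as follows. This is a minimal pure-Python illustration and not the disclosed implementation: the two features (mean packet size and packets per flow), the use of k-means, and the nearest-centroid classifier standing in for the first engine are all assumptions chosen for brevity.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest(centroids: List[Vector], x: Vector) -> int:
    """Index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(centroids[k], x))


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[List[float]]:
    """Plain k-means, seeded with the first k points so the result is deterministic."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Hypothetical feature vectors: (mean packet size in bytes, packets per flow).
training = [(80, 3), (1400, 900), (90, 5), (1450, 1200), (70, 2), (1500, 1000)]
centroids = kmeans(training, k=2)


def classify(features: Vector) -> int:
    """Stand-in for the 'first engine': map a feature vector to a class index."""
    return nearest(centroids, features)
```

In practice the features would likely need scaling before distances are compared, and any supervised classifier trained on the cluster labels could replace the nearest-centroid rule.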

Yet another technical solution of the present disclosure includes training, for each such class, a second engine, also referred to as a traffic model, which is adapted to calculate, for the particular class, an appropriate queue threshold from the available pool at the future time at which the transmission is expected to take place.
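One way to picture a per-class second engine is a small forecasting model trained only on samples of that class. The sketch below is an assumption-laden toy: it fits a least-squares line to hypothetical (time, volume) samples and extrapolates to t+τ. The disclosure does not state which model type is used, and the switch location, which the disclosed engine also takes into account, is omitted here.

```python
from typing import Dict, List, Tuple


class TrafficModel:
    """Toy per-class model: least-squares line through (time, volume) samples."""

    def __init__(self) -> None:
        self.slope = 0.0
        self.intercept = 0.0

    def fit(self, samples: List[Tuple[float, float]]) -> None:
        n = len(samples)
        mean_t = sum(t for t, _ in samples) / n
        mean_v = sum(v for _, v in samples) / n
        var_t = sum((t - mean_t) ** 2 for t, _ in samples)
        cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
        self.slope = cov / var_t if var_t else 0.0
        self.intercept = mean_v - self.slope * mean_t

    def predict(self, t: float, tau: float) -> float:
        """Predicted traffic volume at the future time t + tau (never negative)."""
        return max(self.slope * (t + tau) + self.intercept, 0.0)


# One model per class, each trained only on that class's (hypothetical) history.
history: Dict[int, List[Tuple[float, float]]] = {
    0: [(0, 10), (1, 12), (2, 14)],     # class 0: slowly growing load
    1: [(0, 500), (1, 400), (2, 300)],  # class 1: bulk transfer that is draining
}
models: Dict[int, TrafficModel] = {}
for cls, samples in history.items():
    models[cls] = TrafficModel()
    models[cls].fit(samples)
```

The predicted volume for a class would then be used to size the queue for the period in which the transmission is expected to occur, rather than for the moment the decision is made.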

One technical effect of the present disclosure is the optimization of queue thresholds and the increased effectiveness of buffer allocation, because these are adjusted to the requirements at the time the transmission takes place, rather than to the requirements at some earlier time. By setting queue thresholds and allocating a proportion of the buffers that is appropriate for the type of application and the requirements in force, fewer packets are dropped and the service level may increase.

Referring now to FIG. 1, a partial generalized diagram of a spine-leaf type data center is shown, consisting of multiple points of delivery (PODs) 100, each of which includes a top-of-rack switch 108 and one or more aggregate switches 116. The data center may further include multiple core switches 120. Each POD 100 may include multiple racks, such as rack 1 (104), rack 2 (104'), etc.

Rack 1 (104) may include multiple servers, such as server 112, server 113, etc. Rack 1 (104) may further include a top-of-rack (ToR) switch 108. The ToR switch 108 is responsible for providing data to any of the servers 112, 113, etc., and for receiving from any of the servers 112, 113, etc. data to be provided to a destination.

The POD 100 may include one or more aggregate switches 116, each responsible for providing data to any of the servers 112, 113 in two or more racks, such as rack 1 (104) and rack 2 (104'), and for receiving from any of the servers 112, 113 in the two or more racks data to be provided to a destination.

The data center may include one or more core switches 120 that enable communication and data transfer between two or more aggregate switches 116 of one or more PODs 100, and thus between multiple racks and multiple servers in the data center.

The data center may include data center core switches (not shown) for enabling communication and data transfer between two or more PODs 100, and data center edge switches (not shown) for enabling communication between the data center and other data centers or servers anywhere in cyberspace.

It will be understood that the disclosed structure is merely exemplary, and that any other structure may be used that connects various servers and enables them to send and receive data among themselves, or between any of them and another source or destination computing platform in cyberspace.

It will be appreciated that the higher a switch is in the switch hierarchy, the more types of data it may be required to provide, to or from more types of applications, at more widely distributed times, and so on. For example, core switch 120 may serve a greater variety of transmissions than ToR switch 108.

Since each application and transmission situation is different, their needs differ as well. Some applications need to send small amounts of data but must do so with the lowest possible latency, while others, such as backup transmissions, need to send large amounts but can tolerate longer latencies. The criticality of each transmitted packet may also vary. For example, the loss of a few packets in a music distribution application may be less severe, and more tolerable, than in a banking application.

The technical problem disclosed above may apply to any of the switches shown or discussed in connection with FIG. 1, such as the ToR switch 108, the aggregate switch 116, the core switch 120, the data center core switch, and the data center edge switch. When data needs to be transmitted and the channel is busy, the data to be transmitted may be buffered; the same applies to data to be received when the destination is busy.

Thus, one existing technique relates to dividing the space available for buffers into two: a first buffer for handling small transmissions, referred to as a mice flow queue, which may be used to store the first packet of each transmission, and a second buffer, referred to as an elephant flow queue, for handling large transmissions, specifically all packets associated with a transmission except for the first packet assigned to the mice flow queue. It will be understood that the terms buffer and queue are used interchangeably.

A known mechanism for handling the dropping problem is Approximate Fair Dropping (AFD) with Dynamic Packet Prioritization (DPP). AFD focuses on preserving buffer space for absorbing mice flows, particularly microbursts, which are aggregated mice flows, by limiting the buffer usage of aggressive elephant flows. The scheme may further enforce fairness of bandwidth allocation among multiple elephant flows, as detailed below. DPP provides the ability to separate mice flows and elephant flows into two different queues, such that buffer space can be allocated to them independently and different queue scheduling can be applied to them.

A characteristic of the AFD algorithm is the fair allocation of bandwidth among multiple elephant flows, based on their data rates. This characteristic has two main elements: data rate measurement and fair rate calculation.

Data rate measurement relates to measuring the arrival rate of each elephant flow on the ingress (i.e., incoming) port and passing it to the buffer management mechanism on the egress (i.e., outgoing) port.

Fair rate calculation relates to dynamically computing the per-flow fair rate for an egress queue, using a feedback mechanism based on the occupancy of the egress port queue.
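The text does not give the feedback law used for the fair rate, so the following Python sketch is only one plausible shape for it: a proportional controller that lowers the fair rate when the egress queue is above a target occupancy and raises it when the queue is below. The function name, the `gain` and `min_rate` parameters, and the target occupancy are all hypothetical.

```python
def update_fair_rate(
    fair_rate: float,
    queue_len: float,
    target_len: float,
    gain: float = 0.1,
    min_rate: float = 1.0,
) -> float:
    """One feedback step for the per-flow fair rate of an egress queue.

    A queue above its target occupancy shrinks the fair rate; a queue
    below the target lets it grow. The rate never falls below min_rate.
    """
    error = (target_len - queue_len) / target_len  # positive when under target
    return max(fair_rate * (1.0 + gain * error), min_rate)


# Example: a queue holding 150 packets against a target of 100 lowers the rate.
new_rate = update_fair_rate(fair_rate=100.0, queue_len=150, target_len=100)
```

Called periodically, an update of this kind converges toward the rate at which the queue occupancy stays near its target.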

When a packet of an elephant flow enters the egress queue, the AFD algorithm compares the measured arrival rate of the flow with the computed per-flow fair share rate.

If the arrival rate is lower than the fair rate, the packet is queued and eventually transmitted on the egress link.

If the arrival rate exceeds the fair rate, packets are randomly dropped from that flow, in proportion to the amount by which its arrival rate exceeds the fair rate. The drop probability is thus computed from the fair rate and the measured flow rate. The more a flow exceeds the fair rate, the higher its drop probability, and therefore all elephant flows achieve the fair rate.
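The comparison and random drop described above can be sketched in Python as follows. The exact drop law is not stated in the text; the expression 1 - fair_rate / arrival_rate used here is an assumption, chosen because the expected admitted rate of a flow then equals the fair rate. The function names and the injectable random source are illustrative.

```python
import random
from typing import Callable


def drop_probability(arrival_rate: float, fair_rate: float) -> float:
    """Probability of dropping a packet of a flow, given its measured arrival rate."""
    if arrival_rate <= 0 or arrival_rate <= fair_rate:
        return 0.0  # flows at or below the fair rate are never dropped
    return 1.0 - fair_rate / arrival_rate


def admit(
    arrival_rate: float,
    fair_rate: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Return True if the packet is queued, False if it is dropped."""
    return rng() >= drop_probability(arrival_rate, fair_rate)
```

With this choice, a flow arriving at twice the fair rate loses about half of its packets, so its admitted rate is about the fair rate, while a flow below the fair rate is untouched.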

AFD is a flow-aware early-discard mechanism that signals network congestion by dropping packets, thereby engaging the TCP congestion mechanisms on the application hosts. It is an improvement over earlier methods such as weighted random early discard, also known as weighted random early detection (WRED). WRED applies weighted random early discard to class-based queues but has no flow awareness within a class. As a result, all packets, including those of mice flows, which are sensitive to packet loss, are subject to the same drop probability, and packets from mice flows are therefore as likely to be dropped as packets from elephant flows. Elephant flows can use drops as a congestion signal to back off, but drops can have a detrimental effect on mice flows. In addition, a uniform drop probability may cause elephant flows with higher rates (due to shorter round-trip times) to obtain more bandwidth.

Therefore, the egress bandwidth may not be divided evenly among multiple elephant flows traversing the same congested link. As a result, the flow completion time of mice flows deteriorates, and elephant flows do not have fair access to link bandwidth and buffer resources.

しかしながら、ＡＦＤは、廃棄判断をなす前に、フローサイズおよびデータ到着レートを考慮に入れる。廃棄アルゴリズムは、マイスフローを保護するように、および、帯域幅コンテンション中の複数のエレファントフロー間の公平性をもたらすように設計される。 However, AFD takes into account the flow size and data arrival rate before making the discard decision. The discard algorithm is designed to protect mice flows and to provide fairness among multiple elephant flows during bandwidth contention.

Referring now to FIG. 2, a schematic of the AFD scheme is shown, with a pair of queues: a mice flow queue 204 and a regular (elephant flow) queue 208. In the example of FIG. 2, the mice flow queue may be limited to streams of at most N packets (N=5). A short stream of packets, such as stream 212 containing four packets, may be stored exclusively in the mice flow queue 204 and sent onward when possible. The same is true for stream 216, which is in the process of being stored in the mice flow queue 204. Stream 220, however, is longer. Thus, its first N packets are assigned to the mice flow queue 204, while the remaining packets, starting with packet N+1 (224), are assigned to the regular queue 208. Both queues output the packets stored in them through egress port 228.
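By way of a non-limiting illustration, the two-queue assignment of FIG. 2 may be sketched as follows. The class name, the per-flow counter, and the use of a flow identifier string are assumptions made for illustration only.

```python
from collections import defaultdict, deque

N = 5  # per-stream packet limit of the mice flow queue (value from the FIG. 2 example)


class TwoQueueAssigner:
    """Routes the first N packets of each stream to the mice flow queue, the rest to the regular queue."""

    def __init__(self, limit: int = N) -> None:
        self.limit = limit
        self.seen = defaultdict(int)   # packets observed so far, per flow id
        self.mice_queue = deque()
        self.regular_queue = deque()

    def enqueue(self, flow_id: str, packet: object) -> str:
        """Store the packet and return the name of the queue it was assigned to."""
        self.seen[flow_id] += 1
        if self.seen[flow_id] <= self.limit:
            self.mice_queue.append((flow_id, packet))
            return "mice"
        self.regular_queue.append((flow_id, packet))
        return "regular"
```

In this sketch a four-packet stream never leaves the mice flow queue, whereas the sixth and later packets of a longer stream are placed in the regular queue.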

Thus, traditional network management merely allows deployment of a predefined set of buffer management policies whose parameters can be adapted to specific network conditions. Incorporating a new management policy requires complex control and data plane code changes, and sometimes a redesign of the implementing hardware.

しかしながら、ソフトウェア定義ネットワーキングにおける現在の開発は、これらの課題をほとんど無視し、パケット分類器の柔軟かつ効率的な表現に専念しており、バッファ管理の態様には十分に対応していない。 However, current developments in software-defined networking have largely ignored these challenges, concentrating on flexible and efficient representation of packet classifiers, and have not adequately addressed aspects of buffer management.

従前から、キューは、先入れ先出し（ＦＩＦＯ）処理順序を実施し、知られているように、単一キュー（ＳＱ）アーキテクチャ、重み付けスループットの目的、および、ＦＩＦＯ処理についての決定論的な最適アルゴリズムは存しない。 Traditionally, queues implement a first-in, first-out (FIFO) processing order, and as is known, there are no deterministic optimal algorithms for single-queue (SQ) architectures, weighted throughput objectives, and FIFO processing.

次に図３を参照して、中央パケット処理および分類エンジンを有する、データセンタにおいて配備される従前のスタティックスイッチアーキテクチャの概略図を示す。 Referring now to FIG. 3, a schematic diagram of a conventional static switch architecture deployed in a data center having a central packet processing and classification engine is shown.

アーキテクチャは、入来するパケットストリームを分類する集中型分類エンジン３０４を用いる。パケットがスイッチにおいて受信されるとき、このエンジンは、デスティネーションおよびソースアドレスを調査し、それらのアドレスを、ネットワークセグメントおよびアドレスのテーブルと比較し、パケットについてのクラスを決定する。 The architecture uses a centralized classification engine 304 that classifies the incoming packet stream. When a packet is received at the switch, the engine looks up the destination and source addresses, compares them to a table of network segments and addresses, and determines a class for the packet.

According to the determined class, the packet is forwarded to one of the queuing engines, such as queuing engine 1 (308), queuing engine 2 (312), or queuing engine N (316), where N is the number of classes. In addition, the centralized classification engine 304 prevents bad packets from propagating by not forwarding them.

Each queuing engine places the relevant packets in its static buffer, such as static buffer 1 (320) or static buffer 2 (324), and specifically in the relevant queue according to the port associated with each packet. For example, static buffer 1 (320) has a first queue 328 associated with Q0 of port 1, an N-th queue 332 associated with QN of port 1, and so on.

よって、静的バッファは、各キューについて一定のバッファサイズによって区分される。パケットがスイッチ内で処理される際、それらのパケットは、バッファ内に保管される。 So, static buffers are partitioned with a fixed buffer size for each queue. As packets are processed within the switch, they are stored in the buffer.

この配置構成において、動的バッファが、別個の仮想バッファプールへと分けられ、各仮想バッファは、各ポートに割り当てられる。各仮想バッファにおいて、パケットは、論理ＦＩＦＯキューへと組織化される。 In this arrangement, the dynamic buffers are split into separate virtual buffer pools, with each virtual buffer assigned to each port. In each virtual buffer, packets are organized into logical FIFO queues.

If the destination segment is congested, the switch stores the packet and waits for bandwidth to become available on the congested segment. With a static buffer, once the buffer is full, additional incoming packets are discarded. Therefore, to support any application on a computer network, it is important to reduce the packet loss ratio. To achieve this goal, the buffer size may be increased, and core networks may have very large static buffers, but this may add considerable cost, operational complexity, less deterministic and less predictable application performance, and longer queuing delays to the system. Thus, this arrangement also yields unsatisfactory results.

それゆえに、先進的な動的バッファ管理スキームは、（１）低いキューイング遅延、（２）オーバーフローおよびアンダーフローを防止するためのキュー長さの制御、ならびに、（３）より低いパケット損失比率をサポートすべきである。 Therefore, an advanced dynamic buffer management scheme should support (1) low queuing delay, (2) queue length control to prevent overflows and underflows, and (3) lower packet loss ratios.

そのようなスキームにおいて、動的しきい値（ＤＴ）と呼称される各キューのサイズは、バッファ内の残りの空間に比例することがある、キューに適用可能なしきい値によって決定されることを必要とする。スキームは、平均キュー長さ、ならびに、キュー長さの最小および最大しきい値の値などのパラメータを使用することがある。 In such a scheme, the size of each queue, called the dynamic threshold (DT), needs to be determined by a threshold applicable to the queue, which may be proportional to the remaining space in the buffer. The scheme may use parameters such as the average queue length, and minimum and maximum threshold values for the queue length.

輻輳レベルが低いとき、しきい値の値は、輻輳制御の活動化を遅延させるために自動的に増大されることがあり、輻輳レベルが高いとき、しきい値の値は、より早期に輻輳制御を起動させるために自動的に減少されることがある。 When congestion levels are low, the threshold value may be automatically increased to delay activation of congestion control, and when congestion levels are high, the threshold value may be automatically decreased to trigger congestion control earlier.

平均キュー長さが、最小しきい値未満であるとき、パケットのどれも廃棄されず、キュー長さが、最小しきい値と最大しきい値との間であるとき、パケットは、線形に上昇する確率において廃棄され得るものであり、キュー長さが、最大しきい値を上回るとき、すべてのパケットが廃棄される。よって、そのようなスキームは、キューに満杯にさせないことにより、輻輳を回避し得る。 When the average queue length is below a minimum threshold, none of the packets are dropped, when the queue length is between the minimum and maximum thresholds, packets may be dropped with a linearly increasing probability, and when the queue length is above the maximum threshold, all packets are dropped. Thus, such a scheme may avoid congestion by not allowing the queue to fill up.
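By way of a non-limiting illustration, the three drop regions described above may be sketched as follows. The function name and the `max_p` parameter (the drop probability reached at the maximum threshold) are assumptions made for illustration; with the assumed default of 1.0 the probability rises continuously from zero to one.

```python
def red_drop_probability(avg_qlen: float, min_th: float, max_th: float,
                         max_p: float = 1.0) -> float:
    """Drop probability as a function of the average queue length."""
    if min_th >= max_th:
        raise ValueError("min_th must be smaller than max_th")
    if avg_qlen < min_th:
        return 0.0            # below the minimum threshold: nothing is dropped
    if avg_qlen >= max_th:
        return 1.0            # at or above the maximum threshold: everything is dropped
    # Between the thresholds: probability rises linearly from 0 to max_p.
    return max_p * (avg_qlen - min_th) / (max_th - min_th)
```

For example, with thresholds of 20 and 60 packets, an average queue length of 40 packets lies halfway between them and yields a drop probability of 0.5 under the default `max_p`.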

さらなるスキームにおいて、ネットワークデバイスは、過渡輻輳中の廃棄を回避するために、複数の優先度キューにわたってバッファを共有することがある。 In a further scheme, a network device may share buffers across multiple priority queues to avoid drops during transient congestion.

Although cost-effective most of the time, buffer sharing allows low-priority traffic to cause increased packet loss for high-priority traffic. Similarly, long flows can prevent the buffer from absorbing incoming bursts, even if those bursts do not share the same queue. Hence, buffer sharing techniques cannot guarantee isolation across multiple (priority) queues without statically allocating buffer space.

輻輳制御（ＣＣ）アルゴリズムおよびスケジューリング技法は、ＤＴの短所を軽減し得るが、それらのアルゴリズムおよび技法は、それらの短所に十分に対処することはできない。 Congestion control (CC) algorithms and scheduling techniques can mitigate the shortcomings of DT, but these algorithms and techniques cannot fully address them.

In fact, CC may indirectly reduce buffer utilization, leaving more space for bursts, while scheduling may allow preferential treatment of certain priority queues among the priority queues that share a single port. Nonetheless, each of these techniques may sense and control separate network variables, as follows:

First, CC can only sense per-flow performance (e.g., loss or delay); it is oblivious to the state of the shared buffer and to the relative priority across multiple competing flows. Worse yet, CC controls the rate of a given flow but cannot affect the rate at which other flows are sending. Thus, CC cannot resolve buffer conflicts across multiple flows sharing the same device.

第２に、スケジューリングは、単に、キューごとの占有度を検知し、特定のポートを経るパケットの送信（デキュー）を、それらのパケットがエンキューされた後に、および、それらのパケットがエンキューされた場合にのみ制御し得る。結果として、スケジューリングは、同じポートを共有しない複数のキューにわたるバッファコンフリクトを解決し得ない。 Second, scheduling can only sense per-queue occupancy and control the transmission (dequeue) of packets through a particular port after, and only if, those packets are enqueued. As a result, scheduling cannot resolve buffer conflicts across multiple queues that do not share the same port.

コストを低減し、利用を最大化するために、ネットワークデバイスは、しばしば、共有バッファチップを頼りとし、複数のキューにわたるそのチップの割り振りは、バッファ管理アルゴリズム、例えばＤＴにより、動的に調整される。 To reduce costs and maximize utilization, network devices often rely on shared buffer chips, whose allocation across multiple queues is dynamically adjusted by buffer management algorithms, e.g., DT.

ＤＴは、依然として占有されないバッファ空間に比例して、キューごとにバッファを動的に割り振る。結果として、バッファを共有するキューが多いほど、それらのキューの各々が占有することを可能とされるバッファは少ない。ＤＴは、そのＤＴの広い配備にもかかわらず、以下の３つの主たる理由のために、マルチテナントデータセンタ環境の要件を満たさない。 DT dynamically allocates buffers per queue in proportion to the buffer space still unoccupied. As a result, the more queues that share a buffer, the less buffer each of those queues is allowed to occupy. Despite its wide deployment, DT does not meet the requirements of multi-tenant data center environments for three main reasons:

第１に、ＤＴは、アプリケーション性能のために特に重要なものである、バーストを、確実には吸収し得ない。第２に、ＤＴは、何らの隔離保証も供することができず、そのことは、トラフィックの性能が、高優先度トラフィックでさえ、そのトラフィックが通過する各デバイス上の瞬時負荷に依存的であるということを意味する。第３に、ＤＴは、トラフィック需要における突然の変化に反応することができず、なぜならば、そのＤＴは、（スループットを改善するために）バッファを高度に利用されている様態に保つことを、このことがほとんど利益をもたらさないとしても行うからである。 First, DT cannot reliably absorb bursts, which is especially important for application performance. Second, DT cannot provide any isolation guarantees, which means that traffic performance, even high-priority traffic, is dependent on the instantaneous load on each device through which the traffic passes. Third, DT cannot react to sudden changes in traffic demand because it keeps buffers highly utilized (to improve throughput) even though this brings little benefit.

なおも悪いことに、バッファ空間の一部をキューに割り振る、より先進的な手法は、バーストを吸収することなどの、より良好な使用に投入され得る、貴重なバッファ空間を実際上は無駄にする。 Worse yet, more advanced approaches that allocate a portion of the buffer space to queues effectively waste valuable buffer space that could be put to better use, such as absorbing bursts.

DT dynamically adapts the instantaneous maximum length of each queue, i.e., the threshold $T_c^i(t)$ of that queue, according to the remaining buffer space and a configurable parameter $\alpha$, for example according to the following formula:

$$T_c^i(t) = \alpha_c^i \cdot \bigl(B - Q(t)\bigr)$$

where:
$T_c^i(t)$ is the queue threshold, i.e., the allocated queue size, of class $c$ at port $i$;
$c$ is the class associated with the transmission;
$\alpha_c^i$ is the parameter of class $c$ at port $i$;
$B$ is the total buffer space; and
$Q(t)$ is the total buffer occupancy at time $t$.
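By way of a non-limiting illustration, the DT formula above may be evaluated as sketched below. The function names and the admission rule (a packet is admitted only while its queue is shorter than the threshold) are assumptions made for illustration. The last function gives the fixed point obtained when n queues with the same α are all saturated at their threshold T, so that Q = n·T and T = α·(B − n·T).

```python
def dt_threshold(alpha: float, total_buffer: float, occupancy: float) -> float:
    """T = alpha * (B - Q(t)): per-queue limit given the current total occupancy."""
    return alpha * max(total_buffer - occupancy, 0.0)


def dt_admit(queue_len: float, alpha: float, total_buffer: float,
             occupancy: float) -> bool:
    """Assumed admission rule: accept a packet only while the queue is below its threshold."""
    return queue_len < dt_threshold(alpha, total_buffer, occupancy)


def dt_steady_state_len(alpha: float, total_buffer: float, n_congested: int) -> float:
    """Solves T = alpha * (B - n*T) for T, i.e. T = alpha*B / (1 + alpha*n)."""
    return alpha * total_buffer / (1.0 + alpha * n_congested)
```

With α = 1 and B = 1200, two saturated queues each settle at 400 units, while five saturated queues each settle at 200 units, which illustrates that the more queues share the buffer, the less each of them may occupy.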

キューのαパラメータは、そのキューの最大長さ、および、他のキューに対するそのキューの相対的長さに影響を及ぼす。 The alpha parameter of a queue affects the maximum length of the queue and the length of the queue relative to other queues.

よって、オペレータは、低優先度トラフィッククラスと比較して、高優先度トラフィッククラスについて、より高いα値をセットする公算が大きい。 Thus, operators are likely to set higher alpha values for high priority traffic classes compared to low priority traffic classes.

However, despite its importance, there is no systematic way to configure α, which means that different data center vendors and operators may use different α values. Assume that the data center operator groups traffic into classes, and that each class exclusively uses a single queue at each port in order to achieve cross-class delay isolation. By way of example, storage, VoIP, and MapReduce may belong to separate traffic classes. Assume further that each traffic class is of either high or low priority. Categorizing classes as high or low priority facilitates prioritizing certain classes over others during times of high load.

この優先度設定は、共有バッファの使用と関わりがあり、スケジューリングに影響を及ぼさない。 This priority setting is related to the use of shared buffers and does not affect scheduling.

オペレータは、複数個の低優先度クラス、および、複数個の高優先度クラスを構成し得る。クラウド環境において、サービスレベルアグリーメント（ＳＬＡ）にしたがわなければならないトラフィックは、高優先度であることになる。 An operator may configure multiple low priority classes and multiple high priority classes. In a cloud environment, traffic that must comply with a service level agreement (SLA) will be high priority.

Referring now to FIG. 4, a performance graph 400 of the DT scheme is shown. At time t0, an incoming burst on Q2 abruptly changes the buffer occupancy. In the transient state (t0...t2), the threshold of Q1 is lower than the length of Q1. Hence, all incoming packets of Q1 are dropped in order to free buffer space for Q2. Nevertheless, Q2 experiences drops (times t1...t2) before reaching its fair steady-state allocation.

高優先度バースト（Ｑ２についての）は、バッファが定常状態に達する前に廃棄されたということが確認される。これらの廃棄は、（ｉ）バーストが到着したときに、より多くの利用可能なバッファが存した（定常状態割り振り）、または、（ｉｉ）バッファが、バーストのための空き場所をつくるために、より高速に空にされ得た（過渡状態割り振り）ならば、回避され得た。 It is observed that high priority bursts (for Q2) were discarded before the buffers reached steady state. These discards could have been avoided if (i) there had been more buffers available when the bursts arrived (steady state allocation), or (ii) the buffers could have been emptied more quickly to make room for the bursts (transient state allocation).

Thus, DT exhibits the following inefficiencies:
1. DT does not provide minimum buffer guarantees: DT enforces the precedence of a queue or class over others through a static parameter (α). Nevertheless, α provides no guarantee, because the actual per-queue threshold depends on the overall remaining buffer, which can reach arbitrarily and uncontrollably low values even in steady state.
2. DT does not provide burst tolerance guarantees: In addition to the unpredictability of its steady-state allocation, the transient-state allocation of DT is uncontrollable. This is especially problematic when it comes to burst absorption. The main reason for this limitation is that DT perceives buffer space as a scalar quantity, ignoring the expected occupancy of that buffer space over time.

向上させられた動的スキームは、キューレベルおよびバッファレベルの両方の情報に依存して、各キューが使用し得るバッファ空間を制限する。 The improved dynamic scheme relies on both queue-level and buffer-level information to limit the buffer space each queue can use.

In particular, a threshold, which is the maximum length of each queue, may be defined, and the buffer amount allocated, as follows:

$$T_c^i(t) = \alpha_c \cdot \frac{1}{N_p(t)} \cdot \gamma_c^i(t) \cdot \bigl(B - B_{oc}(t)\bigr)$$

where:
$c$ is the class associated with the transmission.
$T_c^i(t)$ is the threshold size, i.e., the length of the queue assigned to the $i$-th port for class $c$.
$\alpha_c$ is the value assigned to the class to which the queue belongs.
$N_p(t)$ is the number of congested (non-empty) queues of the priority (low or high) to which the class belongs. If there are few non-empty queues, a larger threshold can be allocated, and vice versa.
$\gamma_c^i(t)$ is the per-port normalized dequeue rate of the queue associated with the port, i.e., the clearing rate of the particular queue.
$B$ is the total buffer and $B_{oc}(t)$ is the occupied buffer; hence $B - B_{oc}(t)$ is the remaining buffer.
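By way of a non-limiting illustration, the threshold formula above may be evaluated as sketched below. The function and parameter names are assumptions made for illustration, as is the clamping of the congested-queue count to at least one (to avoid a division by zero) and of the remaining buffer to at least zero.

```python
def enhanced_threshold(alpha_c: float, n_congested: int, gamma: float,
                       total_buffer: float, occupied: float) -> float:
    """T = alpha_c * (1 / N_p) * gamma * (B - B_oc) for one queue."""
    n = max(n_congested, 1)                        # assumption: treat N_p = 0 as 1
    remaining = max(total_buffer - occupied, 0.0)  # remaining buffer, never negative
    return alpha_c * (1.0 / n) * gamma * remaining
```

For example, with α_c = 2, four congested queues, γ = 0.5, B = 1000 and 200 units occupied, the threshold is 200 units; halving the number of congested queues doubles it to 400, and halving the dequeue rate instead reduces it to 100.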

This formula may handle situations such as one in which the overall load is high but there is little load for a particular class, so that a smaller space suffices for that class's queue, or one in which the load is high but the dequeuing rate of the relevant queue is also high, so that a smaller space suffices for that queue.

$N_p(t)$ bounds the steady-state allocation: the per-queue threshold is divided by $N_p$. This factor matters to the allocation in two ways: (i) it bounds the per-class and per-priority occupancy, and (ii) it enables weighted fairness across multiple classes of the same priority.

$\gamma_c^i(t)$ governs the duration of the transient state. Buffer is allocated to each queue in proportion to the dequeuing rate ($\gamma$) of that queue. Combined with the upper bound, the $\gamma$ factor can change the duration of the transient state. In effect, given some amount of buffer per priority, that buffer is divided among the queues in proportion to their drain rates, which effectively minimizes the time needed to empty the buffer. In short, the time required to move from one steady-state allocation to another is reduced.

上記のスキームは、よって、スループットを改善し、キューイング遅延を低減し、一方で、全般的に高い負荷であるが、特定のクラスについては負荷がほとんどないという状況、高い負荷であるが、関連性があるキューのデキューイングレートは高いという状況などの状況を扱うことにより、所与のバーストの吸収を確実にする。 The above scheme thus improves throughput and reduces queuing delays while ensuring absorption of a given burst by handling situations such as high overall load but little load for a particular class, high load but high dequeuing rate for the relevant queue, etc.

However, as is evident from the above formula, all time-related factors are calculated for the current time, when the packet is received. Once transmission has started and is under way, these factors may change, so their values may become less relevant and less useful and may therefore lead to flawed results. Thus, this formula does not provide a satisfactory solution either.

Therefore, in accordance with the present disclosure, the dynamic threshold for a buffer, i.e., the queue length, may be determined according to the time at which the transmission is expected to take place and the particular transmitting switch. For example, the threshold may be calculated by the following formula:

$$T_c^i(t, t+\tau, \mathit{location}) = \alpha_c \cdot \frac{1}{N_p'(t, t+\tau, \mathit{location})} \cdot {\gamma_c^i}'(t, t+\tau, \mathit{location}) \cdot \bigl(B - B_{oc}'(t, t+\tau, \mathit{location})\bigr)$$

where:
$i$ is the index of the port to which the traffic units are destined.
$c$ is the class of the plurality of traffic units.
$t$ is the current time.
$\tau$ is the time difference to the future time.
$\alpha_c$ is the coefficient assigned to class $c$.
$\mathit{location}$ is the physical location of the switch within the data center, provided, for example, as a combination of a top-of-rack switch identifier and a core switch identifier.
$N_p'(t, t+\tau, \mathit{location})$ is a variation or combination of $N_p(t, \mathit{location})$, the number of congested queues of the priority $p$ of the class at time $t$ for the particular switch, and $N_p(t+\tau, \mathit{location})$, the number of congested queues of the priority $p$ of the class at time $t+\tau$ for the particular switch. For example, its value may be equal to either of these numbers, to their average, or the like.
$B$ is the total buffer, $B_{oc}(t, \mathit{location})$ is the occupied buffer at time $t$, and $B_{oc}(t+\tau, \mathit{location})$ is the occupied buffer at time $t+\tau$. $B - B_{oc}'(t, t+\tau, \mathit{location})$ is a variation or combination of $B - B_{oc}(t, \mathit{location})$, the remaining buffer at time $t$ for the particular switch, and $B - B_{oc}(t+\tau, \mathit{location})$, the remaining buffer at time $t+\tau$ for the particular switch. For example, its value may be equal to either of these numbers, to their average, or the like.
${\gamma_c^i}'(t, t+\tau, \mathit{location})$ is a variation or combination of $\gamma_c^i(t, \mathit{location})$, the per-port normalized dequeue rate of the $i$-th queue of class $c$ at time $t$ for the particular switch, and $\gamma_c^i(t+\tau, \mathit{location})$, the per-port normalized dequeue rate of the $i$-th queue of class $c$ at time $t+\tau$ for the particular switch. For example, its value may be equal to either of these numbers, to their average, or the like.
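By way of a non-limiting illustration, the predictive threshold above may be evaluated as sketched below. The names `QueueState`, `combine` and `predictive_threshold` are assumptions made for illustration. The sketch implements three of the combinations mentioned above (the current value, the predicted value, or their average) and clamps the congested-queue count to at least one.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    n_congested: float  # N_p: congested queues of the class's priority
    occupied: float     # B_oc: occupied buffer
    gamma: float        # normalized dequeue rate of the queue


def combine(now: float, future: float, mode: str = "average") -> float:
    """Combine the value at time t with the value predicted for time t + tau."""
    if mode == "current":
        return now
    if mode == "future":
        return future
    if mode == "average":
        return (now + future) / 2.0
    raise ValueError(f"unknown mode: {mode}")


def predictive_threshold(alpha_c: float, total_buffer: float, now: QueueState,
                         predicted: QueueState, mode: str = "average") -> float:
    """T = alpha_c * (1 / N_p') * gamma' * (B - B_oc') using combined quantities."""
    n = max(combine(now.n_congested, predicted.n_congested, mode), 1.0)
    gamma = combine(now.gamma, predicted.gamma, mode)
    occupied = combine(now.occupied, predicted.occupied, mode)
    return alpha_c / n * gamma * max(total_buffer - occupied, 0.0)
```

In this sketch, a switch that is lightly loaded now but predicted to be congested at t + τ receives a smaller threshold in the "average" and "future" modes than it would from the current state alone.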

The values of $N_p(t+\tau, \mathit{location})$, $B_{oc}(t+\tau, \mathit{location})$, and $\gamma_c^i(t+\tau, \mathit{location})$ may be obtained from an artificial intelligence engine, e.g., a neural network, that is trained on multiple transmissions at various times, switches, and the like.

いくつかの実施形態において、τの好まれる値が、例えば時間、日、月などを含むｔ、および、関連性がある時間におけるそのスイッチの特定の挙動をも必然的に伴い得るスイッチのロケーションに基づいて、さらに学習されることがある。一般的に言えば、τは、しきい値をセットすることが、廃棄されるパケットを消失させる、または低減することに役立ち得るように、大部分の送信についての時間フレームである、数ミリ秒、数秒、および数分の間で変動することがある。スイッチのロケーションは、特定のスイッチ識別子として表されることがある。 In some embodiments, a preferred value of τ may be further learned based on t, which may include, for example, hours, days, months, etc., and the location of the switch, which may also entail the specific behavior of that switch at the relevant time. Generally speaking, τ may vary between milliseconds, seconds, and minutes, which is the time frame for most transmissions, such that setting a threshold may help eliminate or reduce dropped packets. The location of the switch may be represented as a specific switch identifier.

本開示のいくつかの実現形態において、キューのクリアリングレートであるγ^ｉ _ｃ（ｔ＋τ，ｌｏｃａｔｉｏｎ）は、従来の方法においてよりも高いことがあり、なぜならば、複数個のパケットが同時的に出力され得るので、そのレートは、先入れ先出し（ＦＩＦＯ）機構により限定されないからである。 In some implementations of the present disclosure, the queue clearing rate γ ⁱ _c (t+τ,location) can be higher than in conventional methods because the rate is not limited by a first-in, first-out (FIFO) mechanism since multiple packets can be output simultaneously.

次に図５を参照して、本開示のいくつかの例示的な実施形態による、送信を分類し、バッファサイズを決定するためのエンジンを訓練するための方法におけるステップのフローチャートを示す。 Referring now to FIG. 5, a flowchart of steps in a method for training an engine for classifying transmissions and determining buffer sizes is shown, in accordance with some example embodiments of the present disclosure.

方法は、特定のスイッチまたは特定のデータセンタと関連付けられるか否かを問わず、任意のコンピューティングプラットフォームにより遂行され得る。 The method may be performed by any computing platform, whether or not associated with a particular switch or a particular data center.

In step 504, features may be extracted from incoming training traffic data, which includes traffic units such as packets. The traffic units are to be transmitted by a switch through a port, the port having an associated queue. The features may include, for example, source address, destination address, packet arrival rate, time and date, the particular switch and the particular data center, priority level (e.g., traffic under a service level agreement, or MapReduce traffic related to parallel computing, will entail high priority), and the like. The features may be extracted from a single packet or from a sequence of packets arriving within a predetermined time interval.
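By way of a non-limiting illustration, feature extraction over a predetermined time interval may be sketched as follows. The `Packet` fields, the feature names, the priority encoding, and the choice to summarize one flow per call are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Packet:
    src: str          # source address
    dst: str          # destination address
    arrival_s: float  # arrival timestamp, in seconds
    priority: int     # assumed encoding: 1 = high, 0 = low


def extract_features(packets: List[Packet], switch_id: str,
                     window_s: float) -> Dict[str, Union[str, float, int]]:
    """Summarize the packets of one flow seen during one window of window_s seconds."""
    if not packets or window_s <= 0:
        raise ValueError("need at least one packet and a positive window")
    return {
        "src": packets[0].src,
        "dst": packets[0].dst,
        "switch_id": switch_id,
        "window_start_s": min(p.arrival_s for p in packets),
        "packet_rate_pps": len(packets) / window_s,     # packet arrival rate
        "priority": max(p.priority for p in packets),   # highest priority seen
    }
```

Categorical entries such as addresses and the switch identifier would still need a numeric encoding before being used as a feature vector; that step is left out of this sketch.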

In step 508, the feature vectors may be clustered into a plurality of distinct clusters, such that the feature vectors assigned to each cluster are more similar to one another than to the feature vectors assigned to other clusters. Any clustering algorithm that is currently known or that will become known may be applied, such as, but not limited to, K-means, Gaussian mixture models, support vector machines, or density-based, distribution-based, centroid-based, or hierarchical clustering.

別のベクトルを与えられると、分類器が最も適切なクラスタを出力するように、分類器が、次いで、特徴ベクトル、および、各ベクトルに割り当てられるクラスタを基に訓練され得る。 A classifier can then be trained on the feature vectors and the clusters assigned to each vector, so that given another vector, the classifier will output the most appropriate cluster.
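By way of a non-limiting illustration, clustering followed by classification may be sketched as follows. The sketch uses K-means with caller-supplied initial centroids and, as a stand-in for the trained classifier, a nearest-centroid rule; both choices and all names are assumptions made for illustration, and any of the algorithms listed above could be substituted.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest(vec: Vector, centroids: List[Vector]) -> int:
    """Index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(vec, centroids[k]))


def k_means(vectors: List[Vector], init: List[Vector],
            iters: int = 20) -> List[List[float]]:
    """Plain K-means: assign each vector to its nearest centroid, then recompute the means."""
    centroids = [list(c) for c in init]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in centroids]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for k, members in enumerate(groups):
            if members:  # keep the previous centroid if a cluster received no vectors
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids
```

After `k_means` returns, calling `nearest(new_vector, centroids)` plays the role of the classifier: it outputs the index of the most appropriate cluster for a vector that was not part of the training data.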

ステップ５１２において、クラスタの各々に関連性があるデータセットが、クラスタに割り当てられるトラフィックデータを基に生成され得る。データセットは、各特定のクラスに割り当てられる特徴ベクトルを含み得る。データセットは、訓練データセット、検証データセット、および試験データセットへと分割され得る。 In step 512, a data set relevant to each of the clusters may be generated based on the traffic data assigned to the cluster. The data set may include a feature vector assigned to each particular class. The data set may be divided into a training data set, a validation data set, and a test data set.
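By way of a non-limiting illustration, the division of a per-cluster dataset into training, validation, and test datasets may be sketched as follows. The 70/15/15 proportions, the fixed random seed, and the function name are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(samples: Sequence, train_pct: int = 70, val_pct: int = 15,
                  seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle the samples and split them into training, validation, and test lists."""
    if train_pct <= 0 or val_pct < 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must leave room for a test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)       # seeded for reproducibility
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

A random shuffle is used here for simplicity; for time-ordered traffic data a chronological split may be preferable so that the test dataset lies strictly after the training dataset.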

In step 516, an artificial intelligence (AI) engine, also referred to as a traffic model and implemented, for example, as a neural network, a deep neural network, or the like, may be trained on the training dataset. The traffic model is used to predict, for a switch at a particular location and at a future time t+τ, the load on a particular queue $N_p(t, t+\tau, \mathit{location})$, the available memory $B_{oc}(t, t+\tau, \mathit{location})$, and the drain rate $\gamma_c^i(t, t+\tau, \mathit{location})$. Each data item from the training dataset, comprising one or more feature vectors, may be associated with the relevant values, referred to as ground truth. For example, for a plurality of τ values and for location values identifying the switch, each vector in the training dataset may be associated with values of $N_p(t, t+\tau, \mathit{location})$, $B_{oc}(t, t+\tau, \mathit{location})$, and $\gamma_c^i(t, t+\tau, \mathit{location})$ calculated using predictive analytics. The values may be calculated from a large amount of information accumulated for the switch and for time t.

Each traffic model is thus trained on the feature vectors of one cluster to output, for each feature vector and for one or more τ values, the values of $N_p(t, t+\tau, \mathit{location})$, $B_{oc}(t, t+\tau, \mathit{location})$, and $\gamma_c^i(t, t+\tau, \mathit{location})$, meaning that said values relate to the time interval τ following time t. The engine may also be trained to output the τ value that yields the best value or combination of values. Such a value may indicate, for example, that postponing the transmission by a small τ may make it possible to obtain a larger queue and avoid the risk of losing packets. Each such engine is a spatio-temporal engine, since it receives both the time and the switch location.
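By way of a non-limiting illustration, the selection of a τ value may be sketched as follows. The description does not define what makes a combination of values "best"; the sketch assumes that the best τ is the one whose predicted values yield the largest queue threshold under a formula of the form given earlier in this description. All names are assumptions made for illustration.

```python
from typing import Dict, Tuple

Prediction = Tuple[float, float, float]  # (N_p, B_oc, gamma) predicted for time t + tau


def threshold(alpha_c: float, total_buffer: float, pred: Prediction) -> float:
    """Queue threshold implied by one prediction: alpha_c / N_p * gamma * (B - B_oc)."""
    n, occupied, gamma = pred
    return alpha_c / max(n, 1.0) * gamma * max(total_buffer - occupied, 0.0)


def best_tau(alpha_c: float, total_buffer: float,
             predictions: Dict[float, Prediction]) -> float:
    """Assumed criterion: the candidate tau whose prediction gives the largest threshold."""
    return max(predictions,
               key=lambda tau: threshold(alpha_c, total_buffer, predictions[tau]))
```

For example, if the buffer is predicted to be nearly full now and again in 100 ms but lightly occupied in 10 ms, this criterion selects the 10 ms delay.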

エンジンが、訓練され、トラフィックモデルを表すと、エンジンは、モデルのハイパーパラメータのチューニングのために、検証データセットを使用して検証され、オーバーフィッティングを回避するために、試験データセットを使用して試験され得る。 Once the engine is trained and represents a traffic model, the engine may be validated using a validation dataset for tuning the model's hyperparameters and tested using a test dataset to avoid overfitting.

It will be appreciated that in further embodiments, the same engine may be adapted to process feature vectors of various classes together with a class indication, and to operate in accordance with the relevant class.

次に図６を参照して、本開示のいくつかの例示的な実施形態による、キューしきい値、すなわち、特定のキューに割り当てられることになるバッファの量を決定するための方法におけるステップのフローチャートを示す。 Referring now to FIG. 6, a flowchart of steps in a method for determining a queue threshold, i.e., the amount of buffer to be allocated to a particular queue, is shown, in accordance with some example embodiments of the present disclosure.

方法は、送信に容認不可能な遅延を加えないように、高速で結果を取得するために、特定のスイッチに属する、または、特定のスイッチと通信しているコンピューティングプラットフォームなどの、任意のコンピューティングプラットフォームにより遂行され得る。 The method may be performed by any computing platform, such as a computing platform belonging to or in communication with a particular switch, to obtain results quickly without adding unacceptable delays to the transmission.

ステップ６０４において、特徴が、上記のステップ５０４と同様に、入来するトラフィックについて抽出され得る。 In step 604, features may be extracted for the incoming traffic, similar to step 504 above.

In step 608, the feature vectors may be provided to the classifier trained in step 508 above, to obtain a class for a set of one or more feature vectors. The class may correspond to one of the clusters into which the training traffic data of FIG. 5 was clustered. Once the feature vectors have been assigned to a class, this may entail selecting the AI engine relevant to that class.
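The classification and engine selection of step 608 can be sketched as follows. A nearest-centroid rule stands in for the trained classifier, which is an assumption made to keep the example short; the class labels, the `Engine` signature, and the function names are likewise hypothetical.

```python
import math
from typing import Callable, Dict, Mapping, Sequence

Vector = Sequence[float]
# Hypothetical engine signature: (feature vector, tau, location) -> predicted values.
Engine = Callable[[Vector, float, str], Dict[str, float]]


def classify(vector: Vector, centroids: Mapping[str, Vector]) -> str:
    """Return the label of the cluster centroid closest to the vector."""
    if not centroids:
        raise ValueError("at least one centroid is required")
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


def select_engine(vector: Vector,
                  centroids: Mapping[str, Vector],
                  engines: Mapping[str, Engine]) -> Engine:
    """Classify the vector, then look up the engine trained for that class."""
    return engines[classify(vector, centroids)]
```

Keeping one engine per class in a mapping mirrors the description, in which assigning a class implies which traffic model is applied next.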

In step 612, the particular engine may be applied to the feature vector to obtain one or more sets of values that indicate the predicted traffic volume. For example, for one or more values of τ, each set of predicted values may include Np(t, t+τ, location), Boc(t, t+τ, location), and γ^i_c(t, t+τ, location). In some embodiments, the engine may also output the τ value that yields the best combination of Np, Boc, and γ^i_c. In an alternative embodiment, the class may be provided together with the feature vector to a single engine, which internally operates the particular engine according to that class.

ステップ６１６において、結果が、特定の特徴ベクトルについて組み合わされ得る。例えば、現在時間、すなわちτ＝０についての値が、τの他の値について取得される値の、１つもしくは複数のセットと比較され得、または、現在時間についての値が、例えば、τの他の値について関連性がある値と平均をとられるなど、組み合わされ得る。 In step 616, the results may be combined for a particular feature vector. For example, the value for the current time, i.e., τ=0, may be compared to one or more sets of values obtained for other values of τ, or the value for the current time may be combined, e.g., averaged, with relevant values for other values of τ.
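One way to combine the τ=0 values with the values predicted for other τ is a weighted average, sketched below. The dictionary keys, the `weight_current` parameter, and the equal weighting of all future τ values are assumptions for illustration; the text only says the values may be compared or combined, for example averaged.

```python
from statistics import mean
from typing import Dict, Mapping

Prediction = Mapping[str, float]  # e.g. {"Np": ..., "Boc": ..., "gamma": ...}


def combine_predictions(current: Prediction,
                        future_by_tau: Mapping[float, Prediction],
                        weight_current: float = 0.5) -> Dict[str, float]:
    """Blend the current (tau = 0) values with the mean of the future predictions."""
    if not 0.0 <= weight_current <= 1.0:
        raise ValueError("weight_current must lie in [0, 1]")
    if not future_by_tau:
        return dict(current)  # nothing to blend with
    combined = {}
    for key, now in current.items():
        # Average this quantity over every predicted tau, then mix with the present value.
        future_mean = mean(pred[key] for pred in future_by_tau.values())
        combined[key] = weight_current * now + (1.0 - weight_current) * future_mean
    return combined
```

Setting `weight_current` to 1.0 ignores the predictions entirely, which gives a simple baseline against which the predictive behavior can be compared.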

結果的に生じるしきい値が、次いで適用され得、関連性があるポートのキューが、しきい値によって量を割り当てられ得る。 The resulting thresholds can then be applied and the relevant port's queues can be allocated amounts according to the thresholds.
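The threshold formula recited in the claims, T = α_c · (1 / Np′) · γ′ · (B − Boc′), can be evaluated as in the sketch below. The parameter names are illustrative, and two behaviors are assumptions not stated in the text: rejecting a non-positive queue count, and clamping the remaining buffer at zero.

```python
def queue_threshold(alpha_c: float,
                    n_congested: float,
                    dequeue_rate: float,
                    total_buffer: float,
                    occupied_buffer: float) -> float:
    """Evaluate T = alpha_c * (1 / Np') * gamma' * (B - Boc').

    alpha_c         coefficient assigned to the class
    n_congested     Np', the (combined) number of congested queues
    dequeue_rate    gamma', the (combined) normalized dequeue rate
    total_buffer    B, the total buffer of the switch
    occupied_buffer Boc', the (combined) occupied buffer
    """
    if n_congested <= 0:
        raise ValueError("n_congested must be positive")  # assumption: avoid division by zero
    # Assumption: an over-committed buffer yields a zero threshold, not a negative one.
    remaining = max(total_buffer - occupied_buffer, 0.0)
    return alpha_c * (1.0 / n_congested) * dequeue_rate * remaining
```

For example, with α_c = 2, four congested queues, a normalized dequeue rate of 0.5, and 800 of 1000 buffer units free, the threshold is 2 · 0.25 · 0.5 · 800 = 200 units.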

As disclosed above, γ^i_c may be higher than in conventional methods if the queues are not limited to operating in a first-in, first-out (FIFO) paradigm. This may be enabled, for example, by using optical transmission, as detailed in U.S. patent application Ser. No. 63/167,082, entitled "Optical Switch with All-Optical Memory Buffer," assigned to the same assignee as the present application.

次に図７を参照して、本開示のいくつかの例示的な実施形態による、１つまたは複数のバッファについてキューサイズを決定するためのシステムのブロック図を示す。 Referring now to FIG. 7, a block diagram of a system for determining queue size for one or more buffers is shown, in accordance with some example embodiments of the present disclosure.

図７のブロック図は、上記の図５および図６の方法を実行するように構成され得る。図７のシステムは、１つまたは複数のコンピューティングプラットフォーム７００を含み得る。図７は単一のコンピューティングプラットフォームを示すが、方法は、下記で詳説される構成要素のうちの１つまたは複数を各々が含む、様々なコンピューティングプラットフォームにより実行され得るということが理解されるであろう。よって、図７のコンピューティングプラットフォームは、動作可能に接続され得る、１つもしくは複数のコンピューティングプラットフォームとして実現され得るものであり、または、データは、１つのコンピューティングプラットフォームから他のものに、直接もしくは間接的に提供され得る。例えば、１つのコンピューティングプラットフォーム７００は、データセンタのアクセススイッチ、アグリゲートスイッチ、またはコアスイッチの一部であり得るものであり、図６の方法を実行し得るものであり、一方で、別のコンピューティングプラットフォーム７００は、サーバ、デスクトップコンピュータ、ラップトップコンピュータなどのリモートコンピューティングプラットフォームであり得るものであり、図５の方法を実行し得る。 The block diagram of FIG. 7 may be configured to perform the methods of FIG. 5 and FIG. 6 above. The system of FIG. 7 may include one or more computing platforms 700. Although FIG. 7 shows a single computing platform, it will be understood that the method may be performed by various computing platforms, each including one or more of the components detailed below. Thus, the computing platform of FIG. 7 may be realized as one or more computing platforms that may be operatively connected, or data may be provided directly or indirectly from one computing platform to another. For example, one computing platform 700 may be part of an access switch, aggregate switch, or core switch of a data center and may perform the method of FIG. 6, while another computing platform 700 may be a remote computing platform, such as a server, desktop computer, laptop computer, etc., and may perform the method of FIG. 5.

コンピューティングプラットフォーム７００は、ワイドエリアネットワーク、ローカルエリアネットワーク、イントラネット、インターネット、メモリ記憶デバイスの転送などの、任意の通信チャネルによって、他のコンピューティングプラットフォームと通信し得る。 The computing platform 700 may communicate with other computing platforms via any communication channel, such as a wide area network, a local area network, an intranet, the Internet, or memory storage device transfers.

コンピューティングプラットフォーム７００は、１つもしくは複数の中央処理ユニット（ＣＰＵ）、マイクロプロセッサ、電子回路、集積回路（ＩＣ）などであり得るプロセッサ７０４を含み得る。プロセッサ７０４は、例えば、下記で詳説される記憶デバイス７１６上に記憶されるモジュールを、メモリにロードし、起動させることにより、求められる機能性を提供するように構成され得る。プロセッサ７０４は、同じプラットフォーム上に位置を定められるか否かを問わず、１つまたは複数のプロセッサとして実現され得るということが理解されるであろう。 The computing platform 700 may include a processor 704, which may be one or more central processing units (CPUs), microprocessors, electronic circuits, integrated circuits (ICs), etc. The processor 704 may be configured to provide the required functionality, for example, by loading into memory and activating modules stored on a storage device 716, which will be described in more detail below. It will be appreciated that the processor 704 may be implemented as one or more processors, whether or not located on the same platform.

コンピューティングプラットフォーム７００は、ディスプレイ、スピーカフォン、ヘッドセット、ポインティングデバイス、キーボード、タッチスクリーンなどの入出力（Ｉ／Ｏ）デバイス７０８を含み得る。Ｉ／Ｏデバイス７０８は、ユーザからの入力を受信すること、および、ユーザへの出力を提供すること、例えば、プリファレンスを受信すること、性能統計を表示することなどを行うために利用され得る。 The computing platform 700 may include input/output (I/O) devices 708, such as a display, speakerphone, headset, pointing device, keyboard, touch screen, etc. The I/O devices 708 may be used to receive input from and provide output to a user, such as receiving preferences, displaying performance statistics, etc.

コンピューティングプラットフォーム７００は、他のスイッチ、サーバ、ＰｏＤ、データセンタなどの他のコンピューティングプラットフォームと通信するために通信デバイス７１２を含み得る。通信デバイス７１２は、任意の通信プロトコルにより、および、インターネット、イントラネット、ＬＡＮ、ＷＡＮなどの、任意のチャネル上で通信するように適合させられ得る。 The computing platform 700 may include a communication device 712 for communicating with other computing platforms, such as other switches, servers, PoDs, data centers, etc. The communication device 712 may be adapted to communicate over any communication protocol and over any channel, such as the Internet, an intranet, a LAN, a WAN, etc.

コンピューティングプラットフォーム７００は、ハードディスクドライブ、フラッシュディスク、ランダムアクセスメモリ（ＲＡＭ）、メモリチップなどの記憶デバイス７１６を含み得る。いくつかの例示的な実施形態において、記憶デバイス７１６は、下記で列挙されるモジュールのうちの任意のものと関連付けられる動作、または、上記の図５および図６の方法のステップを遂行することをプロセッサ７０４に行わせるように動作可能なプログラムコードを保持し得る。プログラムコードは、下記で詳説されるように命令を実行するように適合させられる、関数、ライブラリ、スタンドアローンプログラムなどの、１つまたは複数の実行可能ユニットを含み得る。 The computing platform 700 may include a storage device 716, such as a hard disk drive, a flash disk, a random access memory (RAM), a memory chip, etc. In some exemplary embodiments, the storage device 716 may hold program code operable to cause the processor 704 to perform operations associated with any of the modules listed below or to perform the steps of the methods of FIGS. 5 and 6 above. The program code may include one or more executable units, such as functions, libraries, stand-alone programs, etc., adapted to execute instructions as detailed below.

記憶デバイス７１６は、特徴を、パケットなどの単一のトラフィックユニットから、または、２つ以上のそのようなユニットのシーケンスから抽出するための特徴抽出コンポーネント７２０を含み得る。特徴は、例えば、ソースアドレス、デスティネーションアドレス、パケット到着レート、時間および日付、特定のスイッチおよび特定のデータセンタ、優先度レベル（例えば、サービスレベルアグリーメントまたはＭａｐＲｅｄｕｃｅは、高優先度のものであることになる）であり得る。 The storage device 716 may include a feature extraction component 720 for extracting features from a single traffic unit, such as a packet, or from a sequence of two or more such units. Features may be, for example, source addresses, destination addresses, packet arrival rates, time and date, a particular switch and a particular data center, priority level (e.g., a service level agreement or MapReduce would be of high priority).
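A feature vector built from a short sequence of packets might look like the following sketch. The `Packet` fields and the four chosen features (arrival rate, mean size, highest priority, number of distinct destinations) are assumptions for illustration; the text lists candidate features but does not fix an encoding.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Packet:
    src: str          # source address
    dst: str          # destination address
    arrival_s: float  # arrival timestamp in seconds
    size_bytes: int
    priority: int     # larger means more urgent (assumed convention)


def extract_features(packets: Sequence[Packet]) -> List[float]:
    """Return [arrival rate (pkt/s), mean size (bytes), max priority, distinct destinations]."""
    if not packets:
        raise ValueError("at least one packet is required")
    times = [p.arrival_s for p in packets]
    span = max(times) - min(times)
    # n packets delimit n - 1 inter-arrival gaps; a single packet has no measurable rate.
    rate = (len(packets) - 1) / span if span > 0 else 0.0
    mean_size = sum(p.size_bytes for p in packets) / len(packets)
    max_priority = float(max(p.priority for p in packets))
    distinct_dst = float(len({p.dst for p in packets}))
    return [rate, mean_size, max_priority, distinct_dst]
```

Categorical features such as the switch or data center identity would need their own encoding (for example one-hot), which is omitted here.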

記憶デバイス７１６は、トラフィックユニットから抽出された複数の特徴ベクトルを受信し、それらの抽出された特徴ベクトルを、クラスまたはクラスタとも呼称されるグループへクラスタリングし、それにより、同じグループに割り当てられる特徴ベクトルが、あらかじめ決定されたメトリクスによって、異なるグループに割り当てられる特徴ベクトルよりも互いに類似するようにする、クラスタリングコンポーネント７２４を含み得る。 The storage device 716 may include a clustering component 724 that receives feature vectors extracted from traffic units and clusters the extracted feature vectors into groups, also referred to as classes or clusters, such that feature vectors assigned to the same group are more similar to each other according to a predetermined metric than feature vectors assigned to different groups.
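The clustering step can be illustrated with plain k-means, used here as a stand-in because the text does not name a clustering algorithm. Euclidean distance as the predetermined metric and seeding the centroids with the first k vectors are both assumptions.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def kmeans(vectors: Sequence[Vector], k: int,
           iterations: int = 20) -> Tuple[List[List[float]], List[int]]:
    """Cluster vectors into k groups; return (centroids, cluster index per vector)."""
    if not 0 < k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Deterministic initialization (an assumption): the first k vectors seed the centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: math.dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment
```

The returned centroids could then serve as the classes against which new feature vectors are classified at run time.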

記憶デバイス７１６は、例えば特徴抽出コンポーネント７２０により抽出されたような、１つまたは複数の特徴ベクトルを受信し、それらの特徴ベクトルを、クラスタリングコンポーネント７２４により創出されたクラスまたはクラスタへと分類するための分類器コンポーネント７２８を含み得る。 The storage device 716 may include a classifier component 728 for receiving one or more feature vectors, such as those extracted by the feature extraction component 720, and classifying those feature vectors into classes or clusters created by the clustering component 724.

The storage device 716 may include a prediction engine training component 732 for receiving, for each such class, the relevant feature vectors together with one or more ground-truth values, such as Np(t, t+τ, location), Boc(t, t+τ, location), and γ^i_c(t, t+τ, location), for various τ values and location vectors. After training, each such trained engine is configured to receive as input a feature vector, and possibly τ and location values, and to output the relevant Np(t, t+τ, location), Boc(t, t+τ, location), and γ^i_c(t, t+τ, location). In some embodiments, the engine does not receive a τ value, but instead outputs a set of Np(t, t+τ, location), Boc(t, t+τ, location), and γ^i_c(t, t+τ, location) values for each of a small number of τ values, or outputs a particular τ value together with the associated values that yield a preferred combination. The training engine may partition the available feature vectors into training vectors, validation vectors, and test vectors. The trained engine may be a neural network, a deep neural network, a recurrent neural network (RNN), a long short-term memory (LSTM) network, or any other artificial intelligence engine.

A gated recurrent unit (GRU) is a gating mechanism in an RNN. In some embodiments, GRUs may be used to improve performance. The use of GRUs is further detailed in "Recurrent Neural Networks: building GRU cells VS LSTM cells in Pytorch" by Nikolas Adaloglou, published on September 17, 2020 at https://theaisummer.com/gru; in "Evolution: from vanilla RNN to GRU & LSTMs", published on August 21, 2017 and available at https://towardsdatascience.com/lecture-evolution-from-vanilla-rnn-to-gru-lstms-58688f1da83a; and in the associated slide show available at https://docs.google.com/presentation/d/1UHXrKL1oTdgMLoAHHPfMM_srDO0BCyJXPmhe4DNh_G8/pub?start=false&loop=false&delayms=3000&slide=id.g24de73a70b_0_0. All of these non-patent references are incorporated by reference in their entirety for any purpose. It is contemplated that the read and forget gates in the GRU depend on the traffic at a particular time and location.
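For readers unfamiliar with GRUs, the following sketch shows one step of a textbook GRU cell with a scalar hidden state. It follows the common formulation with an update gate z and a reset gate r; how these correspond to the "read and forget gates" mentioned above is not specified in the text, and the class name, field names, and scalar state are simplifications assumed for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


@dataclass
class GRUCell:
    """Textbook GRU cell with a vector input and a scalar hidden state."""
    w_z: Sequence[float]  # update gate: input weights
    u_z: float            # update gate: recurrent weight
    b_z: float            # update gate: bias
    w_r: Sequence[float]  # reset gate
    u_r: float
    b_r: float
    w_h: Sequence[float]  # candidate state
    u_h: float
    b_h: float

    def step(self, x: Sequence[float], h: float) -> float:
        """Return the next hidden state given input x and previous state h."""
        def dot(w: Sequence[float]) -> float:
            return sum(wi * xi for wi, xi in zip(w, x))

        z = sigmoid(dot(self.w_z) + self.u_z * h + self.b_z)  # how much to update
        r = sigmoid(dot(self.w_r) + self.u_r * h + self.b_r)  # how much history to expose
        h_cand = math.tanh(dot(self.w_h) + self.u_h * (r * h) + self.b_h)
        # Convention used here: z = 1 replaces the state, z = 0 keeps it.
        return (1.0 - z) * h + z * h_cand
```

In a traffic model, x would be a feature vector (possibly including time and location) and h would carry information from earlier traffic. Note that some libraries use the opposite convention for z in the final interpolation.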

The storage device 716 may include a control and data flow component 736 for managing the flow, such that each of the components detailed above receives its expected input, its output is directed to the proper destination, and the required calculations are performed. For example, the control and data flow component 736 may receive traffic units, invoke the feature extraction component 720 to extract features, classify the features, invoke the engine relevant to the class, calculate the applicable values, for example based on the current and predicted values, and provide the values for buffer allocation.

記憶デバイス７１６は、分類器訓練コンポーネント７２８および予測エンジン訓練コンポーネント７３２により訓練され、さらなる特徴ベクトルを分類し、求められる値を予測するために使用されるようなエンジン７４０を含み得る。 The storage device 716 may include an engine 740 that is trained by the classifier training component 728 and the prediction engine training component 732 and used to classify further feature vectors and predict desired values.

本発明は、システム、方法、および／またはコンピュータプログラム製品であり得る。コンピュータプログラム製品は、本発明の態様を履行することをプロセッサに行わせるための、コンピュータ可読プログラム命令を有するコンピュータ可読記憶媒体を含み得る。 The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions for causing a processor to implement aspects of the present invention.

コンピュータ可読記憶媒体は、命令実行デバイスによる使用のための命令を保持および記憶し得る有形デバイスであり得る。コンピュータ可読記憶媒体は、例えば、電子記憶デバイス、磁気記憶デバイス、光学記憶デバイス、電磁記憶デバイス、半導体記憶デバイス、または、前述のものの任意の適した組み合わせであり得るものであるが、それらに限定されない。コンピュータ可読記憶媒体のより具体的な例の非網羅的な列挙は、ポータブルコンピュータディスケット、ハードディスク、ランダムアクセスメモリ（ＲＡＭ）、読み出し専用メモリ（ＲＯＭ）、消去可能プログラマブル読み出し専用メモリ（ＥＰＲＯＭまたはフラッシュメモリ）、静的ランダムアクセスメモリ（ＳＲＡＭ）、ポータブルコンパクトディスク読み出し専用メモリ（ＣＤ－ＲＯＭ）、デジタルバーサタイルディスク（ＤＶＤ）、メモリスティック、フロッピーディスク、記録された命令を有するパンチカードまたは溝内の隆起構造などの機械的にエンコードされたデバイス、および、前述のものの任意の適した組み合わせを含む。本明細書において使用される際のコンピュータ可読記憶媒体は、本質的には、無線波もしくは他の自由伝搬する電磁波、導波路もしくは他の送信媒体を通って伝搬する電磁波（例えば、ファイバ光学ケーブルを通って進む光パルス）、または、電線を通して送信される電気信号などの、一時的な信号であると解釈されるべきではない。 A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. A computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of computer-readable storage media includes portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disk (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or ridge-in-groove structures with recorded instructions, and any suitable combination of the foregoing. A computer-readable storage medium as used herein should not be construed as being essentially a transitory signal, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse traveling through a fiber optic cable), or an electrical signal transmitted over an electrical wire.

本明細書において説明されるコンピュータ可読プログラム命令は、コンピュータ可読記憶媒体からそれぞれのコンピューティング／処理デバイスに、または、ネットワーク、例えば、インターネット、ローカルエリアネットワーク、ワイドエリアネットワーク、および／もしくはワイヤレスネットワークを介して、外部コンピュータもしくは外部記憶デバイスにダウンロードされ得る。ネットワークは、銅送信ケーブル、光学送信ファイバ、ワイヤレス送信、ルータ、ファイアウォール、スイッチ、ゲートウェイコンピュータ、および／またはエッジサーバを含み得る。各コンピューティング／処理デバイス内のネットワークアダプタカードまたはネットワークインターフェースは、ネットワークからコンピュータ可読プログラム命令を受信し、それぞれのコンピューティング／処理デバイスの中のコンピュータ可読記憶媒体内での記憶のために、コンピュータ可読プログラム命令をフォワーディングする。 The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to the respective computing/processing device or to an external computer or storage device via a network, e.g., the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical transmission fiber, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, such as "C", C#, C++, Java, Python, or Smalltalk. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

本発明の態様は、本発明の実施形態による方法、装置（システム）、およびコンピュータプログラム製品の、フローチャート図および／またはブロック図を参照して、本明細書において説明される。フローチャート図および／またはブロック図の各ブロック、ならびに、フローチャート図および／またはブロック図におけるブロックの組み合わせが、コンピュータ可読プログラム命令により実現され得るということが理解されるであろう。 Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having the instructions stored therein comprises an article of manufacture including instructions that implement aspects of the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams.

コンピュータ可読プログラム命令はまた、コンピュータ、他のプログラマブル装置、または、他のデバイス上で実行する命令が、フローチャートおよび／またはブロック図の１つまたは複数のブロックにおいて指定した機能／動作を実行するよう、コンピュータで実現されるプロセスを生み出すように、一連の動作ステップが、コンピュータ、他のプログラマブル装置、または、他のデバイス上で遂行されるために、コンピュータ、他のプログラマブルデータ処理装置、または、他のデバイス上へとロードされ得る。 The computer-readable program instructions may also be loaded onto a computer, other programmable data processing device, or other device to produce a computer-implemented process in which a series of operational steps are performed on the computer, other programmable device, or other device such that the instructions executing on the computer, other programmable device, or other device perform the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams.

図におけるフローチャートおよびブロック図は、本発明の様々な実施形態によるシステム、方法、およびコンピュータプログラム製品の可能な実現形態のアーキテクチャ、機能、および動作を例解する。この点について、フローチャートまたはブロック図における各ブロックは、指定される論理機能を実現するための１つまたは複数の実行可能命令を含む、命令のモジュール、セグメント、または一部分を表し得る。いくつかの代替的な実現形態において、ブロックにおいて記される機能は、図において記される順序を外れて行われることがある。例えば、連続して示される２つのブロックは、実際には、実質的に並行的に実行されることがあり、または、ブロックは、時には、含まれる機能に依存して、逆の順序において実行されることがある。さらに、ブロック図および／またはフローチャート図の各ブロック、ならびに、ブロック図および／またはフローチャート図におけるブロックの組み合わせは、指定される機能もしくは動作を遂行する、または、専用ハードウェアおよびコンピュータ命令の組み合わせを履行する、専用ハードウェアベースのシステムにより実現され得るということが、指摘される。 The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagram may represent a module, segment, or portion of instructions, including one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions noted in the blocks may be performed out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. Furthermore, it is pointed out that each block of the block diagrams and/or flowchart diagrams, as well as combinations of blocks in the block diagrams and/or flowchart diagrams, may be realized by a dedicated hardware-based system that performs the specified functions or operations, or that implements a combination of dedicated hardware and computer instructions.

本明細書において使用される用語は、単に特定の実施形態を説明する目的のためのものであり、本発明について限定的であることを意図されない。本明細書において使用される際、単数形「ａ」、「ａｎ」、および「ｔｈｅ」は、文脈において別段に明確に指示しない限り、複数形もまた含むことを意図される。用語「含む（ｃｏｍｐｒｉｓｅｓ）」および／または「含む（ｃｏｍｐｒｉｓｉｎｇ）」は、本明細書において使用されるとき、説述される特徴、整数、ステップ、動作、要素、および／または構成要素の存在を明記するが、１つまたは複数の、他の特徴、整数、ステップ、動作、要素、構成要素、および／または、それらによる群の、存在または追加を排除しないということが、さらに理解されることになろう。 The terms used herein are merely for the purpose of describing particular embodiments and are not intended to be limiting of the present invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It will be further understood that the terms "comprises" and/or "comprising", as used herein, specify the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

下記の特許請求の範囲における、対応する構造、材料、動作、および、すべてのミーンズまたはステッププラスファンクション要素の均等物は、具体的に特許請求されるように、他の特許請求される要素と組み合わせて機能を遂行するための任意の構造、材料、または動作を含むことを意図される。本発明の説明は、図示および説明の目的のために提示されたが、網羅的であること、または、開示される形式で本発明に限定されることを意図されない。多くの修正および変形が、本発明の範囲および趣旨から逸脱することなく、当業者には明白になるであろう。実施形態は、本発明の原理、および、実用的な用途を最も良好に解説するために、ならびに、他の当業者が、企図される特定の使用に適するような様々な修正を伴う様々な実施形態のために本発明を理解することを可能にするために、選定および説明された。 The corresponding structures, materials, acts, and equivalents of all means or step-plus-function elements in the following claims are intended to include any structure, material, or act for performing a function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will become apparent to those skilled in the art without departing from the scope and spirit of the invention. The embodiments have been chosen and described to best explain the principles and practical application of the invention, and to enable others skilled in the art to understand the invention for various embodiments with various modifications as suited to the particular use contemplated.

Claims

1. A method for managing traffic in a communications network, comprising:
receiving a plurality of traffic units to be transmitted by a switch through a port, the port having an associated queue;
extracting features from the plurality of traffic units;
providing the features to a first engine to obtain a class for the plurality of traffic units;
obtaining, using a second engine associated with a traffic model for the class, an indication of a predicted traffic volume for the class for a future time and for a physical location of the switch transmitting the plurality of traffic units;
allocating a queue of a size corresponding to said indication of said predicted traffic volume; and
allocating at least one of said traffic units to a buffer.

The method of claim 1, wherein the traffic unit is a packet.

The method of claim 1, further comprising:
receiving a preliminary plurality of traffic units to be transmitted;
extracting features from each of the preliminary plurality of traffic units to obtain a plurality of feature vectors;
clustering the feature vectors into a plurality of classes; and
training the first engine to receive the plurality of traffic units and output the class from the plurality of classes.

The method of claim 1, further comprising: training the second engine on a subset of a plurality of feature vectors that are assigned to a particular class, such that the second engine is adapted to provide, by means of the traffic model, the indication of the predicted traffic volume for the class.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is the current time and τ is a time interval.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on available buffer sizes at the current time and at the future time.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on a number of congested queues of the priority class at the future time.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on a normalized dequeue rate of the queue at the future time.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on a priority of an application or site associated with the plurality of traffic units.

The method of claim 1, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on coefficients associated with the class.

The method of claim 1, wherein the traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval, and the traffic volume is predicted based on the physical location of the switch.

The method of claim 1, wherein the predicted traffic volume is predicted by the following formula:
\[
T^{i}_{c}(t, t+\tau, \mathrm{location}) = \alpha_{c} \cdot \frac{1}{N_{p}'(t, t+\tau, \mathrm{location})} \cdot \gamma^{i\,\prime}_{c}(t, t+\tau, \mathrm{location}) \cdot \bigl(B - B_{oc}'(t, t+\tau, \mathrm{location})\bigr)
\]
where:
$i$ is the index of the port;
$c$ is the class of the plurality of traffic units;
$t$ is the current time;
$\tau$ is the time difference to the future;
$\alpha_{c}$ is the coefficient assigned to class $c$;
location is the physical location of the switch within a data center;
$N_{p}'(t, t+\tau, \mathrm{location})$ is a variation or combination of $N_{p}(t, \mathrm{location})$, the number of congested queues of priority $p$ of the class at time $t$ for the switch, and $N_{p}(t+\tau, \mathrm{location})$, the number of congested queues of priority $p$ of the class at time $t+\tau$ for the switch;
$B - B_{oc}'(t, t+\tau, \mathrm{location})$ is a variation or combination of $B - B_{oc}(t, \mathrm{location})$, the remaining buffer at time $t$ for the switch, and $B - B_{oc}(t+\tau, \mathrm{location})$, the remaining buffer at time $t+\tau$ for the switch; and
$\gamma^{i\,\prime}_{c}(t, t+\tau, \mathrm{location})$ is a variation or combination of $\gamma^{i}_{c}(t, \mathrm{location})$, the normalized dequeue rate per port of the $i$-th queue of class $c$ at time $t$ for the switch, and $\gamma^{i}_{c}(t+\tau, \mathrm{location})$, the normalized dequeue rate per port of the $i$-th queue of class $c$ at time $t+\tau$ for the switch.

The method of claim 1, wherein the queue is not emptied in a first-in, first-out manner.

The method of claim 12, wherein multiple traffic units are dequeued from the queue simultaneously.

A computer program product comprising a non-transitory computer-readable storage medium bearing program instructions which, when read by a processor, cause the processor to perform:
receiving a plurality of traffic units to be transmitted by a switch through a port, the port having an associated queue;
extracting features from the plurality of traffic units;
providing the features to a first engine to obtain a class for the plurality of traffic units;
obtaining, using a second engine associated with a traffic model for the class, an indication of a predicted traffic volume for the class for a future time and for a physical location of the switch transmitting the plurality of traffic units;
allocating a queue of a size corresponding to said indication of said predicted traffic volume; and
allocating at least one of the traffic units to a buffer.

The computer program product of claim 15, wherein the traffic unit is a packet.

The computer program product of claim 15, wherein the program instructions further cause the processor to perform:
receiving a preliminary plurality of traffic units to be transmitted;
extracting features from each of the preliminary plurality of traffic units to obtain a plurality of feature vectors;
clustering the feature vectors into a plurality of classes; and
training the first engine to receive the plurality of traffic units and output the class from the plurality of classes.

The computer program product of claim 15, wherein the program instructions further cause the processor to perform: training the second engine on a subset of feature vectors that are assigned to a particular class, such that the second engine is adapted to provide, by means of the traffic model, the indication of the predicted traffic volume for the class.

The computer program product of claim 15, wherein the predicted traffic volume is predicted for a future time t+τ, where t is a current time and τ is a time interval.

20. The computer program product of claim 15, wherein the predicted traffic volume is predicted based on at least one item selected from the list consisting of:
the available buffer size at the current time and at said future time;
the number of congested queues of said class of priority at said future time;
a normalized dequeue rate for the queue at the future time;
a priority of an application or site associated with said plurality of traffic units;
a coefficient associated with the class; and
the physical location of the switch.