CN111858061A

CN111858061A - Distributed programmable switch resource capacity expansion method

Info

Publication number: CN111858061A
Application number: CN202010728948.4A
Authority: CN
Inventors: 张栋; 刘宏岩; 陈翔
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2020-07-27
Filing date: 2020-07-27
Publication date: 2020-10-30
Anticipated expiration: 2040-07-27
Also published as: CN111858061B

Abstract

The invention relates to a distributed programmable switch resource capacity expansion method. The method aggregates the physical resources of physical network nodes to meet the resource requirements of an application program, ensures correct operation of the program through a program placer and a runtime manager, and shields users from complex low-level details.

Description

Distributed programmable switch resource capacity expansion method

Technical Field

The invention relates to the field of programmable network equipment, in particular to a distributed programmable switch resource capacity expansion method.

Background

Common programmable network devices today fall into two categories: software and hardware. Software devices are typically virtual switches running on servers, such as Open vSwitch and BMv2. Their advantage is flexibility: the CPU can run network functions implemented in a variety of programming languages. Their disadvantage is that the CPU has limited computing power. Hardware devices are typically programmable switches. Their advantage is strong computing power: they can process packets at line rate, reaching the Tbps level. Their disadvantage is that hardware memory resources are limited, so the precision required by large-scale network functions cannot always be achieved. To improve performance, researchers have offloaded network functions to hardware, implementing them on programmable switches that combine ASIC chips with P4. A conventional ASIC-based switch provides excellent performance but is not programmable; combining the ASIC with P4 makes up for this deficiency. More and more network functions are being written as P4 programs and offloaded to programmable switches through the P4 compiler to meet their performance requirements. However, the memory resources (e.g., TCAM and SRAM) of programmable switch chips are limited. Experiments show that when resources are limited, the operating efficiency and accuracy of some complex network functions degrade. In this context, breaking the resource limits of a single programmable switch is the key to improving the quality and efficiency of network services.

Disclosure of Invention

The invention aims to provide a distributed programmable switch resource capacity expansion method that effectively solves the problem of insufficient resources on a single programmable switch.

In order to achieve the purpose, the technical scheme of the invention is as follows: a distributed programmable switch resource capacity expansion method comprises the following steps:

step S1, the user writes the required data plane program using the compile directive, and through this directive aggregates the resources of the designated distributed programmable switches, abstracting them into an OBS (One Big Switch);

step S2, the program placer divides the data plane program written by the user according to the compile directives the user has used, and assigns each resulting code segment to the corresponding distributed programmable switch selected by the user in step S1;

step S3, the program placer scans the packet processing logic (PPL) contained in the data plane program written by the user and, according to the scanned PPL, inserts modules into the divided code segments to preserve the original packet processing logic, so that the data plane program still correctly implements its original function after being divided;

step S4, the program placer compiles the final code segments, generates configuration files, and deploys them to the distributed programmable switches selected by the user in step S1;

and step S5, after successful deployment, the user performs two operations, rule issuing and statistics collection, through the runtime manager.

In an embodiment of the present invention, in step S5, for rule issuing, since the program is deployed to multiple distributed programmable switches in a decentralized manner, a rule needs to be installed on multiple switches at the same time, and the runtime manager generates an additional rule in order to maintain correctness of rule installation; for statistics collection, the runtime manager aggregates the information on the distributed programmable switches and provides the aggregated information to the user.

In an embodiment of the invention, the method adopts the P4 language and adds a new compile directive, @pragma sw [ID], on top of the P4 language, allowing the user to distribute the components of a data plane program to any underlying distributed programmable switch; @pragma sw [ID] specifies the association between a program component and an underlying distributed programmable switch, and the directive is placed before the component definition to indicate that the component is to be placed on the underlying distributed programmable switch with the identifier [ID].

In an embodiment of the present invention, in the step S2, the process of dividing the data plane program written by the user through the program placer includes: the program placer searches the user-written data plane program for the called compilation instructions, creates an empty code section for each found distributed programmable switch ID, and populates the code section with MAT code and stateful element slices that need to be deployed to the current distributed programmable switch ID.

In an embodiment of the present invention, in the step S3, the program placer identifies the affected PPLs in the data plane program and inserts modules into the divided code segments to ensure normal execution of the data plane program, where the affected PPLs fall into two types: (1) because different MATs may be deployed on different switches after program partitioning, the MAT dependencies defined in the input program may be disrupted, and the program placer must maintain and restore them; (2) if a stateful element is split into multiple slices after the data plane program is partitioned, a connection must be maintained between the slices so that the stateful element slices can communicate.
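The first type of affected PPL can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that MAT dependencies form a simple linear execution order and that the inserted module is represented by a name of the form send_to_switchN; the function name insert_forwarding_modules and the data layout are hypothetical and are not taken from the invention.

```python
from typing import Dict, List


def insert_forwarding_modules(
    mat_order: List[str], placement: Dict[str, int]
) -> Dict[int, List[str]]:
    """Build one code segment per switch from a linear MAT execution order.

    Assumption: whenever two consecutive MATs are placed on different
    switches, a forwarding module is appended after the first MAT so that
    packets continue to the switch hosting the next MAT.
    """
    segments: Dict[int, List[str]] = {}
    for i, mat in enumerate(mat_order):
        switch = placement[mat]
        segments.setdefault(switch, []).append(mat)
        if i + 1 < len(mat_order):
            next_switch = placement[mat_order[i + 1]]
            if next_switch != switch:
                # The dependency crosses a switch boundary: restore it.
                segments[switch].append(f"send_to_switch{next_switch}")
    return segments


# Hypothetical placement: MAT A on switch 1, MAT C on switch 2.
segments = insert_forwarding_modules(["A", "C"], {"A": 1, "C": 2})
print(segments)
```

In this sketch no module is inserted between MATs that share a switch, so only dependencies broken by the partition add overhead.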

In one embodiment of the invention, at the time of rule installation, the runtime manager generates additional rules to keep rule execution correct. For each rule r, the runtime manager identifies the MAT M_r to which r corresponds; the runtime manager then locates the underlying distributed programmable switch that hosts M_r and installs rule r on that switch. In addition, if M_r is affected by a MAT dependency, the runtime manager generates an additional rule and installs it on the target distributed programmable switch to maintain the MAT dependency. When statistics are collected, the runtime manager periodically collects statistics from the stateful element slices located on different distributed programmable switches, aggregates the statistics collected from each slice, and returns the aggregated result to the user. In this way, the runtime manager can use the collected statistics to respond quickly to a user's access to a stateful element without exposing underlying details.

Compared with the prior art, the invention has the following beneficial effects: the invention provides a distributed programmable switch resource capacity expansion method, which solves the problem of insufficient resources of the conventional single programmable switch by a method of aggregating a plurality of distributed programmable switch resources, effectively improves the execution efficiency of an application program, and has the advantages of simplicity, flexible realization and stronger practicability.

Drawings

Fig. 1 is a schematic structural diagram of a distributed programmable switch resource capacity expansion method.

Fig. 2 is a flowchart of a method for expanding the distributed programmable switch resource.

Detailed Description

The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.

The invention provides a distributed programmable switch resource capacity expansion method, which comprises the following steps:

step S5, after successful deployment, the user performs two operations, rule issuing and statistics collection, through the runtime manager: for rule issuing, because the program is deployed across multiple distributed programmable switches, a rule needs to be installed on multiple switches at the same time, and the runtime manager may generate additional rules to keep rule installation correct; for statistics collection, the runtime manager aggregates the information on the distributed programmable switches and provides the aggregated information to the user.

The following is a specific implementation of the present invention.

Referring to Fig. 1 and Fig. 2, the distributed programmable switch resource capacity expansion method of the present invention aggregates the physical resources of multiple underlying distributed programmable switches and abstracts them into an OBS (One Big Switch) that is presented to the user. This satisfies the resource requirements of most application programs and breaks the resource constraint imposed by a single programmable switch, while ensuring correct execution of the application and shielding the user from complex underlying details. The specific steps are as follows:

1. The user writes a data plane program using the compile directive provided by the system, and through this directive aggregates the designated physical switch resources, abstracting them into an OBS. The compile directive allows the user to distribute the components of the data plane program to any of the underlying programmable switches. The directive @pragma sw [ID] specifies the association between a program component (a MAT or a stateful counter) and an underlying switch. It is placed before the component definition and indicates that the component is to be placed on the underlying switch with the identifier [ID].

2. The system divides the program written by the user according to the compiling instruction used by the user. The program dividing process comprises the following steps: the program placer searches the input program for the called compilation instruction (which often corresponds to one or more switch IDs). For each switch ID found, the program placer creates an empty code section for it and populates that code section with the MAT code and stateful element slices that need to be deployed to the current switch ID.
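The partitioning in step 2 can be sketched in Python as follows. This is a toy model under stated assumptions, not the program placer itself: each component is assumed to be a single declaration line, the directive is assumed to be written as "@pragma sw" followed by one or more numeric IDs on the line before the component, and the names partition_program and PRAGMA are hypothetical.

```python
import re
from typing import Dict, List

# Assumed concrete syntax: "@pragma sw 1" or "@pragma sw 1 2".
PRAGMA = re.compile(r"@pragma\s+sw((?:\s+\d+)+)")


def partition_program(source: str) -> Dict[int, List[str]]:
    """Group component declarations by the switch IDs named in the
    directive that immediately precedes each of them."""
    segments: Dict[int, List[str]] = {}
    pending_ids: List[int] = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = PRAGMA.fullmatch(stripped)
        if match:
            # Remember the IDs until the component definition is seen.
            pending_ids = [int(tok) for tok in match.group(1).split()]
        elif pending_ids:
            for switch_id in pending_ids:
                # Create the (initially empty) code segment on first use.
                segments.setdefault(switch_id, []).append(stripped)
            pending_ids = []
    return segments


program = """
@pragma sw 1
table mat_a
@pragma sw 2
table mat_c
@pragma sw 1 2
register counters
"""
print(partition_program(program))
```

A component annotated with several IDs, such as the register above, appears in every listed segment; in this sketch that stands in for a stateful element being sliced across switches.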

3. The program placer scans all packet processing logic (PPL) contained in the user program. Based on the scanned PPL, it inserts modules into the divided code segments to preserve the original packet processing logic, so that the user program still correctly implements its original function after being divided. The program placer identifies the affected PPLs in the program and recovers them using the inserted modules. The affected PPLs fall into two categories: (1) Because different MATs may be deployed on different switches after program partitioning, the MAT dependencies (i.e., execution order) defined in the input program may be disrupted, and the program placer must maintain and restore them. (2) If a stateful element is split into multiple slices after program partitioning, a connection must be maintained between the slices so that the stateful element slices can communicate.

For example, suppose there is a dependency between MAT A and MAT C, where MAT A is placed on switch 1 and MAT C is placed on switch 2. To maintain this dependency, the program placer inserts a module "send_to_switch2" that directs packets matching MAT A to MAT C for further processing. As a second example, suppose a stateful element is divided into two slices, slice 1 and slice 2, where slice 1 retains the first four columns of the stateful element and slice 2 retains the last four columns. To complete update operations on the stateful element, the program placer adds a "Sender" module to slice 1 and a "Receiver" module to slice 2. "Sender" determines whether the index of the accessed element exceeds 3. If so, it passes the index value to "Receiver", which updates slice 2; otherwise, slice 1 is updated directly.
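The Sender and Receiver behaviour for a sliced stateful element can be sketched in Python as below. This is only an illustration under assumptions: a direct method call stands in for passing the index between switches, the element is modelled as a list of counters, and the offset, width, and delta parameters are hypothetical.

```python
from typing import List


class Receiver:
    """Slice 2: holds the last four columns (global indices 4 to 7)."""

    def __init__(self, offset: int, width: int) -> None:
        self.offset = offset
        self.cells: List[int] = [0] * width

    def update(self, index: int, delta: int) -> None:
        # Translate the global index into a local position in this slice.
        self.cells[index - self.offset] += delta


class Sender:
    """Slice 1: holds the first four columns (global indices 0 to 3)."""

    def __init__(self, width: int, receiver: Receiver) -> None:
        self.cells: List[int] = [0] * width
        self.receiver = receiver

    def update(self, index: int, delta: int = 1) -> None:
        if index > len(self.cells) - 1:
            # Index exceeds 3: hand the index over so slice 2 is updated.
            self.receiver.update(index, delta)
        else:
            self.cells[index] += delta


receiver = Receiver(offset=4, width=4)
sender = Sender(width=4, receiver=receiver)
for idx in [0, 3, 4, 7, 4]:
    sender.update(idx)
print(sender.cells, receiver.cells)
```

Every update enters through the Sender, so the caller addresses the element by its global index and never needs to know which slice holds a given column.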

4. After the above steps are completed, the code segment for each distributed switch has been generated. The program placer compiles the final code segments, generates configuration files, and deploys them to the corresponding physical switches.

5. After successful deployment, the user can use the runtime manager to perform two operations: rule issuing and statistics collection. For rule issuing, because the program is deployed across multiple physical switches, a rule may need to be installed on multiple switches at the same time. The runtime manager may also generate additional rules to keep rule installation correct. For each rule r, the runtime manager identifies the MAT M_r to which r corresponds. The runtime manager then locates the underlying switch that hosts M_r and installs rule r on that switch. Furthermore, if M_r is affected by a MAT dependency, the runtime manager generates an additional rule and installs it on the target switch to maintain the MAT dependency. For statistics collection, the runtime manager periodically collects statistics from the stateful element slices located on different switches. It then aggregates the statistics collected from each slice and returns the aggregated result to the user. In this way, the runtime manager can use the collected statistics to respond quickly to a user's access to a stateful element without exposing underlying details.
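The two runtime operations can be sketched in Python as follows. The sketch makes several assumptions that the description does not fix: rules are plain strings, the additional rule is a forwarding entry installed on the switch hosting M_r, and slices are column ranges that can be concatenated in switch ID order. The names RuntimeManager, install_rule, and collect_statistics are hypothetical.

```python
from typing import Dict, List


class RuntimeManager:
    def __init__(
        self, mat_location: Dict[str, int], dependencies: Dict[str, str]
    ) -> None:
        self.mat_location = mat_location  # MAT name -> hosting switch ID
        self.dependencies = dependencies  # MAT name -> dependent MAT name
        self.installed: Dict[int, List[str]] = {}

    def install_rule(self, mat: str, rule: str) -> None:
        """Install rule r on the switch hosting M_r, plus an extra rule
        if M_r has a dependency that crosses a switch boundary."""
        switch = self.mat_location[mat]
        self.installed.setdefault(switch, []).append(rule)
        dependent = self.dependencies.get(mat)
        if dependent is not None:
            next_switch = self.mat_location[dependent]
            if next_switch != switch:
                # Assumed form of the additional rule: steer matching
                # packets on to the switch hosting the dependent MAT.
                self.installed[switch].append(
                    f"{rule}:send_to_switch{next_switch}"
                )


def collect_statistics(slices: Dict[int, List[int]]) -> List[int]:
    """Merge per-switch slices into one view, ordered by switch ID."""
    merged: List[int] = []
    for switch in sorted(slices):
        merged.extend(slices[switch])
    return merged


manager = RuntimeManager({"A": 1, "C": 2}, {"A": "C"})
manager.install_rule("A", "r1")
manager.install_rule("C", "r2")
print(manager.installed)
print(collect_statistics({2: [2, 0, 0, 1], 1: [1, 0, 0, 1]}))
```

The user issues one rule per MAT and reads one merged list of counters; which switch hosts which MAT or slice stays inside the manager.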

The above are preferred embodiments of the present invention. All changes made according to the technical scheme of the present invention whose functional effects do not exceed the scope of that technical scheme fall within the protection scope of the present invention.

Claims

1. A distributed programmable switch resource capacity expansion method is characterized by comprising the following steps:

2. The method for expanding the capacity of the distributed programmable switch resource of claim 1, wherein in step S5, for rule issuing, since the program is deployed to multiple distributed programmable switches in a decentralized manner, a rule needs to be installed on multiple switches at the same time, and in order to maintain the correctness of rule installation, the runtime manager generates an additional rule; for statistics collection, the runtime manager aggregates the information on the distributed programmable switches and provides the aggregated information to the user.

3. The method for expanding the capacity of the distributed programmable switch resources according to claim 1, wherein the method adopts the P4 language and adds a compile directive @pragma sw [ID] on top of the P4 language to allow a user to distribute the components of the data plane program to any underlying distributed programmable switch; the @pragma sw [ID] directive specifies the association between a program component and an underlying distributed programmable switch, and the directive is placed before the component definition to indicate that the component is to be placed on the underlying distributed programmable switch with the identifier [ID].

4. The method for expanding the capacity of the distributed programmable switch resource according to claim 1, wherein in step S2, the process of partitioning the user-written data plane program by the program placer includes: the program placer searches the user-written data plane program for the called compilation instructions, creates an empty code section for each found distributed programmable switch ID, and populates the code section with MAT code and stateful element slices that need to be deployed to the current distributed programmable switch ID.

5. The method for resource expansion of a distributed programmable switch according to claim 1, wherein in step S3, the program placer identifies the affected PPLs in the data plane program and inserts modules into the partitioned code segments to ensure normal execution of the data plane program, wherein the affected PPLs fall into two types: (1) because different MATs may be deployed on different switches after program partitioning, the MAT dependencies defined in the input program may be disrupted, and the program placer must maintain and restore them; (2) if a stateful element is split into multiple slices after the data plane program is partitioned, a connection must be maintained between the slices so that the stateful element slices can communicate.

6. The method of claim 2, wherein, during rule installation, the runtime manager generates additional rules to keep rule execution correct; for each rule r, the runtime manager identifies the MAT M_r to which r corresponds, locates the underlying distributed programmable switch that hosts M_r, and installs rule r on that distributed programmable switch; in addition, if M_r is affected by a MAT dependency, the runtime manager generates an additional rule and installs it on the target distributed programmable switch to maintain the MAT dependency; when statistics are collected, the runtime manager periodically collects statistics from the stateful element slices located on different distributed programmable switches, aggregates the statistics collected from each slice, and returns the aggregated result to the user; in this way, the runtime manager can use the collected statistics to respond quickly to a user's access to a stateful element without exposing underlying details.