CN113518012B - Distributed cooperative flow simulation environment construction method and system - Google Patents

Distributed cooperative flow simulation environment construction method and system

Info

Publication number
CN113518012B
Authority
CN
China
Prior art keywords
task
module
flow
node
state monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111058895.0A
Other languages
Chinese (zh)
Other versions
CN113518012A (en)
Inventor
梁元
邱启仓
姚少峰
肖戈扬
邹涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202111058895.0A
Publication of CN113518012A
Application granted
Publication of CN113518012B
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Abstract

The invention discloses a method and a system for constructing a distributed cooperative flow (Coflow) simulation environment. Based on the directed acyclic graph information of distributed cluster computation, the network environment topology information and the statistical distribution data of cooperative flows, the invention generates simulated cooperative flows that match the traffic characteristics of real computation. The simulation environment constructed by the method allows a user to define the message structure, the network structure and the operation characteristics, and provides an accurate and controllable experimental environment for Coflow scheduling research. The method shields the randomness and complexity of the communication stage of computing tasks in a real computing cluster environment, can simulate the communication environment and replay traffic scenarios on demand, and effectively supports experiments and tests in Coflow scheduling research.

Description

Distributed cooperative flow simulation environment construction method and system
Technical Field
The invention belongs to the technical field of computer network flow simulation, and particularly relates to a distributed cooperative flow simulation environment construction method and system.
Background
New-generation information technology is being deeply integrated with the manufacturing industry, and more and more Internet of Things sensors are embedded in industrial production lines to collect data from the various devices in each link of the manufacturing process. With the development of digital twin technology and the continuous accumulation of industrial data, the development and application of industrial big data has become an inevitable trend. Most computational analysis tasks require cluster resources, and these tasks often involve a large amount of data exchange. A typical distributed cluster application breaks a complex computing task into distinct computing stages that cooperate with and depend on one another, and dispatches them to different computing devices for computation and data distribution. According to statistics, in typical distributed cluster computing, 50% to 70% of task time is occupied by data exchange in the network. Therefore, leaving aside limits on computing resources such as the CPUs of cluster computing nodes, optimizing the exchange and transmission of data in the network can significantly improve the efficiency of computing tasks.
Network traffic in a distributed cluster computing scenario is characterized by computing stages that depend on or cooperate with one another. A computing stage generally involves a series of data flows from different computing devices that are interleaved on the communication devices, and a group of such data flows sharing a common optimization goal is referred to as a Coflow. Clearly, traditional network scheduling methods designed for a single data flow cannot handle the traffic characteristics of a distributed cluster scenario. Coflow scheduling research has therefore emerged, and intelligent scheduling models are usually trained with two kinds of machine learning methods, supervised and unsupervised. However, the communication components of a typical distributed cluster computing framework cannot generate data messages carrying Coflow identifiers, and the framework dynamically determines the task execution process according to the cluster resource state, so it is difficult to obtain a controllable, repeatable scenario for training and iterating a scheduling model.
At present, Coflow scheduling research mostly trains models either by capturing and replaying packets from a distributed cluster network or by generating Coflow statistics through simulation; a complete set of techniques and methods for constructing and simulating cooperative flows in the network communication stage of a distributed cluster computing scenario is lacking.
Disclosure of Invention
The invention aims to provide a method and a system for constructing a distributed cooperative flow simulation environment that address the defects of the prior art. It is intended to fill the technical gap in cooperative flow construction and simulation for distributed cluster computing scenarios in Coflow scheduling research, and to improve the efficiency and accuracy of tests in that research.
The invention is realized by the following technical scheme: the invention discloses a distributed cooperative flow simulation environment construction system which comprises a user system, a management node module, an execution node cluster and a basic network environment-programmable data platform.
The user system consists of a DAG definition module, a network topology structure definition module, a message structure definition module and a state monitoring module; the DAG definition module completes the definition of a calculation stage and a communication stage by defining nodes and edges of the directed acyclic graph DAG and outputs a directed acyclic graph DAG file; the network topology structure definition module defines terminal nodes in a simulation environment and outputs a network topology structure file; the message structure definition module selects a transmission layer protocol and defines a flow message header to output a self-defined message structure file; the state monitoring module acquires state monitoring data output by the management node module in real time and displays the state monitoring data to a user; and the defined directed acyclic graph DAG file, the network topology structure file and the message structure file form a cooperative flow construction configuration file.
The management node module consists of a task management module and a global task state monitoring module; the task management module checks the integrity and the effectiveness of a user-defined cooperative traffic construction configuration file, sends out a traffic construction task and starts a task process; the global task state monitoring module collects running state monitoring data of a plurality of execution nodes through the management node and periodically reports the running state monitoring data to the state monitoring module of the user system.
The execution node cluster consists of a task execution module and a node state monitoring module; the task execution module receives and executes the flow construction task sent by the task management module; the node state monitoring module counts the traffic transceiving conditions at the message sending terminal and the receiving terminal through the execution node, and periodically reports the execution conditions of the traffic construction task to the management node.
The basic network environment-programmable data platform is composed of programmable switching equipment and is used for forwarding the customized and reconstructed cooperative flow message.
The invention also discloses a method for constructing a distributed cooperative flow simulation environment, which comprises the following steps:
(1) defining a directed acyclic graph DAG, a network topology structure and a message structure of a collaborative flow construction simulation environment, and outputting directed acyclic graph DAG files, network topology structure files and message structure files; and the defined directed acyclic graph DAG file, the network topology structure file and the message structure file form a cooperative flow construction configuration file.
(2) The management node reads the cooperative flow construction configuration file, starts a cooperative flow construction task, acquires the running state monitoring data of the execution node, and periodically reports the running state monitoring data to the state monitoring module of the user system.
(3) The execution node executes the cooperative flow construction task: it receives commands from the management node, performs customized reconstruction of the cooperative flow message, outputs the customized and reconstructed cooperative flow message, monitors its running state, and sends the state to the management node module.
(4) The basic network environment-programmable data platform receives and forwards the customized and reconstructed cooperative flow message output in step (3).
Further, the step (1) includes the sub-steps of:
(1.1) the user defines a directed acyclic graph DAG structure through the DAG definition module in the user system and outputs a directed acyclic graph DAG file; the definition of the directed acyclic graph DAG comprises the definitions of computing nodes and communication processes, wherein the DAG nodes represent computing stages and the DAG directed edges represent communication stages;
(1.2) a user defines a terminal node in a simulation environment through a network topology structure definition module in a user system and outputs a network topology structure file; the terminal nodes comprise a sending terminal and a receiving terminal of the cooperative flow;
(1.3) determining a transmission layer protocol and a flow self-defined message structure of the cooperative flow message by a user through a message structure definition module in a user system, and outputting a message structure file;
(1.4) after finishing the directed acyclic graph DAG structure, the network topology structure and the message structure defined in steps (1.1) to (1.3), the user sends a start command for the cooperative flow simulation task to the management node. After the task is started, the user enters the task state monitoring module to obtain the task execution state.
Further, the step (2) includes the sub-steps of:
(2.1) reading the cooperative flow construction configuration file through a task management module of the management node, and verifying the integrity and effectiveness of parameters in the cooperative flow construction configuration file;
(2.2) detecting the running state of each execution node through a global task state monitoring module of the management node;
(2.3) the task management module starts a cooperative traffic construction task and sends corresponding traffic construction task parameters to each execution node at different task stages through a communication mechanism of each execution node;
(2.4) the global task state monitoring module receives the task state monitoring data returned by the execution nodes, summarizes and analyzes the data to obtain the global state of task execution, and periodically uploads the global state to the state monitoring module of the user system.
Further, the step (3) includes the sub-steps of:
(3.1) after receiving the state detection signal of the management node, the node state monitoring module in the execution node cluster feeds back the running state of the node to the management node;
(3.2) after the task execution module in the execution node cluster receives a flow construction task sent by the management node, it parses the flow construction task parameters, constructs messages according to the flow size and rate specified by the parameters, takes over the network card through a data plane development kit, performs customized reconstruction of the cooperative flow message according to the message structure defined by the user, and sends the cooperative flow message to the basic network environment;
(3.3) the task execution module in the execution node cluster takes over the network card through the data plane development kit, collects statistics on the received cooperative flow messages, and discards the received messages once the statistics are complete;
(3.4) the node state monitoring module in the execution node cluster periodically sends the running state of the execution node to the global task state monitoring module of the management node for aggregation.
The invention has the following beneficial effects: (1) The invention provides a rich and controllable test environment for Coflow scheduling research, in which researchers do not need to care about the computing stage of the cluster computing scenario. The invention decouples the computing stage from the communication stage in distributed cluster computing: the user only needs to define a directed acyclic graph (DAG) representing the relationship between the computing stages and communication stages together with the traffic characteristics of each computing stage, define the network topology of the distributed cluster terminal nodes, and define the statistical characteristics of a single Coflow, and cooperative flows conforming to those statistical characteristics can then be obtained quickly, stably and repeatedly. (2) The simulation environment constructed by the method allows a user to design the message structure and monitor the state of the Coflow sending and receiving tasks, so that the optimization effects of different scheduling strategies in the same network scenario can be compared visually. (3) The method supports defining the cooperative flow message structure according to the needs of the Coflow scheduling strategy and, together with the programmable switching devices, opens up more possibilities for Coflow scheduling research. (4) The method shields the randomness and complexity of the communication stage of computing tasks in a real computing cluster environment, can simulate the communication environment and replay traffic scenarios on demand, and effectively supports experiments and tests in Coflow scheduling research.
Drawings
FIG. 1 is a block diagram of the architecture of a distributed collaborative traffic construction simulation environment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
As shown in fig. 1, the present invention discloses a method and a system for constructing a distributed cooperative traffic simulation environment, including a user system, a management node module, an execution node cluster, and a basic network environment-programmable data platform. The user system is composed of a DAG definition module, a network topology structure definition module, a message structure definition module and a state monitoring module.
The user system preferably uses a graphical interface. The DAG definition module completes the definition of the computing stages and communication stages by defining the nodes and edges of a directed acyclic graph (DAG), and outputs a directed acyclic graph DAG file; after the user completes the DAG definition, a recognizable adjacency list data structure is output and stored.
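As a minimal sketch of the adjacency list output described above, the Python fragment below stores a small DAG as a dictionary and checks that it really is acyclic. The stage names, the dictionary layout and both helper functions are assumptions made for illustration; the patent does not specify the file format or the validation algorithm (Kahn's algorithm is used here).

```python
from collections import deque

# Hypothetical DAG file content: each key is a computing stage, each value lists
# the downstream stages reached over a directed edge (one edge = one communication stage).
dag = {
    "map_1": ["reduce_1"],
    "map_2": ["reduce_1"],
    "reduce_1": ["sink"],
    "sink": [],
}


def topological_order(adjacency):
    """Return the stages in dependency order; raise ValueError if the graph is not a DAG."""
    in_degree = {node: 0 for node in adjacency}
    for targets in adjacency.values():
        for target in targets:
            if target not in in_degree:
                raise ValueError(f"edge points to undefined stage: {target}")
            in_degree[target] += 1

    # Kahn's algorithm: repeatedly remove stages that have no unmet dependency.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in adjacency[node]:
            in_degree[target] -= 1
            if in_degree[target] == 0:
                ready.append(target)

    if len(order) != len(adjacency):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


def leaf_stages(adjacency):
    """Stages with no outgoing edge; they send no traffic."""
    return [node for node, targets in adjacency.items() if not targets]


print(topological_order(dag))
print(leaf_stages(dag))
```

A dependency order of this kind is also one plausible basis for deciding which communication stage to start next, although the source does not say how the management node derives it.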
The network topology structure definition module defines the terminal nodes in the simulation environment, which includes selecting the message sending terminals and the message receiving terminals, and outputs a network topology structure file. A terminal can be multiplexed as both a message sending terminal and a message receiving terminal, that is, a multiplexed terminal can send and receive messages at the same time. After the user finishes defining the network topology structure, a dictionary data structure that the management node can recognize is output and stored.
The message structure definition module selects a transport layer protocol and defines a custom additional message structure; the message structure file is produced by adding the variable name and size of each message header field. After the user finishes defining the message structure, a dictionary data structure that the management node can recognize is output and stored.
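To illustrate how a dictionary of header field names and sizes could drive message construction, the sketch below turns such a dictionary into a binary header with Python's standard `struct` module. The field names (`coflow_id`, `flow_id`, `send_ts_us`), the dictionary layout and the restriction to unsigned integers in network byte order are all assumptions for the example, not details given in the source.

```python
import struct

# Hypothetical message structure file content: transport protocol plus
# header field name -> field size in bytes, in wire order.
message_structure = {
    "transport": "UDP",
    "header_fields": {"coflow_id": 4, "flow_id": 4, "send_ts_us": 8},
}

# Map a field size in bytes to a struct code (unsigned integers only in this sketch).
_SIZE_TO_CODE = {1: "B", 2: "H", 4: "I", 8: "Q"}


def header_format(fields):
    """Build a struct format string; '!' selects network byte order without padding."""
    return "!" + "".join(_SIZE_TO_CODE[size] for size in fields.values())


def pack_header(fields, values):
    """Serialize the header values in the order the fields were defined."""
    return struct.pack(header_format(fields), *(values[name] for name in fields))


def unpack_header(fields, data):
    """Parse the header from the front of a received payload."""
    fmt = header_format(fields)
    return dict(zip(fields, struct.unpack(fmt, data[: struct.calcsize(fmt)])))


fields = message_structure["header_fields"]
wire = pack_header(fields, {"coflow_id": 7, "flow_id": 2, "send_ts_us": 1_700_000_000_000_000})
print(len(wire), unpack_header(fields, wire + b"payload"))
```

Keeping the header description in data rather than in code means a different scheduling strategy can add or resize fields without touching the sender or receiver logic.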
And the directed acyclic graph DAG file, the network topology structure file and the message structure file form a cooperative flow construction configuration file.
The state monitoring module acquires, in real time, the state monitoring data output by the management node module, which comprises the running state and alarm information, and displays it to the user in a visual form.
The management node module consists of a task management module and a global task state monitoring module. The task management module first checks the integrity and validity of the user-defined cooperative flow construction configuration file. If the check finds the configuration file abnormal, alarm information for the abnormal configuration is sent to the state monitoring module of the user system and the task is terminated; if the check result is normal, the flow construction task is issued, the task process is started, and the state of each execution node in the execution node cluster is detected through a heartbeat mechanism. If an execution node is abnormal, alarm information for the abnormal execution node is output to the state monitoring module of the user system and the task is terminated; if the execution nodes are normal, they are called to execute according to the cooperative flow construction configuration file. The global task state monitoring module collects, through the management node, the task execution states reported by the execution nodes and, after data analysis, periodically reports the global task execution state to the state monitoring module of the user system.
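The configuration check performed by the task management module might look like the following sketch. The section names (`dag`, `topology`, `message_structure`), the `senders` and `receivers` keys and the alarm texts are hypothetical; the source only states that integrity and validity are checked and that an abnormal result raises an alarm and terminates the task.

```python
def check_config(config):
    """Return a list of alarm messages; an empty list means the configuration passed."""
    alarms = []

    # Integrity: all three user-defined parts must be present.
    for section in ("dag", "topology", "message_structure"):
        if section not in config:
            alarms.append(f"missing section: {section}")

    # Validity: the sender and receiver lists must not be empty.
    topology = config.get("topology", {})
    for role in ("senders", "receivers"):
        if not topology.get(role):
            alarms.append(f"topology has no {role}")

    # Validity: the DAG must define at least one computing stage.
    if "dag" in config and not config["dag"]:
        alarms.append("DAG defines no computing stage")

    return alarms


config = {
    "dag": {"map_1": ["reduce_1"], "reduce_1": []},
    "topology": {"senders": ["node-a"], "receivers": []},
    "message_structure": {"transport": "UDP", "header_fields": {"coflow_id": 4}},
}
alarms = check_config(config)
if alarms:
    print("task terminated:", alarms)  # would be forwarded to the user system
```

Returning every problem found, rather than stopping at the first, lets the user correct the configuration in one pass.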
The execution node cluster consists of a task execution module and a node state monitoring module. The task execution module continuously listens on the communication port, responds to state detection by the management node, and receives the flow construction tasks issued by the management node. After receiving a flow construction task, the execution node sends a confirmation message to the management node and starts the task. The execution node constructs a message using the transport layer protocol selected by the user and passes the user-defined message header to the network card as the payload above the transport layer. A data plane development kit (including but not limited to DPDK, the Data Plane Development Kit) takes over the network card, modifies the message according to the content of that payload, adds the user-defined message header, and sends the customized and reconstructed cooperative flow message to the basic network environment-programmable data platform. The node state monitoring module collects statistics on traffic reception at the message receiving terminal through the execution node; the reception parameters include but are not limited to flow size, number of sub-flows and completion time, and the module periodically reports the execution status of the flow construction task to the management node.
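The receive-side statistics named above (flow size, number of sub-flows, completion time) can be accumulated per cooperative flow as in the sketch below. It is plain Python and does not touch DPDK; the class name, the `record` signature and the choice of "last receive time minus first send time" as completion time are assumptions for illustration.

```python
from collections import defaultdict


class ReceiveStats:
    """Accumulates per-Coflow statistics from received message headers."""

    def __init__(self):
        self._bytes = defaultdict(int)   # coflow_id -> bytes received
        self._flows = defaultdict(set)   # coflow_id -> sub-flow ids seen
        self._first_send = {}            # coflow_id -> earliest send timestamp
        self._last_recv = {}             # coflow_id -> latest receive timestamp

    def record(self, coflow_id, flow_id, size, send_ts, recv_ts):
        """Count one received message; the message itself can then be discarded."""
        self._bytes[coflow_id] += size
        self._flows[coflow_id].add(flow_id)
        self._first_send[coflow_id] = min(send_ts, self._first_send.get(coflow_id, send_ts))
        self._last_recv[coflow_id] = max(recv_ts, self._last_recv.get(coflow_id, recv_ts))

    def report(self):
        """Snapshot suitable for a periodic report to the management node."""
        return {
            cid: {
                "bytes": self._bytes[cid],
                "sub_flows": len(self._flows[cid]),
                "completion_time": self._last_recv[cid] - self._first_send[cid],
            }
            for cid in self._bytes
        }


stats = ReceiveStats()
stats.record(coflow_id=1, flow_id=0, size=1000, send_ts=10, recv_ts=15)
stats.record(coflow_id=1, flow_id=1, size=500, send_ts=11, recv_ts=20)
print(stats.report())
```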
The basic network environment-programmable data platform is composed of programmable switching equipment and is used for forwarding the customized and reconstructed cooperative flow message.
The method for constructing a distributed cooperative flow simulation environment provided by the invention specifically comprises the following steps:
(1) Define the cooperative flow construction simulation environment, including the directed acyclic graph DAG, the network topology structure and the message structure; this specifically comprises the following substeps:
(1.1) The user defines a directed acyclic graph DAG structure through the DAG definition module in the user system and outputs a directed acyclic graph DAG file; the DAG definition comprises the definitions of computing nodes and communication processes. A DAG node represents a computing stage; when defining a computing stage, the user sets the flow size, the number of sub-flows and the sub-flow length distribution, and if the node is a leaf node (out-degree of zero), no traffic characteristics need to be defined. The DAG directed edges represent communication stages and characterize the dependencies between computing stages. After defining the computing stages, the user connects them with DAG directed edges to determine the dependencies among the different computing stages. After the user completes the DAG definition, a recognizable adjacency list data structure is output and stored.
(1.2) The user sets the sending terminals and receiving terminals of the cooperative flow through the network topology structure definition module in the user system, and outputs a network topology structure file. In the user-defined network topology, the basic network formed by the programmable switching devices is abstracted as a single non-blocking switch, so this module only needs to consider the topology of the traffic sending terminal nodes and the traffic receiving terminal nodes.
(1.3) The user determines the transport layer protocol and the custom Coflow message structure of the cooperative flow message through the message structure definition module in the user system, and outputs a message structure file. In the custom message structure definition, the user sets the Coflow message header, including the name, size and type of each custom field. By default, the cooperative flow message structure adds a Coflow message header on top of a TCP or UDP message, including but not limited to coflow_id, which records the cooperative flow identifier, flow_id, which records the sub-flow identifier, and the sending timestamp; the identifiers serve as the basis for cooperative flow scheduling, and the timestamp is used to calculate the time consumed by the communication stage.
(1.4) After the DAG structure, the network topology structure and the message structure are defined, the user can send a start command for the cooperative flow simulation task to the management node. After the simulation task is started, the user enters the task state monitoring module to obtain the task execution state.
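Step (1.1) asks the user for a flow size, a number of sub-flows and a sub-flow length distribution. One way these three settings could be turned into concrete sub-flow sizes is sketched below. The exponential distribution, the seeded generator and the function name are assumptions; the source does not prescribe a distribution or a sampling method. Fixing the seed is what makes a scenario repeatable across runs.

```python
import random


def split_flow(total_bytes, sub_flow_count, seed, draw=lambda rng: rng.expovariate(1.0)):
    """Split total_bytes into sub_flow_count sizes whose shape follows the draw() distribution.

    draw is a hypothetical hook: any function that takes a random.Random and
    returns a positive weight can stand in for the user's length distribution.
    """
    rng = random.Random(seed)  # same seed -> same sub-flow sizes
    weights = [draw(rng) for _ in range(sub_flow_count)]
    scale = total_bytes / sum(weights)
    sizes = [int(weight * scale) for weight in weights]
    sizes[-1] += total_bytes - sum(sizes)  # give the rounding remainder to the last sub-flow
    return sizes


sizes = split_flow(total_bytes=10_000, sub_flow_count=4, seed=7)
print(sizes, sum(sizes))
```

Passing a different `draw`, for example `lambda rng: rng.uniform(0.5, 1.5)`, changes the length distribution without changing the total.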
(2) The management node reads the cooperative flow construction configuration file, starts a cooperative flow construction task, acquires the running state monitoring data of the execution node, and periodically reports the running state monitoring data to a state monitoring module of the user system; the method specifically comprises the following substeps:
(2.1) The task management module of the management node reads the cooperative flow construction configuration file and checks the integrity and validity of its parameters, including but not limited to checking whether the lists of sending nodes and receiving nodes in the network topology structure parameters are empty; an empty list fails the check. If the check fails, alarm information is returned to the user system and the task is terminated; if the check result is normal, the flow construction task is issued, the task process is started, and the state of each execution node in the execution node cluster is detected through a heartbeat mechanism.
(2.2) The global task state monitoring module of the management node detects the running state of each execution node; if the running state of an execution node is abnormal, alarm information is returned to the user system and the task is terminated. Specifically, a heartbeat mechanism is used to confirm whether the execution node is working normally and to check the connection state. If the execution node does not return a confirmation message, detection continues under a policy that sets a maximum retry threshold and a heartbeat message interval; if detection still fails, an alarm signal is sent to the user system and the task is terminated.
(2.3) The task management module starts the cooperative flow construction task and sends the corresponding flow construction task parameters to each execution node at the different task stages through the communication mechanism of each execution node.
(2.4) The global task state monitoring module receives the task state monitoring data returned by the execution nodes, summarizes and analyzes the data to obtain the global state of task execution, and periodically uploads the global state to the state monitoring module of the user system.
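The heartbeat policy in step (2.2), with its maximum retry threshold and heartbeat interval, can be sketched as a small retry loop. The `probe` callable stands in for whatever transport the management node actually uses to send a heartbeat and wait for the confirmation; it, the default values and the function names are assumptions for illustration.

```python
import time


def node_is_alive(probe, max_retries=3, interval_s=1.0, sleep=time.sleep):
    """Return True as soon as one heartbeat is confirmed, False after max_retries failures.

    probe is a hypothetical callable that sends one heartbeat and returns True
    if the execution node replied with a confirmation message.
    """
    for attempt in range(max_retries):
        if probe():
            return True
        if attempt < max_retries - 1:
            sleep(interval_s)  # wait one heartbeat interval before retrying
    return False


def failed_nodes(probes, **kwargs):
    """Names of nodes that failed detection; a non-empty result means alarm and terminate."""
    return [name for name, probe in probes.items() if not node_is_alive(probe, **kwargs)]


# Usage with stand-in probes and no real waiting.
probes = {"exec-1": lambda: True, "exec-2": lambda: False}
print(failed_nodes(probes, max_retries=2, sleep=lambda seconds: None))
```

Injecting `sleep` keeps the retry logic testable without real delays.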
(3) The execution node executes the cooperative flow construction task: it receives commands from the management node, performs customized reconstruction of the cooperative flow message, outputs the customized and reconstructed cooperative flow message, monitors its running state, and sends the state to the management node module. This specifically comprises the following substeps:
(3.1) The node state monitoring module in the execution node cluster keeps listening on the detection port; after receiving the state detection signal of the management node, it feeds back its own running state to the management node within a specified time period.
(3.2) After receiving the flow construction task sent by the management node, the task execution module in the execution node cluster parses the task parameters carried in the task command, which include but are not limited to the message structure, the flow size, the sub-flow size distribution, and the source and destination addresses of the cooperative flow messages to be constructed. The message construction process is then started according to the transport layer protocol type, the custom message structure (that is, the payload content), the flow size and the sub-flow length distribution. The execution node passes the user-defined message structure to the network card as the payload of a TCP or UDP message; a data plane development kit (including but not limited to DPDK) takes over the network card, adjusts the message structure, and sends the message to the programmable data platform.
(3.3) The task execution module in the execution node cluster takes over the network card through a data plane development kit (including but not limited to DPDK), collects multi-dimensional statistics on the received cooperative flow messages, including the number and completion time of the received Coflows and their sub-flows, and discards the received messages once the statistics are complete.
(3.4) The node state monitoring module in the execution node cluster periodically assembles the running state, the task execution progress and the statistical data of the execution node and sends the assembled data to the global task state monitoring module of the management node for aggregation. To ensure the integrity of the reported data, the execution node sends an integrity check code together with the monitoring data so that the management node can confirm that the data is complete and valid.
The management node determines the tasks of each execution node in the different task stages according to the task parameters issued by the user, so the execution nodes do not need to obtain all the information of the cooperative flow construction task; after entering the next stage, the management node actively calls the execution nodes to carry out the specific traffic construction or traffic receiving tasks.
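The integrity check code attached to monitoring data in step (3.4) is not specified further in the source. As one possible realization, the sketch below serializes a report as JSON and appends a CRC-32 computed with Python's standard `zlib` module; the framing (body followed by a 4-byte big-endian checksum) and the function names are assumptions.

```python
import json
import zlib


def seal_report(report):
    """Execution node side: serialize the report and append a CRC-32 check code."""
    body = json.dumps(report, sort_keys=True).encode("utf-8")
    return body + zlib.crc32(body).to_bytes(4, "big")


def open_report(frame):
    """Management node side: verify the check code, then decode the report."""
    body, checksum = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(body) != checksum:
        raise ValueError("monitoring report failed integrity check")
    return json.loads(body)


frame = seal_report({"node": "exec-1", "state": "running", "progress": 0.4})
print(open_report(frame))
```

A CRC only guards against accidental corruption; if tampering were a concern, a keyed digest such as HMAC would be the usual substitute.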
(4) The basic network environment-programmable data platform receives and forwards the customized and reconstructed cooperative flow message output in step (3).
In summary, the invention provides a rich and controllable test environment for Coflow scheduling research, in which researchers do not need to care about the computing stage of the cluster computing scenario. The invention decouples the computing stage from the communication stage in distributed cluster computing: the user only needs to define a directed acyclic graph (DAG) representing the relationship between the computing stages and communication stages together with the traffic characteristics of each computing stage, define the network topology of the distributed cluster terminal nodes, and define the statistical characteristics of a single Coflow, and cooperative flows conforming to those statistical characteristics can then be obtained quickly, stably and repeatedly. The simulation environment constructed by the method allows a user to design the message structure and monitor the state of the Coflow sending and receiving tasks, so that the optimization effects of different scheduling strategies in the same network scenario can be compared visually. The method supports defining the cooperative flow message structure according to the needs of the Coflow scheduling strategy and, together with the programmable switching devices, opens up more possibilities for Coflow scheduling research. The method shields the randomness and complexity of the communication stage of computing tasks in a real computing cluster environment, can simulate the communication environment and replay traffic scenarios on demand, and effectively supports experiments and tests in Coflow scheduling research.

Claims (5)

1. A distributed cooperative flow simulation environment construction system is characterized by comprising a user system, a management node module, an execution node cluster and a basic network environment-programmable data platform;
the user system consists of a DAG definition module, a network topology structure definition module, a message structure definition module and a state monitoring module; the DAG definition module completes the definition of a calculation stage and a communication stage by defining nodes and edges of the directed acyclic graph DAG and outputs a directed acyclic graph DAG file; the network topology structure definition module defines terminal nodes in a simulation environment and outputs a network topology structure file; the message structure definition module selects a transport layer protocol, defines a flow message header, and outputs a custom message structure file; the state monitoring module acquires state monitoring data output by the management node module in real time and displays the state monitoring data to a user; the defined directed acyclic graph DAG file, the network topology structure file and the message structure file form a cooperative flow construction configuration file;
the management node module consists of a task management module and a global task state monitoring module; the task management module checks the integrity and the effectiveness of a user-defined cooperative traffic construction configuration file, sends out a traffic construction task and starts a task process; the global task state monitoring module collects running state monitoring data of a plurality of execution nodes through a management node and periodically reports the running state monitoring data to the state monitoring module of the user system;
the execution node cluster consists of a task execution module and a node state monitoring module; the task execution module receives and executes the flow construction task sent by the task management module; the node state monitoring module counts the traffic transceiving conditions at the message sending terminal and the receiving terminal through the execution node, and periodically reports the execution conditions of the traffic construction task to the management node;
the basic network environment-programmable data platform is composed of programmable switching equipment and is used for forwarding the customized and reconstructed cooperative flow message.
2. A method for constructing a distributed cooperative flow simulation environment by applying the system of claim 1, comprising the following steps:
(1) defining a directed acyclic graph DAG, a network topology structure and a message structure of a collaborative flow construction simulation environment, and outputting directed acyclic graph DAG files, network topology structure files and message structure files; the defined directed acyclic graph DAG file, the network topology structure file and the message structure file form a cooperative flow construction configuration file;
(2) the management node reads the cooperative flow construction configuration file, starts a cooperative flow construction task, acquires the running state monitoring data of the execution node, and periodically reports the running state monitoring data to a state monitoring module of the user system;
(3) the execution node executes the cooperative flow construction task, receives and executes commands from the management node, performs customized reconstruction on the cooperative flow message, outputs the customized reconstructed cooperative flow message, monitors its running state and sends the state to the management node module;
(4) the basic network environment-programmable data platform receives and forwards the customized and modified cooperative flow message output in step (3).
3. The distributed cooperative flow simulation environment construction method according to claim 2, wherein the step (1) comprises the following sub-steps:
(1.1) a user defines the directed acyclic graph DAG structure through the DAG definition module in the user system and outputs a directed acyclic graph DAG file; the definition of the directed acyclic graph DAG comprises the definitions of computing nodes and communication processes, wherein the DAG nodes represent computation phases and the DAG directed edges represent communication phases;
(1.2) a user defines a terminal node in a simulation environment through a network topology structure definition module in a user system and outputs a network topology structure file; the terminal nodes comprise a sending terminal and a receiving terminal of the cooperative flow;
(1.3) a user determines the transport layer protocol and the custom flow message structure of the cooperative flow message through the message structure definition module in the user system and outputs a message structure file;
(1.4) after finishing the definition of the directed acyclic graph DAG structure, the network topology structure and the message structure in steps (1.1) to (1.3), the user sends a start command for the cooperative flow simulation task to the management node; after the task is started, the user enters the task state monitoring module to acquire the task execution state.
4. The distributed cooperative flow simulation environment construction method according to claim 2, wherein the step (2) comprises the following sub-steps:
(2.1) reading the cooperative flow construction configuration file through a task management module of the management node, and verifying the integrity and effectiveness of parameters in the cooperative flow construction configuration file;
(2.2) detecting the running state of each execution node through a global task state monitoring module of the management node;
(2.3) the task management module starts a cooperative traffic construction task and sends corresponding traffic construction task parameters to each execution node at different task stages through a communication mechanism of each execution node;
(2.4) the global task state monitoring module receives the task state monitoring data returned by the execution nodes, summarizes and analyzes the data to obtain the global state of task execution, and periodically uploads the global state to the state monitoring module of the user system.
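The checks in step (2.1) and the aggregation in step (2.4) might look roughly like the Python sketch below. The key names (`dag`, `topology`, `message_structure`, `senders`, `receivers`, `protocol`), the set of accepted protocols, and the counter names are assumptions made for illustration; the claims do not define a configuration schema or a report format.

```python
# Assumed key names for the three parts of the configuration bundle.
REQUIRED_PARTS = ("dag", "topology", "message_structure")
# Assumed set of supported transport layer protocols.
SUPPORTED_PROTOCOLS = {"tcp", "udp"}

def validate_config(config):
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = [f"missing part: {part}" for part in REQUIRED_PARTS if part not in config]
    if problems:
        return problems
    topology = config["topology"]
    if not topology.get("senders") or not topology.get("receivers"):
        problems.append("topology needs at least one sender and one receiver")
    if config["message_structure"].get("protocol") not in SUPPORTED_PROTOCOLS:
        problems.append("unsupported transport protocol")
    return problems

def aggregate(reports):
    """Summarize per-node reports (node name -> counters) into one global view."""
    sent = sum(r["bytes_sent"] for r in reports.values())
    received = sum(r["bytes_received"] for r in reports.values())
    return {
        "nodes": len(reports),
        "bytes_sent": sent,
        "bytes_received": received,
        # Bytes sent but not yet counted at a receiver (in flight or lost).
        "bytes_outstanding": sent - received,
    }

if __name__ == "__main__":
    config = {
        "dag": {},
        "topology": {"senders": ["h1"], "receivers": ["h2"]},
        "message_structure": {"protocol": "udp"},
    }
    print(validate_config(config))
    print(aggregate({
        "h1": {"bytes_sent": 1000, "bytes_received": 0},
        "h2": {"bytes_sent": 0, "bytes_received": 900},
    }))
```

Returning a list of problems rather than raising on the first one is a design choice of this sketch: it lets the user system show every defect of a configuration file at once.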
5. The distributed cooperative flow simulation environment construction method according to claim 2, wherein the step (3) comprises the following sub-steps:
(3.1) after receiving the state detection signal of the management node, the node state monitoring module in the execution node cluster feeds back the running state of the node to the management node;
(3.2) after a task execution module in the execution node cluster receives a flow construction task sent by a management node, analyzing flow construction task parameters, constructing messages according to flow size and speed specified by the parameters, taking over a network card through a data plane development kit, carrying out customized transformation on a cooperative flow message according to a message structure defined by a user, and sending the cooperative flow message to the basic network environment;
(3.3) the task execution module in the execution node cluster takes over the network card through the data plane development kit, counts the received cooperative flow messages, and discards the received cooperative flow messages after counting is completed;
(3.4) the node state monitoring module in the execution node cluster periodically sends the running state of the execution node to the global task state monitoring module of the management node for aggregation.
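To make the message handling in steps (3.2) and (3.3) more concrete, here is a minimal Python sketch of a custom header being packed on the sending side and counted, then discarded, on the receiving side. The header layout (`coflow_id`, `flow_id`, `stage`, `payload_len`) is purely hypothetical, and the sketch works on in-memory byte strings; it does not use the data plane development kit or touch a network card.

```python
import struct
from collections import Counter

# Hypothetical custom header in network byte order:
# coflow_id (uint32), flow_id (uint32), stage (uint16), payload_len (uint16).
HEADER = struct.Struct("!IIHH")

def build_message(coflow_id, flow_id, stage, payload_len):
    """Sender side: prepend the custom header to a zero-filled payload."""
    header = HEADER.pack(coflow_id, flow_id, stage, payload_len)
    return header + bytes(payload_len)

def count_and_discard(messages):
    """Receiver side: count payload bytes per coflow; the messages themselves are not kept."""
    bytes_per_coflow = Counter()
    for message in messages:
        coflow_id, _flow_id, _stage, payload_len = HEADER.unpack_from(message)
        bytes_per_coflow[coflow_id] += payload_len
    return dict(bytes_per_coflow)

if __name__ == "__main__":
    messages = [
        build_message(7, 1, 0, 100),
        build_message(7, 2, 0, 50),
        build_message(9, 1, 1, 10),
    ]
    print(count_and_discard(messages))
```

Only the counters survive the receive path, which mirrors the count-then-discard behaviour of step (3.3); such counters are the kind of data a node state monitoring module could report upstream.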

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111058895.0A CN113518012B (en) 2021-09-10 2021-09-10 Distributed cooperative flow simulation environment construction method and system

Publications (2)

Publication Number Publication Date
CN113518012A CN113518012A (en) 2021-10-19
CN113518012B CN113518012B (en) 2021-12-10

Family

ID=78063052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111058895.0A Active CN113518012B (en) 2021-09-10 2021-09-10 Distributed cooperative flow simulation environment construction method and system

Country Status (1)

Country Link
CN (1) CN113518012B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114465941B (en) * 2022-04-13 2022-07-15 之江实验室 Cluster computing flow simulation method, system and device based on packet receiving and transmitting cooperation
CN114884893B (en) * 2022-07-12 2022-10-25 之江实验室 Forwarding and control definable cooperative traffic scheduling method and system
CN114900472B (en) * 2022-07-12 2022-11-08 之江实验室 Method and system for realizing cooperative flow scheduling by control surface facing to multiple tasks

Citations (2)

Publication number Priority date Publication date Assignee Title
CN112115694A (en) * 2020-08-21 2020-12-22 江苏徐工工程机械研究院有限公司 Simulation report generation method and device based on multi-element data structure
CN113127169A (en) * 2021-04-07 2021-07-16 中山大学 Efficient link scheduling method for dynamic workflow in data center network

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2017139305A1 (en) * 2016-02-09 2017-08-17 Jonathan Perry Network resource allocation

Also Published As

Publication number Publication date
CN113518012A (en) 2021-10-19

Similar Documents

Publication Publication Date Title
CN113518012B (en) Distributed cooperative flow simulation environment construction method and system
CN110740054B (en) Data center virtualization network fault diagnosis method based on reinforcement learning
CN112118174B (en) Software defined data gateway
CN111966289B (en) Partition optimization method and system based on Kafka cluster
CN108306804A (en) A kind of Ethercat main station controllers and its communication means and system
CN114465941B (en) Cluster computing flow simulation method, system and device based on packet receiving and transmitting cooperation
CN108989136A (en) Business end to end performance monitoring method and device
CN111835579B (en) Method and system for testing effectiveness of network traffic scheduling simulation
CN111259073A (en) Intelligent business system running state studying and judging system based on logs, flow and business access
Inçki et al. Runtime verification of IoT systems using complex event processing
CN113660140A (en) Service function chain fault detection method based on data control plane hybrid sensing
CN116166505B (en) Monitoring platform, method, storage medium and equipment for dual-state IT architecture in financial industry
CN110061931A (en) Clustering method, device, system and the computer storage medium of industry control agreement
CN113094235B (en) Tail delay abnormal cloud auditing system and method
CN110113205A (en) A kind of network troubleshooting system and its working method based on software defined network technology
CN113364651A (en) Novel distributed network flow acquisition method
CN111935767A (en) Network simulation system
WO2015176516A1 (en) Method and apparatus for tracking service process
CN115037651A (en) RDMA bandwidth transmission testing method, system and storage medium
CN115987858A (en) Pressure testing method of block chain network and related equipment
Ackermann et al. Recovering views of inter-system interaction behaviors
CN112261010A (en) Special equipment multi-protocol conversion system, terminal and readable storage medium
Merino et al. Combining SPIN with ns-2 for protocol optimization
CN114095404B (en) Video equipment state calculation output method, device and equipment based on stream calculation
Guo et al. FullSight: A deep learning based collaborated failure detection framework of service function chain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant