CN108282378B - Method and device for monitoring network flow - Google Patents

Info

Publication number
CN108282378B
Authority
CN
China
Prior art keywords
task
cluster
network traffic
distributed file
cluster system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710007945.XA
Other languages
Chinese (zh)
Other versions
CN108282378A (en
Inventor
曾军
路璐
吴威
方圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201710007945.XA
Publication of CN108282378A
Application granted
Publication of CN108282378B
Active legal status
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The embodiments of the present application disclose a method for monitoring network traffic. The method comprises: a monitoring system obtains a task plan log that a task submitting device generates for a task when the task is submitted; the monitoring system counts the cross-cluster network traffic in a distributed file system based on the task plan log; and the monitoring system outputs the statistical result of the network traffic. The task submitting device is used for processing task submission in the distributed file system, and the task plan log includes a first data volume that the task instructs a first cluster system to read from a second cluster system. With this method, when the distributed file system provides services such as offline computing, a warning can be given in advance, before a large number of cross-cluster data reading operations occupy a large amount of network bandwidth, so that the network bandwidth is prevented from being heavily occupied by such operations. The embodiments of the present application also disclose an apparatus for monitoring network traffic.

Description

Method and device for monitoring network flow
Technical Field
The present application relates to the field of network data processing technologies, and in particular, to a method and an apparatus for monitoring network traffic.
Background
To accommodate the massive job requirements of large data processing platforms, a distributed file system may provide multiple cluster systems for data storage. That is, data in a distributed file system is distributed to a plurality of different cluster systems for storage. When providing services such as offline computing, data reading operations in the distributed file system need to be implemented across cluster systems. Specifically, it is assumed that cluster system a and cluster system B are two cluster systems for data storage in the distributed file system, cluster system a may read data stored in cluster system B from cluster system B, and cluster system B may also read data stored in cluster system a from cluster system a.
In the distributed file system, the cluster systems are connected through a network device, and therefore, a data reading operation across the clusters occupies a network bandwidth. In the prior art, when a distributed file system provides services such as offline computation and the like, a large amount of network bandwidth is occupied sometimes due to a large amount of cross-cluster data reading operations, and at this time, normal operations of the distributed file system are affected.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present application is to provide a method and an apparatus for monitoring network traffic that can give advance warning, before it happens, that a large number of cross-cluster data reading operations in a distributed file system will occupy a large amount of network bandwidth, thereby preventing the network bandwidth from being heavily occupied by such operations and preventing the normal operation of the distributed file system from being affected.
In a first aspect, an embodiment of the present application provides a method for monitoring network traffic, which is applied to a monitoring system, and includes:
obtaining a task plan log that a task submitting device generates for a task when the task is submitted, wherein the task submitting device is used for processing task submission in a distributed file system, and the task plan log includes a first data volume that the task instructs a first cluster system to read from a second cluster system;
counting cross-cluster network traffic in the distributed file system based on the task plan log, the cross-cluster network traffic comprising network traffic between the first cluster system and the second cluster system;
and outputting the statistical result of the network traffic.
Optionally, the task plan log further includes a second data volume read by the first cluster system from a third cluster system and a third data volume read by the second cluster system from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, the counting of cross-cluster network traffic in the distributed file system based on the task plan log includes:
identifying the first data volume from the task plan log;
counting, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
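The time-window statistic described above might be sketched as follows. This is only an illustration under stated assumptions: the `PlanLogEntry` record, its field names, and the half-open window `[start, end)` are hypothetical and do not come from the application, which does not specify units or a window convention.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanLogEntry:
    """One cross-cluster read taken from a task plan log (hypothetical layout)."""
    acquired_at: float  # time the monitoring system obtained the log, in seconds
    reader: str         # cluster system that reads the data
    provider: str       # cluster system that provides the data
    data_volume: int    # planned read volume; the unit is an assumption


def traffic_in_window(entries, start, end):
    """Sum planned volume per (reader, provider) pair for logs obtained in [start, end)."""
    totals = defaultdict(int)
    for entry in entries:
        if start <= entry.acquired_at < end:
            totals[(entry.reader, entry.provider)] += entry.data_volume
    return dict(totals)


entries = [
    PlanLogEntry(10.0, "cluster1", "cluster2", 100),
    PlanLogEntry(20.0, "cluster1", "cluster2", 50),
    PlanLogEntry(70.0, "cluster1", "cluster2", 400),  # falls outside the window below
]
window_totals = traffic_in_window(entries, 0.0, 60.0)
```

Only the two logs obtained within the window contribute to the total for the pair of cluster systems.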
Optionally, the counting network traffic across clusters in the distributed file system based on the task plan log specifically includes:
counting, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, wherein the total data volume is taken as the total network traffic involved in executing the task.
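As a minimal sketch of this per-task total, assume the plan log has already been reduced to a mapping from each providing cluster system to its planned read volume. The mapping layout and the cluster names are hypothetical.

```python
def total_cross_cluster_volume(provider_volumes):
    """Return the total planned cross-cluster read volume of one task."""
    return sum(provider_volumes.values())


# Hypothetical plan-log content: provider cluster -> planned read volume.
provider_volumes = {"cluster2": 100, "cluster3": 300}
task_total = total_cross_cluster_volume(provider_volumes)
```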
Optionally, the method further includes:
storing the statistical result of the network traffic in a data persistence system.
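The application does not say what the data persistence system is. The sketch below uses an in-memory SQLite database purely as a stand-in; the table name, the column names, and the `store_statistics` helper are assumptions made for illustration.

```python
import sqlite3


def store_statistics(conn, window_start, totals):
    """Persist per-pair traffic totals; the table and column names are hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cross_cluster_traffic "
        "(window_start REAL, reader TEXT, provider TEXT, volume INTEGER)"
    )
    conn.executemany(
        "INSERT INTO cross_cluster_traffic VALUES (?, ?, ?, ?)",
        [(window_start, reader, provider, volume)
         for (reader, provider), volume in totals.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real persistence system
store_statistics(conn, 0.0, {("cluster1", "cluster2"): 150})
rows = conn.execute(
    "SELECT reader, provider, volume FROM cross_cluster_traffic"
).fetchall()
```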
Optionally, the method further includes:
outputting alarm information in response to the statistical result meeting a preset condition.
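The preset condition is left open in the text. One plausible reading, assumed here only for illustration, is a fixed threshold on the planned volume between a pair of cluster systems; the `check_alarms` helper and the message wording are hypothetical.

```python
def check_alarms(totals, threshold):
    """Return one alarm message per cluster pair whose planned volume exceeds threshold."""
    alarms = []
    for (reader, provider), volume in sorted(totals.items()):
        if volume > threshold:  # assumed preset condition: simple upper bound
            alarms.append(
                f"ALARM: planned reads by {reader} from {provider} "
                f"total {volume}, above threshold {threshold}"
            )
    return alarms


totals = {("cluster1", "cluster2"): 150, ("cluster1", "cluster3"): 900}
alarms = check_alarms(totals, threshold=500)
```

Because the statistic is computed from plan logs produced at submission time, such an alarm would be raised before the reads actually take place.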
Optionally, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume are recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
In a second aspect, an embodiment of the present application provides a method for monitoring network traffic, which is applied to a task submitting device, where the task submitting device is configured to process task submission in a distributed file system;
the method comprises the following steps:
generating a task plan log for a task in response to a submission instruction of the task;
sending the task plan log to a monitoring system, so that the monitoring system can count the network traffic of cross-clusters in the distributed file system based on the task plan log and output the counting result of the network traffic;
wherein the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system, and the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
Optionally, the task plan log further includes a second data volume read by the first cluster system from a third cluster system and a third data volume read by the second cluster system from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume are recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
In a third aspect, an embodiment of the present application provides an apparatus for monitoring network traffic, configured in a monitoring system, including:
an acquisition unit, configured to obtain a task plan log that a task submitting device generates for a task when the task is submitted, wherein the task submitting device is used for processing task submission in a distributed file system, and the task plan log includes a first data volume that the task instructs a first cluster system to read from a second cluster system;
a statistics unit, configured to count cross-cluster network traffic in the distributed file system based on the task plan log, where the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system;
and a first output unit, configured to output the statistical result of the network traffic.
Optionally, the task plan log further includes a second data volume read by the first cluster system from a third cluster system and a third data volume read by the second cluster system from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, the statistics unit is specifically configured to:
identify the first data volume from the task plan log;
count, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
Optionally, the statistics unit is specifically configured to:
count, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, wherein the total data volume is taken as the total network traffic involved in executing the task.
Optionally, the apparatus further includes:
a storage unit, configured to store the statistical result of the network traffic in a data persistence system.
Optionally, the apparatus further includes:
a second output unit, configured to output alarm information in response to the statistical result meeting a preset condition.
Optionally, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume are recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
In a fourth aspect, an embodiment of the present application provides an apparatus for monitoring network traffic, where the apparatus is configured to a task submitting device, and the task submitting device is configured to process task submission in a distributed file system;
the device comprises:
a generating unit, configured to generate a task plan log for a task in response to a submission instruction of the task;
a sending unit, configured to send the task plan log to a monitoring system, so that the monitoring system counts network traffic across clusters in the distributed file system based on the task plan log, and outputs a statistical result of the network traffic;
wherein the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system, and the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
Optionally, the task plan log further includes a second data volume read by the first cluster system from a third cluster system and a third data volume read by the second cluster system from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume are recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
In a fifth aspect, an embodiment of the present application provides a system for monitoring network traffic, including: the system comprises a monitoring system, a task submitting device and a distributed file system; the task submitting equipment is used for processing task submission in the distributed file system; the distributed file system at least comprises a first cluster system and a second cluster system;
the task submitting device is configured to generate a task plan log for a task in response to a submission instruction of the task; wherein the task plan log includes a first data volume that the task instructs the first cluster system to read from the second cluster system;
the monitoring system is used for acquiring a task plan log generated by the task submitting equipment for the task when the task is submitted, counting the network traffic of the cross-cluster in the distributed file system based on the task plan log, and outputting the counting result of the network traffic; wherein the cross-cluster network traffic comprises network traffic between the first cluster system and the second cluster system.
Optionally, the distributed file system further includes a third cluster system;
the task plan log further includes a second amount of data that the task instructs the first cluster system to read from a third cluster system and a third amount of data that the second cluster system reads from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and between the second cluster system and the third cluster system.
Optionally, in order to count the cross-cluster network traffic in the distributed file system, the monitoring system is specifically configured to:
identify the first data volume from the task plan log;
count, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
Optionally, in order to count the cross-cluster network traffic in the distributed file system, the monitoring system is specifically configured to:
count, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, wherein the total data volume is taken as the total network traffic involved in executing the task.
Optionally, the monitoring system is further configured to:
store the statistical result of the network traffic in a data persistence system.
Optionally, the monitoring system is further configured to:
output alarm information in response to the statistical result meeting a preset condition.
Optionally, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume are recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
Compared with the prior art, the method has the following advantages:
According to the technical solution of the embodiments of the present application, when a task is submitted, the task submitting device generates a task plan log for the task in response to the submission instruction of the task and sends the task plan log to the monitoring system. The task involves the first cluster system reading data from the second cluster system, and the data volume that the task instructs the first cluster system to read from the second cluster system is recorded in the task plan log. The monitoring system counts the cross-cluster network traffic in the distributed file system according to the obtained task plan log and outputs the statistical result of the network traffic. Because the task plan log is generated when the task is submitted and is sent to the monitoring system for traffic statistics, the monitoring system can output the statistical result for the task before the distributed file system actually executes it. Thus, when the distributed file system provides services such as offline computing, the network traffic required by the cross-cluster data reading operations that are about to be executed can be predicted and output. This gives advance warning that a large number of cross-cluster data reading operations will occupy a large amount of network bandwidth, which prevents the network bandwidth from being heavily occupied by such operations and prevents the normal operation of the distributed file system from being affected.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic diagram of a network system framework involved in an application scenario in an embodiment of the present application;
fig. 2 is a schematic flowchart of a method for monitoring network traffic according to an embodiment of the present application;
fig. 3 is a flowchart illustrating a method for monitoring network traffic according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an apparatus for monitoring network traffic according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an apparatus for monitoring network traffic according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a system for monitoring network traffic according to an embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The inventor of the present application has found that, in the prior art, when a distributed file system provides services such as offline computation, a large number of cross-cluster data reading operations may occur. Because the cluster systems are connected through the network device, a large amount of network bandwidth between the cluster systems is occupied by a large amount of cross-cluster data reading operations, and therefore normal operation of the distributed file system is affected.
To solve the above problem, the embodiments of the present application take into account that a cross-cluster data reading operation is performed during task execution. When a task is submitted, the cross-cluster data reading operation has not yet been executed, and the corresponding network traffic has not yet appeared in the distributed file system. A task plan log can therefore be generated for the task at submission time, and the network traffic corresponding to the cross-cluster data reading operations indicated by the task can be recorded in it. Before the tasks are executed, the cross-cluster network traffic in the distributed file system can then be counted in advance and output based on the task plan log of each task. As a result, when the distributed file system provides services such as offline computing, the network traffic required by the cross-cluster data reading operations that are about to be executed can be predicted and output, which gives advance warning that a large number of cross-cluster data reading operations will occupy a large amount of network bandwidth. This prevents the network bandwidth from being heavily occupied by such operations and prevents the normal operation of the distributed file system from being affected.
For example, the embodiments of the present application may be applied to the network system shown in fig. 1. The network system has a plurality of cluster systems 103, and data reading operations may be performed between different cluster systems 103, that is, one cluster system 103 may read data from another cluster system 103. Let a first cluster system and a second cluster system denote any two different cluster systems 103 in the network system, and assume that a task involves the first cluster system reading data from the second cluster system. The task submitting device 101 generates a task plan log for the task in response to a submission instruction of the task and sends the task plan log to the monitoring system 102, wherein the data volume that the task instructs the first cluster system to read from the second cluster system is recorded in the task plan log. The monitoring system 102 counts the cross-cluster network traffic in the distributed file system according to the obtained task plan log and outputs a statistical result of the network traffic, wherein the cross-cluster network traffic includes the network traffic between the first cluster system and the second cluster system.
Having described the general idea of the present application, various non-limiting embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2, a flowchart of a method for monitoring network traffic in an embodiment of the present application is shown. In this embodiment, the method may be applied to, i.e. performed by, a monitoring system. The method may specifically comprise the following steps, for example:
Step 201: obtain a task plan log that a task submitting device generates for a task when the task is submitted, wherein the task submitting device is used for processing task submission in a distributed file system, and the task plan log includes a first data volume that the task instructs a first cluster system to read from a second cluster system.
The distributed file system may respond to tasks related to cross-cluster data read operations while providing services such as offline computing. Upon receiving a submission request for the task, the task submission device may submit the task. When the task submitting equipment submits the task, a task plan log can be generated for the task in response to a submitting instruction of the task and sent to a monitoring system.
In this embodiment, the data volume of one or more cross-cluster data reading operations involved in the task may be recorded in the task plan log of the task. If the task requires the first cluster system to read the first data volume from the second cluster system during execution, the data reading operation of the first cluster system on the second cluster system and the corresponding first data volume can be recorded in the task plan log of the task. If the task additionally requires the first cluster system to read the second data volume from the third cluster system, the task plan log may further include the second data volume that the task instructs the first cluster system to read from the third cluster system, that is, the task plan log may further record the data reading operation of the first cluster system on the third cluster system and the corresponding second data volume. Further, if the task also requires the second cluster system to read the third data volume from the third cluster system, the task plan log may further include the third data volume that the task instructs the second cluster system to read from the third cluster system, that is, the task plan log may further record the data reading operation of the second cluster system on the third cluster system and the corresponding third data volume.
Of course, all cross-cluster data read operations involved in the task and their corresponding data volumes may be recorded in the task plan log of the task in order to more fully predict network traffic conditions in the distributed file system.
It is understood that, for a task, all cross-cluster data reading operations involved in the task are performed by the cluster system in which the task is located, which reads data from other cluster systems. That is, the cluster system that acquires the data is always the same, while the cluster systems that provide the data may be the same or different. For this reason, in some implementations of this embodiment, the task plan log may record, for example, an identifier of the task, an identifier of the cluster system that acquires data, an identifier of each cluster system that provides data, and the data volume corresponding to each data reading operation, where the identifier of a cluster system that provides data and the data volume of the corresponding data reading operation have a correspondence relationship. For example, assume that the task requires the first cluster system to read the first data volume from the second cluster system during execution. In the task plan log of the task, the identifier of the cluster system that acquires data is the identifier of the first cluster system, the identifier of the cluster system that provides data is the identifier of the second cluster system, and the data volume corresponding to the data reading operation is the first data volume, where a correspondence relationship exists between the identifier of the second cluster system and the first data volume.
For another example, assume that the task requires the first cluster system to read the first data volume from the second cluster system and also to read the second data volume from the third cluster system. In addition to the identifier of the task, the identifier of the first cluster system, and the identifier of the second cluster system with its corresponding first data volume, the task plan log of the task may record the identifier of the third cluster system and the second data volume, where the identifier of the third cluster system and the second data volume have a correspondence relationship.
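The record layout described in the two examples above could be modeled as below. This is a sketch, not the format used by the application: the `TaskPlanLog` class, its field names, and the `record_read` helper are hypothetical. The mapping from provider identifier to data volume expresses the correspondence relationship between the two.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPlanLog:
    """Hypothetical in-memory form of a task plan log."""
    task_id: str         # identifier of the task
    reader_cluster: str  # identifier of the cluster system that acquires data
    # identifier of each providing cluster system -> corresponding data volume
    provider_volumes: dict = field(default_factory=dict)

    def record_read(self, provider_cluster, data_volume):
        """Add a planned read, accumulating volume for a repeated provider."""
        current = self.provider_volumes.get(provider_cluster, 0)
        self.provider_volumes[provider_cluster] = current + data_volume


log = TaskPlanLog(task_id="task-1", reader_cluster="cluster1")
log.record_read("cluster2", 100)  # first data volume, read from the second cluster
log.record_read("cluster3", 300)  # second data volume, read from the third cluster
```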
In some implementations of this embodiment, the task submitting device may generate the task plan log as follows, for example: in response to a submission instruction of the task, the task submitting device acquires the input files of the task and their file sizes; the task submitting device aggregates all input files of the task by the cluster system that provides the data, obtaining the cross-cluster input information of the task; and the task submitting device generates the task plan log from the cross-cluster input information of the task. The task submitting device may obtain the input files and file sizes from the input file information of the task, whose format may be, for example, <file format>/<cluster>/<file path>/<file size>. For example, if the input file information of the task is pangu://fake_cluster2/fake_dir2/fake_path:100, then pangu represents the protocol adopted by the file system, fake_cluster2 represents the cluster system in which the file is located (i.e., the cluster system providing the data), fake_dir2/fake_path is the path from which the file is read, and 100 represents the file size (i.e., the data volume corresponding to the data reading operation). If the input file information of the task is [column://fake_cluster2/fake_dir/fake_path:100, column://fake_cluster3/fake_dir3/fake_path:100], the cross-cluster input information obtained after aggregation by the task submitting device may be {fake_cluster2:100, fake_cluster3:300}.
The task plan log then generated may be (TaskName: fake_task, running_cluster: fake_cluster, ClusterInputInfo: {fake_cluster2:100, fake_cluster3:300}), where fake_task is the identifier of the task, fake_cluster is the identifier of the cluster system in which the task is located, fake_cluster2 and fake_cluster3 are the identifiers of the cluster systems providing data for the task, 100 represents the data volume that fake_cluster reads from fake_cluster2 in the task, and 300 represents the data volume that fake_cluster reads from fake_cluster3 in the task.
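The aggregation described above can be sketched roughly as follows. This is a minimal Python illustration, not the patented implementation: it assumes each input entry has the shape `<protocol>://<cluster>/<path>:<size>` seen in the pangu example, and it assumes that files located in the task's own cluster are left out of the cross-cluster summary. The function names `parse_input_entry`, `summarize_cross_cluster_inputs`, and `build_task_plan_log` are hypothetical.

```python
from collections import defaultdict


def parse_input_entry(entry):
    """Split '<protocol>://<cluster>/<path>:<size>' into its four parts.

    The entry shape is an assumption taken from the pangu example.
    """
    location, _, size_text = entry.rpartition(":")
    protocol, _, rest = location.partition("://")
    cluster, _, path = rest.partition("/")
    return protocol, cluster, path, int(size_text)


def summarize_cross_cluster_inputs(entries, running_cluster):
    """Sum file sizes per providing cluster, skipping the task's own cluster."""
    totals = defaultdict(int)
    for entry in entries:
        _, cluster, _, size = parse_input_entry(entry)
        if cluster != running_cluster:  # assumption: local reads are not cross-cluster
            totals[cluster] += size
    return dict(totals)


def build_task_plan_log(task_name, running_cluster, entries):
    """Assemble a task plan log using the field names from the example above."""
    return {
        "TaskName": task_name,
        "running_cluster": running_cluster,
        "ClusterInputInfo": summarize_cross_cluster_inputs(entries, running_cluster),
    }
```

With illustrative inputs in which fake_cluster2 provides one file of size 100 and fake_cluster3 provides two files of sizes 100 and 200, `build_task_plan_log("fake_task", "fake_cluster", entries)` yields a `ClusterInputInfo` of `{"fake_cluster2": 100, "fake_cluster3": 300}`.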
Step 202, based on the task plan log, counting cross-cluster network traffic in the distributed file system, where the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
It can be understood that the cross-cluster network traffic counted based on the task plan log may include the network traffic between the two cluster systems involved in any data reading operation of the task. For example, if the task instructs the first cluster system to read a first data volume from the second cluster system, the cross-cluster network traffic may include network traffic between the first cluster system and the second cluster system. For another example, if the task also instructs the first cluster system to read a second data volume from the third cluster system, the cross-cluster network traffic may further include network traffic between the first cluster system and the third cluster system. For another example, if the task additionally instructs the second cluster system to read a third data volume from the third cluster system, the cross-cluster network traffic may further include network traffic between the second cluster system and the third cluster system.
Step 203, outputting the statistical result of the network traffic.
In particular, the monitoring system may include a message processing system and a stream processing system. On receiving content sent by the task submitting device, the message processing system can identify the task plan log in that content, for example by keyword filtering, and send the task plan log to the stream processing system. On receiving a task plan log from the message processing system, the stream processing system can parse it to obtain information such as the identifier of the task, the time at which the task was obtained, the cross-cluster data reading operations involved in the task, and the data volume corresponding to each operation. Based on this information, the stream processing system may count the cross-cluster network traffic in the distributed file system and output the result to a technician, so that the technician is informed in advance of the occupation of network bandwidth in the distributed file system.
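The two stages described above can be illustrated with a small sketch. It assumes, purely for illustration, that each task plan log arrives as one JSON message containing the fields `TaskName`, `running_cluster`, and `ClusterInputInfo`, and that the presence of the `ClusterInputInfo` key serves as the filtering keyword. The source does not specify the wire format, and the function and record field names below are hypothetical.

```python
import json

# Assumed keyword that marks a message as a task plan log.
TASK_PLAN_KEYWORD = "ClusterInputInfo"


def filter_task_plan_logs(messages):
    """Message-processing stage: keep only messages containing the keyword."""
    return [message for message in messages if TASK_PLAN_KEYWORD in message]


def parse_task_plan_log(message, acquired_at):
    """Stream-processing stage: expand one log into one record per cross-cluster read.

    `acquired_at` is the time at which the monitoring system obtained the log.
    """
    log = json.loads(message)
    return [
        {
            "task": log["TaskName"],
            "acquired_at": acquired_at,
            "reader": log["running_cluster"],  # cluster that acquires the data
            "provider": provider,              # cluster that provides the data
            "size": size,
        }
        for provider, size in log["ClusterInputInfo"].items()
    ]
```

Each resulting record describes a single planned cross-cluster read, which makes later aggregation by cluster pair, by time window, or by task a simple filter-and-sum.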
It is understood that, when providing services such as offline computing, the distributed file system generally requests one or more task submitting devices to submit a plurality of tasks, the one or more task submitting devices may generate task plan logs for the plurality of tasks respectively and send the task plan logs to the monitoring system, and the monitoring system may count network traffic across clusters in the distributed file system based on the task plan logs of the plurality of tasks.
In this embodiment, based on the task plan log, the monitoring system may count the cross-cluster network traffic in the distributed file system in several different ways, obtaining several different statistical results.
For example, in some implementations of this embodiment, the monitoring system may count the total network traffic between two specific cluster systems across all tasks within a period of time, so that the technician can be informed in advance of the occupation of network bandwidth between those cluster systems. Specifically, step 202 may include, for example: identifying the first data volume from the task plan log; and counting, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time, where the acquisition time of the task plan log falls within the specified time. More specifically, the monitoring system may find the task plan logs of all tasks acquired within the specified time, identify from those logs every data volume that the first cluster system reads from the second cluster system, and take the sum of the identified data volumes as the network traffic between the first cluster system and the second cluster system within the specified time.
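The windowed sum described above can be sketched as follows. The record layout (dictionaries with `reader`, `provider`, `size`, and `acquired_at` keys) and the function name `traffic_between` are assumptions made for this illustration; timestamps are treated as plain numbers, and the window is taken as half-open.

```python
def traffic_between(records, reader, provider, window_start, window_end):
    """Sum planned read sizes for one (reader, provider) pair.

    Only records whose acquisition time lies in [window_start, window_end)
    are counted. Each record is a dict with the hypothetical keys
    'reader', 'provider', 'size', and 'acquired_at'.
    """
    return sum(
        record["size"]
        for record in records
        if record["reader"] == reader
        and record["provider"] == provider
        and window_start <= record["acquired_at"] < window_end
    )
```

A half-open window is one reasonable design choice here: consecutive windows then never count the same task plan log twice.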
For another example, in other implementations of this embodiment, the monitoring system may count the total network traffic occupied by all cross-cluster data reading operations involved in a specific task, so as to inform the technician in advance of the total network traffic that executing the task will occupy. That is, step 202 may include, for example: counting, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, the total data volume serving as the total network traffic involved in executing the task. Further, the monitoring system may count the total network traffic occupied by each of the tasks and then output the one or more tasks that occupy the most network traffic. The technician is thereby informed in advance of tasks that would occupy too much network bandwidth and can handle them beforehand, avoiding excessive consumption of network bandwidth.
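The per-task totals and the selection of the heaviest tasks can be sketched as below. The record layout (dictionaries with `task` and `size` keys) and the function names are hypothetical; the sketch relies only on `collections.Counter` from the Python standard library.

```python
from collections import Counter


def total_traffic_per_task(records):
    """Sum the planned cross-cluster read sizes for each task identifier."""
    totals = Counter()
    for record in records:
        totals[record["task"]] += record["size"]
    return totals


def heaviest_tasks(records, n=1):
    """Return the n tasks with the largest totals as (task, total) pairs."""
    return total_traffic_per_task(records).most_common(n)
```

`heaviest_tasks(records, n=3)` would, for instance, give the three tasks that a technician may want to reschedule or throttle before they run.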
It is to be understood that, so that a technician can view the statistical result of the network traffic at any time, in some implementations of this embodiment the monitoring system may, after obtaining the statistical result, save it to a data persistence system in addition to outputting it. The data persistence system may be, for example, a database system. The monitoring system may of course also store the acquired task plan logs in the data persistence system to meet later statistical needs.
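As one possible illustration of saving statistical results to a database system, the sketch below uses SQLite from the Python standard library. The source does not name a particular database; the table name `traffic_stats`, its columns, and the function names are all assumptions made for this example.

```python
import sqlite3


def save_statistics(conn, rows):
    """Persist (reader, provider, window_start, window_end, total_size) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS traffic_stats ("
        "reader TEXT, provider TEXT, "
        "window_start INTEGER, window_end INTEGER, total_size INTEGER)"
    )
    conn.executemany("INSERT INTO traffic_stats VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def load_statistics(conn, reader, provider):
    """Read back the saved totals for one cluster pair, oldest window first."""
    cursor = conn.execute(
        "SELECT window_start, window_end, total_size FROM traffic_stats "
        "WHERE reader = ? AND provider = ? ORDER BY window_start",
        (reader, provider),
    )
    return cursor.fetchall()
```

Opening the connection with `sqlite3.connect(":memory:")` is convenient for trying the sketch out; a file path would be used if the results should survive a restart.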
It should be noted that, so that the technician notices a network bandwidth abnormality that may appear later, in some implementations of this embodiment the monitoring system may further output alarm information in response to the statistical result meeting a preset condition that indicates abnormal network traffic. For example, if the statistical result is the total network traffic between the first cluster system and the second cluster system in the distributed file system within the specified time, the preset condition may be an upper limit on the network traffic between cluster systems, and the alarm information is output if the statistical result exceeds that upper limit. For another example, if the statistical result is the total network traffic occupied when a specific task is executed, the preset condition may be an upper limit on the network traffic occupied by a task, and the alarm information is output if the statistical result exceeds that upper limit.
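The upper-limit check described above amounts to a simple comparison. In the sketch below, the function name `check_alarm`, its parameters, and the wording of the alarm message are hypothetical, and the limits used in the usage note are illustrative values only.

```python
def check_alarm(statistic, upper_limit, label):
    """Return an alarm message if the statistic exceeds its upper limit, else None.

    `label` names what was measured, e.g. a cluster pair or a task identifier.
    """
    if statistic > upper_limit:
        return f"ALARM: {label} traffic {statistic} exceeds upper limit {upper_limit}"
    return None
```

The same function covers both cases in the text: `check_alarm(pair_total, pair_limit, "AY-A<-AY-B")` for traffic between two cluster systems, and `check_alarm(task_total, task_limit, "task 001")` for the traffic of a single task.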
In an example implementation scenario of this embodiment, the identifier of the task may be 001, the identifier of the cluster system in which the task runs may be AY-A, and the input file information of the task may include the following content:
pangu://AY-A/fake_dir2/fake_path:45678
pangu://AY-B/fake_dir/fake_path:123456
pangu://AY-C/fake_dir2/fake_path:234567
pangu://AY-D/fake_dir2/fake_path:345678
The task plan log for the task may include, for example, the following:
TaskID:001
Cluster:AY-A
ClusterInputInfo:{"AY-B":123456,"AY-C":234567,"AY-D":345678}
The monitoring system parses the task plan log of the task, and the information obtained may be, for example, the content shown in Table 1:
TABLE 1
This cluster | Remote cluster | TaskID | Size
AY-A | AY-B | 001 | 123456
AY-A | AY-C | 001 | 234567
AY-A | AY-D | 001 | 345678
The statistical result that the monitoring system obtains from the task plan log may be, for example, the content shown in Table 2:
TABLE 2
[Table 2 appears as an image in the original publication and is not reproduced here.]
The technical solution of this embodiment rests on the observation that a cross-cluster data reading operation takes place during task execution: when a task is submitted, the operation has not yet been executed, and the corresponding network traffic has not yet appeared in the distributed file system. A task plan log can therefore be generated for the task at submission time, recording the network traffic that corresponds to the cross-cluster data reading operations indicated by the task. Based on the task plan logs of the individual tasks, the cross-cluster network traffic in the distributed file system can thus be counted and output before the tasks are executed. When the distributed file system provides services such as offline computing, the network traffic required by pending cross-cluster data reading operations can be predicted and output, and heavy occupation of network bandwidth by a large number of such operations can be flagged in advance. This helps prevent cross-cluster data reading operations from occupying a large amount of network bandwidth and affecting the normal operation of the distributed file system.
Referring to fig. 3, a flowchart of a method for monitoring network traffic in an embodiment of the present application is shown. In this embodiment, the method is applied to a task submitting device, that is, the method is executed by the task submitting device, and the task submitting device is used for processing task submission in a distributed file system. The method may specifically comprise the following steps, for example:
step 301, responding to a submission instruction of a task, and generating a task plan log for the task;
step 302, sending the task plan log to a monitoring system, so that the monitoring system counts the cross-cluster network traffic in the distributed file system based on the task plan log and outputs the statistical result of the network traffic;
wherein the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system, and the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
Optionally, the task plan log may further include, for example, a second data volume that the task instructs the first cluster system to read from a third cluster system and a third data volume that the task instructs the second cluster system to read from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, for example, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume may be recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
As with the foregoing embodiment, the technical solution of this embodiment generates the task plan log when the task is submitted, before any cross-cluster data reading operation has been executed and before the corresponding network traffic has appeared in the distributed file system. The cross-cluster network traffic can therefore be counted and output in advance from the task plan logs of the individual tasks, so that heavy occupation of network bandwidth by cross-cluster data reading operations is flagged early and its effect on the normal operation of the distributed file system can be avoided.
Referring to fig. 4, a schematic structural diagram of an apparatus for monitoring network traffic in an embodiment of the present application is shown. In this embodiment, the apparatus is configured in a monitoring system. The apparatus may specifically include, for example:
an obtaining unit 401, configured to obtain a task plan log generated by a task submitting device for a task when the task is submitted, where the task submitting device is configured to process task submission in a distributed file system, and the task plan log includes a first data amount that a first cluster system is instructed by the task to read from a second cluster system;
a statistics unit 402, configured to count cross-cluster network traffic in the distributed file system based on the task plan log, where the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system;
a first output unit 403, configured to output a statistical result of the network traffic.
Optionally, the task plan log may further include, for example, a second data volume that the task instructs the first cluster system to read from a third cluster system and a third data volume that the task instructs the second cluster system to read from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, the statistics unit 402 may be specifically configured to:
identify the first data volume from the task plan log; and
count, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
Optionally, the statistics unit 402 may be specifically configured to:
count, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, the total data volume serving as the total network traffic involved in executing the task.
Optionally, the apparatus may further include:
a storage unit, configured to save the statistical result of the network traffic to a data persistence system.
Optionally, the apparatus may further include:
a second output unit, configured to output alarm information in response to the statistical result meeting a preset condition.
Optionally, for example, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume may be recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
As with the foregoing embodiments, the technical solution of this embodiment relies on the task plan log being generated when the task is submitted, before any cross-cluster data reading operation has been executed and before the corresponding network traffic has appeared in the distributed file system. The cross-cluster network traffic can therefore be counted and output in advance from the task plan logs of the individual tasks, so that heavy occupation of network bandwidth by cross-cluster data reading operations is flagged early and its effect on the normal operation of the distributed file system can be avoided.
Referring to fig. 5, a schematic structural diagram of an apparatus for monitoring network traffic in an embodiment of the present application is shown. In this embodiment, the apparatus is configured to a task submitting device, and the task submitting device is configured to process task submission in a distributed file system.
The apparatus may specifically include, for example:
a generating unit 501, configured to generate a task plan log for a task in response to a submission instruction of the task;
a sending unit 502, configured to send the task plan log to a monitoring system, so that the monitoring system counts cross-cluster network traffic in the distributed file system based on the task plan log and outputs a statistical result of the network traffic, where the cross-cluster network traffic includes network traffic between a first cluster system and a second cluster system;
wherein the task plan log includes a first data volume that the task instructs the first cluster system to read from the second cluster system.
Optionally, the task plan log may further include, for example, a second data volume that the task instructs the first cluster system to read from a third cluster system and a third data volume that the task instructs the second cluster system to read from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and network traffic between the second cluster system and the third cluster system.
Optionally, for example, an identifier of the task, an identifier of the first cluster system, an identifier of the second cluster system, and the first data volume may be recorded in the task plan log, where the identifier of the second cluster system and the first data volume have a corresponding relationship.
As with the foregoing embodiments, the technical solution of this embodiment generates the task plan log when the task is submitted, before any cross-cluster data reading operation has been executed and before the corresponding network traffic has appeared in the distributed file system. The cross-cluster network traffic can therefore be counted and output in advance from the task plan logs of the individual tasks, so that heavy occupation of network bandwidth by cross-cluster data reading operations is flagged early and its effect on the normal operation of the distributed file system can be avoided.
Referring to fig. 6, a schematic structural diagram of a system for monitoring network traffic in an embodiment of the present application is shown. In this embodiment, the system for monitoring network traffic may include, for example, a monitoring system 601, a task submitting device 602, and a distributed file system 603; the task submission device 602 is configured to process task submission in the distributed file system 603; the distributed file system 603 includes at least a first cluster system 604 and a second cluster system 605;
the task submitting device 602 is configured to generate a task plan log for a task in response to a submission instruction of the task; wherein the task plan log includes a first data volume that the task instructs the first cluster system 604 to read from the second cluster system 605;
the monitoring system 601 is configured to obtain the task plan log generated by the task submitting device 602 for the task when the task is submitted, count cross-cluster network traffic in the distributed file system 603 based on the task plan log, and output a statistical result of the network traffic; wherein the cross-cluster network traffic includes network traffic between the first cluster system 604 and the second cluster system 605.
Optionally, the distributed file system further includes a third cluster system 606;
the task plan log further includes a second amount of data that the task instructs the first cluster system 604 to read from a third cluster system 606 and a third amount of data that the second cluster system 605 reads from the third cluster system 606, and the cross-cluster network traffic further includes network traffic between the first cluster system 604 and the third cluster system 606 and network traffic between the second cluster system 605 and the third cluster system 606.
Optionally, in order to count the cross-cluster network traffic in the distributed file system 603, the monitoring system 601 is specifically configured to:
identify the first data volume from the task plan log; and
count, according to the first data volume, the network traffic between the first cluster system 604 and the second cluster system 605 in the distributed file system 603 within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
Optionally, in order to count the cross-cluster network traffic in the distributed file system 603, the monitoring system 601 is specifically configured to:
count, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, the total data volume serving as the total network traffic involved in executing the task.
Optionally, the monitoring system 601 is further configured to:
save the statistical result of the network traffic to a data persistence system 607.
Optionally, the monitoring system 601 is further configured to:
output alarm information in response to the statistical result meeting a preset condition.
Optionally, the task plan log records an identifier of the task, an identifier of the first cluster system 604, an identifier of the second cluster system 605, and the first data volume, where the identifier of the second cluster system 605 and the first data volume have a corresponding relationship.
As with the foregoing embodiments, the technical solution of this embodiment generates the task plan log when the task is submitted, before any cross-cluster data reading operation has been executed and before the corresponding network traffic has appeared in the distributed file system. The cross-cluster network traffic can therefore be counted and output in advance from the task plan logs of the individual tasks, so that heavy occupation of network bandwidth by cross-cluster data reading operations is flagged early and its effect on the normal operation of the distributed file system can be avoided.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described device embodiments are merely illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present application and it is noted that numerous modifications and adaptations may be made by those skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.

Claims (19)

1. A method for monitoring network traffic, applied to a monitoring system, comprising:
obtaining a task plan log generated by a task submitting device for a task when the task is submitted, wherein the task submitting device is used for processing task submission in a distributed file system, and the task plan log comprises a first data volume that the task instructs a first cluster system to read from a second cluster system;
counting cross-cluster network traffic in the distributed file system based on the task plan log, the cross-cluster network traffic comprising network traffic between the first cluster system and the second cluster system; wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises: parsing the task plan log to obtain the identification of the task, the time at which the task was obtained, the cross-cluster data reading operations involved in the task, and the data volume corresponding to each cross-cluster data reading operation; and counting the cross-cluster network traffic in the distributed file system based on the obtained data;
and outputting the statistical result of the network traffic, so as to indicate in advance the occupation of network bandwidth in the distributed file system.
2. The method of claim 1, wherein the task plan log further comprises a second amount of data read by the first cluster system from a third cluster system and a third amount of data read by the second cluster system from the third cluster system indicated by the task, and wherein the cross-cluster network traffic further comprises network traffic between the first cluster system and the third cluster system and between the second cluster system and the third cluster system.
3. The method of claim 1, wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises:
identifying the first data volume from the task plan log; and
counting, according to the first data volume, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
4. The method of claim 1, wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises:
counting, based on the task plan log, the total data volume of the cross-cluster reads indicated by the task, the total data volume serving as the total network traffic involved in executing the task.
5. The method of claim 1, further comprising:
saving the statistical result of the network traffic to a data persistence system.
6. The method of claim 1, further comprising:
outputting alarm information in response to the statistical result meeting a preset condition.
7. The method of claim 1, wherein an identification of the task, an identification of the first cluster system, an identification of the second cluster system, and the first data volume are recorded in the task plan log, wherein the identification of the second cluster system and the first data volume have a correspondence relationship.
8. A method for monitoring network traffic, applied to a task submission device, wherein the task submission device is configured to process task submission in a distributed file system;
the method comprising:
generating a task plan log for a task in response to a submission instruction of the task;
sending the task plan log to a monitoring system, so that the monitoring system counts cross-cluster network traffic in the distributed file system based on the task plan log and outputs the statistical result of the network traffic, so as to indicate in advance the occupancy of network bandwidth in the distributed file system; wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises: analyzing the task plan log to obtain the identification of the task, the time at which the task was acquired, the cross-cluster data reading operation involved in the task, and the data volume corresponding to the cross-cluster data reading operation; and counting the cross-cluster network traffic in the distributed file system based on the obtained data;
wherein the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system, and the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
9. The method of claim 8, wherein the task plan log further comprises a second amount of data that the task instructs the first cluster system to read from a third cluster system and a third amount of data that the task instructs the second cluster system to read from the third cluster system, and wherein the cross-cluster network traffic further comprises network traffic between the first cluster system and the third cluster system and between the second cluster system and the third cluster system.
10. The method of claim 8, wherein an identification of the task, an identification of the first cluster system, an identification of the second cluster system, and the first amount of data are recorded in the task plan log, and wherein the identification of the second cluster system corresponds to the first amount of data.
11. An apparatus for monitoring network traffic, configured in a monitoring system, comprising:
an acquisition unit, configured to acquire a task plan log generated for a task by a task submission device when the task is submitted, wherein the task submission device is configured to process task submission in a distributed file system, and the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system;
a statistics unit, configured to count cross-cluster network traffic in the distributed file system based on the task plan log, wherein the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system; and wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises: analyzing the task plan log to obtain the identification of the task, the time at which the task was acquired, the cross-cluster data reading operation involved in the task, and the data volume corresponding to the cross-cluster data reading operation; and counting the cross-cluster network traffic in the distributed file system based on the obtained data;
and a first output unit, configured to output the statistical result of the network traffic, so as to indicate in advance the occupancy of network bandwidth in the distributed file system.
12. An apparatus for monitoring network traffic, configured in a task submission device, wherein the task submission device is configured to process task submission in a distributed file system;
the apparatus comprising:
a generating unit, configured to generate a task plan log for a task in response to a submission instruction of the task;
a sending unit, configured to send the task plan log to a monitoring system, so that the monitoring system counts cross-cluster network traffic in the distributed file system based on the task plan log and outputs the statistical result of the network traffic, so as to indicate in advance the occupancy of network bandwidth in the distributed file system; wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises: analyzing the task plan log to obtain the identification of the task, the time at which the task was acquired, the cross-cluster data reading operation involved in the task, and the data volume corresponding to the cross-cluster data reading operation; and counting the cross-cluster network traffic in the distributed file system based on the obtained data;
wherein the task plan log includes a first amount of data that the task instructs a first cluster system to read from a second cluster system, and the cross-cluster network traffic includes network traffic between the first cluster system and the second cluster system.
13. A system for monitoring network traffic, comprising: a monitoring system, a task submission device, and a distributed file system; wherein the task submission device is configured to process task submission in the distributed file system; and the distributed file system comprises at least a first cluster system and a second cluster system;
the task submission device is configured to generate a task plan log for a task in response to a submission instruction of the task; wherein the task plan log includes a first amount of data that the task instructs the first cluster system to read from the second cluster system;
the monitoring system is configured to acquire the task plan log generated for the task by the task submission device when the task is submitted, count cross-cluster network traffic in the distributed file system based on the task plan log, and output a statistical result of the network traffic, so as to indicate in advance the occupancy of network bandwidth in the distributed file system; wherein the cross-cluster network traffic comprises network traffic between the first cluster system and the second cluster system; and wherein counting the cross-cluster network traffic in the distributed file system based on the task plan log comprises: analyzing the task plan log to obtain the identification of the task, the time at which the task was acquired, the cross-cluster data reading operation involved in the task, and the data volume corresponding to the cross-cluster data reading operation; and counting the cross-cluster network traffic in the distributed file system based on the obtained data.
14. The system of claim 13, wherein the distributed file system further comprises a third cluster system;
the task plan log further includes a second amount of data that the task instructs the first cluster system to read from the third cluster system and a third amount of data that the task instructs the second cluster system to read from the third cluster system, and the cross-cluster network traffic further includes network traffic between the first cluster system and the third cluster system and between the second cluster system and the third cluster system.
15. The system of claim 13, wherein, to count the cross-cluster network traffic in the distributed file system, the monitoring system is specifically configured to:
identify the first amount of data from the task plan log;
count, according to the first amount of data, the network traffic between the first cluster system and the second cluster system in the distributed file system within a specified time;
wherein the acquisition time of the task plan log falls within the specified time.
16. The system of claim 13, wherein, to count the cross-cluster network traffic in the distributed file system, the monitoring system is specifically configured to:
count, based on the task plan log, the total amount of data that the task indicates is to be read across cluster systems, wherein the total amount of data is used as the total network traffic involved in executing the task.
17. The system of claim 13, wherein the monitoring system is further configured to:
store the statistical result of the network traffic in a data persistence system.
18. The system of claim 13, wherein the monitoring system is further configured to:
output alarm information in response to the statistical result meeting a preset condition.
19. The system of claim 13, wherein an identification of the task, an identification of the first cluster system, an identification of the second cluster system, and the first amount of data are recorded in the task plan log, and wherein the identification of the second cluster system corresponds to the first amount of data.
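
For illustration only, the counting, total-volume, and alarm steps recited in the claims above might be sketched in Python as follows. The record layout (`TaskPlanLog`, `CrossClusterRead`), the function names, the half-open time window, and the "greater than a byte threshold" rule are all assumptions made for this sketch; the claims do not prescribe a log format or a particular preset condition.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CrossClusterRead:
    """Hypothetical record of one planned cross-cluster read."""
    reader: str     # identification of the cluster system that reads
    source: str     # identification of the cluster system read from
    num_bytes: int  # planned amount of data for this read


@dataclass(frozen=True)
class TaskPlanLog:
    """Hypothetical parsed form of a task plan log."""
    task_id: str
    acquired_at: datetime  # time the monitoring system acquired the log
    reads: tuple           # tuple of CrossClusterRead entries


def count_traffic(logs, window_start, window_end):
    """Sum planned bytes per (reader, source) pair over logs whose
    acquisition time falls in [window_start, window_end)."""
    traffic = defaultdict(int)
    for log in logs:
        if window_start <= log.acquired_at < window_end:
            for read in log.reads:
                traffic[(read.reader, read.source)] += read.num_bytes
    return dict(traffic)


def task_total(log):
    """Total planned cross-cluster bytes for a single task."""
    return sum(read.num_bytes for read in log.reads)


def pairs_over_threshold(traffic, threshold_bytes):
    """Assumed preset condition: planned traffic above a byte threshold."""
    return sorted(pair for pair, n in traffic.items() if n > threshold_bytes)


# Made-up sample data; cluster names and byte counts are illustrative.
logs = [
    TaskPlanLog("task-1", datetime(2017, 1, 5, 10, 0),
                (CrossClusterRead("A", "B", 100), CrossClusterRead("A", "C", 50))),
    TaskPlanLog("task-2", datetime(2017, 1, 5, 10, 30),
                (CrossClusterRead("A", "B", 200),)),
    TaskPlanLog("task-3", datetime(2017, 1, 5, 12, 0),
                (CrossClusterRead("B", "C", 999),)),
]
traffic = count_traffic(logs, datetime(2017, 1, 5, 10, 0), datetime(2017, 1, 5, 11, 0))
alarms = pairs_over_threshold(traffic, 250)
```

Because the amounts come from the plan generated at submission time rather than from measured transfers, a tally of this kind can be produced before the task actually moves any data, which is the sense in which the claims describe the bandwidth occupancy as being indicated in advance.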
CN201710007945.XA 2017-01-05 2017-01-05 Method and device for monitoring network flow Active CN108282378B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710007945.XA CN108282378B (en) 2017-01-05 2017-01-05 Method and device for monitoring network flow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710007945.XA CN108282378B (en) 2017-01-05 2017-01-05 Method and device for monitoring network flow

Publications (2)

Publication Number Publication Date
CN108282378A CN108282378A (en) 2018-07-13
CN108282378B CN108282378B (en) 2021-11-09

Family

ID=62800623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710007945.XA Active CN108282378B (en) 2017-01-05 2017-01-05 Method and device for monitoring network flow

Country Status (1)

Country Link
CN (1) CN108282378B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102014137A (en) * 2010-12-13 2011-04-13 哈尔滨工业大学 Common distributed data recording device and method based on HLA (high level architecture)
CN102855299A (en) * 2012-08-16 2013-01-02 上海引跑信息科技有限公司 Method for realizing iterative migration of distributed database without interrupting service
CN103327048A (en) * 2012-03-22 2013-09-25 阿里巴巴集团控股有限公司 Online-application data matching method and device
CN103389946A (en) * 2013-07-16 2013-11-13 中国科学院计算技术研究所 Fragmentization removal method and system
CN103455633A (en) * 2013-09-24 2013-12-18 浪潮齐鲁软件产业有限公司 Method of distributed analysis for massive network detailed invoice data
CN103678520A (en) * 2013-11-29 2014-03-26 中国科学院计算技术研究所 Multi-dimensional interval query method and system based on cloud computing
CN106034160A (en) * 2015-03-19 2016-10-19 阿里巴巴集团控股有限公司 Distributed computing system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10003649B2 (en) * 2015-05-07 2018-06-19 Dell Products Lp Systems and methods to improve read/write performance in object storage applications

Also Published As

Publication number Publication date
CN108282378A (en) 2018-07-13

Similar Documents

Publication Publication Date Title
CN110213068B (en) Message middleware monitoring method and related equipment
CN110232010A (en) A kind of alarm method, alarm server and monitoring server
CN110471821B (en) Abnormality change detection method, server, and computer-readable storage medium
CN112152823B (en) Website operation error monitoring method and device and computer storage medium
US9588813B1 (en) Determining cost of service call
CN112307057A (en) Data processing method and device, electronic equipment and computer storage medium
CN112054915B (en) Processing method, device and system for client exception pre-warning and computing equipment
WO2018122890A1 (en) Log analysis method, system, and program
CN113505044A (en) Database warning method, device, equipment and storage medium
CN107957933B (en) Data replication monitoring method and device
US9009735B2 (en) Method for processing data, computing node, and system
JP2019049802A (en) Failure analysis supporting device, incident managing system, failure analysis supporting method, and program
CN108282378B (en) Method and device for monitoring network flow
CN110138720B (en) Method and device for detecting abnormal classification of network traffic, storage medium and processor
CN111784176A (en) Data processing method, device, server and medium
CN110955587A (en) Method and device for determining equipment to be replaced
CN112000657A (en) Data management method, device, server and storage medium
CN110968475A (en) Method and device for monitoring webpage, electronic equipment and readable storage medium
EP2770447A1 (en) Data processing method, computational node and system
CN115543665A (en) Memory reliability evaluation method and device and storage medium
CN114816915A (en) Link tracking method and device
CN114860432A (en) Method and device for determining information of memory fault
CN113409876A (en) Method and system for positioning fault hard disk
CN114285786A (en) Method and device for constructing network link library
CN111061609A (en) Log monitoring method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant