CN112350854A - Flow fault positioning method, device, equipment and storage medium - Google Patents

Publication number
CN112350854A
Authority
CN
China
Prior art keywords
data
network
data packet
monitoring data
packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011142411.6A
Other languages
Chinese (zh)
Other versions
CN112350854B (en)
Inventor
郭俊
孙姗姗
丁利锋
张越鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202011142411.6A
Publication of CN112350854A
Application granted
Publication of CN112350854B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a traffic fault location method, apparatus, device and storage medium suitable for an IPv6 network. The method comprises: acquiring traffic monitoring data of a network to be detected, collected from a plurality of monitoring points, wherein the monitoring points are deployed on containers and/or virtual machines and/or servers of the network to be detected, and the traffic monitoring data comprises the IPv6 address of the container, virtual machine or server where each monitoring point is located; analyzing the traffic monitoring data based on a preset analysis rule to obtain an analysis result; and locating a fault occurrence point in the network to be detected according to the analysis result, and raising a fault alarm according to the location result. Embodiments of the invention can analyze traffic monitoring data in container networks and virtual private networks, and locate and alarm on fault occurrence points in the network system according to the analysis result.

Description

Flow fault positioning method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of network communication, in particular to a method, a device, equipment and a storage medium for positioning a flow fault.
Background
Internet platforms initially deployed applications directly on physical machines, with only one application per physical machine so that applications would not conflict with each other. As application services grew, dedicating one physical machine to each application wasted too many resources, so multiple applications were deployed on one physical machine. This, however, made management cumbersome and introduced conflicts and mutual interference among the applications.
Virtual machines then emerged: more applications can be deployed on virtual machines, with good isolation. However, the granularity of virtual machine resource isolation is too coarse, which led to the development of containers. A container packages an application together with information such as its environment configuration, can run as a single process, provides a degree of isolation, and controls resource usage at a sufficiently fine granularity.
In today's cloud computing era, virtualization is a key technology for building cloud computing platforms: it helps ensure the platform's performance and reliability and makes the fullest use of its hardware resources. Container technology, as a lightweight virtualization technology, removes intermediate layers and enables efficient, precise control over system resources. However, container networks and virtual private networks have to be deployed inside containers and virtual machines and across various software-defined gateways, and no effective automated network traffic analysis tool for virtualized environments is currently available.
Mature analysis methods exist for traditional networks. The common approach to locating a network fault in an IPv4 network architecture is to capture and analyze packets at every node along the full network link, and to determine from the packets which node failed to establish a connection, and hence whether that node has a network fault. This approach involves complex configuration commands and many different scenarios, and the captured data can only be used for after-the-fact analysis, which seriously reduces troubleshooting efficiency.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. To this end, the first aspect of the present invention provides a traffic fault location method, where the method is applied to an IPv6 network, and the method includes:
acquiring traffic monitoring data of a network to be detected, which is collected from a plurality of monitoring points; a plurality of the monitoring points are deployed on a container and/or a virtual machine and/or a server; the flow monitoring data comprises the IPv6 address of the container, the virtual machine or the server where the monitoring point is located;
analyzing the flow monitoring data based on a preset analysis rule to obtain an analysis result;
and positioning a fault occurrence point in the network to be detected according to the analysis result, and performing fault alarm according to the positioning result.
Further, the acquiring traffic monitoring data of the network traffic to be detected, which is collected from the multiple monitoring points, includes:
capturing a network data packet and extracting a packet header;
and/or extracting system and application logs;
and/or acquiring a task ID in application interaction;
and/or, obtaining system and network performance information.
Further, before analyzing the traffic monitoring data based on the preset analysis rule, the method includes:
and carrying out data preprocessing on the flow monitoring data.
Further, the performing data preprocessing on the flow monitoring data includes:
judging whether a first data packet and a second data packet in the flow monitoring data are the same data packet or not; the second data packet is other data packets except the first data packet in the flow monitoring data;
if so, merging the first data packet and the second data packet to obtain a merged first data packet; the merged first data packet comprises data information of the first data packet, a timestamp of the first data packet and a timestamp of the second data packet;
and executing the step of judging whether the first data packet and the second data packet in the traffic monitoring data are the same data packet or not until the traffic monitoring data do not have the data packet which is the same as the first data packet.
Further, after the determining whether the first data packet and the second data packet in the traffic monitoring data are the same data packet, the method further includes:
if not, storing the first data packet; or,
if not, the first data packet and the related data corresponding to the first data packet are synchronously stored.
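The merge loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Packet` fields, the `packet_key` definition of "the same data packet" (source, destination and a header digest), and the function name `merge_duplicates` are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    # Hypothetical record shape; the source does not define packet fields.
    src: str
    dst: str
    header_digest: str
    timestamp: float
    extra_timestamps: list = field(default_factory=list)


def packet_key(p: Packet) -> tuple:
    # Assumption: two captures are "the same packet" when these fields match.
    return (p.src, p.dst, p.header_digest)


def merge_duplicates(packets):
    """Merge captures of the same packet, keeping every timestamp."""
    merged = {}
    for p in packets:
        key = packet_key(p)
        if key in merged:
            # Same packet seen again: keep only its timestamp.
            merged[key].extra_timestamps.append(p.timestamp)
        else:
            # First sighting: store the packet data and its own timestamp.
            merged[key] = Packet(p.src, p.dst, p.header_digest, p.timestamp)
    return list(merged.values())
```

Each merged record keeps the data of the first capture plus the timestamps of every later capture, which mirrors the "merged first data packet" described in the text.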
Further, the performing data preprocessing on the flow monitoring data includes:
extracting the length of an IP address in the flow monitoring data;
when the length of the IP address is 128 bits, triggering the next step;
and when the length of the IP address is 32 bits, discarding the flow monitoring data corresponding to the IP address.
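The address-length check can be illustrated with Python's standard `ipaddress` module. This is only a sketch: the record layout (a dict with an `"ip"` key) and the function name `keep_ipv6_only` are assumptions made for the example.

```python
import ipaddress


def keep_ipv6_only(records):
    """Keep records whose address is 128 bits long; drop 32-bit (IPv4) ones."""
    kept = []
    for record in records:
        # "ip" is a hypothetical field name for the address in a record.
        addr = ipaddress.ip_address(record["ip"])
        if addr.max_prefixlen == 128:
            kept.append(record)  # IPv6 address: pass on to the next step
        # A 32-bit IPv4 address falls through and the record is discarded.
    return kept
```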
Further, the method also includes: analyzing the flow monitoring data based on a preset analysis rule to obtain intermediate data;
and displaying the flow monitoring data and/or the intermediate data and/or the analysis result while executing the step of analyzing the flow monitoring data based on the preset analysis rule.
Further, the analyzing the preprocessed traffic monitoring data based on a preset analysis rule includes:
carrying out inductive statistics on the flow monitoring data;
generating a baseline standard and an alarm threshold based on the statistical result; wherein the alarm threshold is higher than the baseline standard;
and generating alarm information when the traffic monitoring data suddenly exceeds an alarm threshold or is obviously lower than the baseline standard.
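One possible way to derive a baseline and an alarm threshold from historical samples is sketched below. The source does not say how either value is computed, so the choices here are assumptions: the baseline is the mean of the history, the threshold is the baseline plus `k` standard deviations, and "obviously lower" is taken to mean below `low_ratio` times the baseline. All names are hypothetical.

```python
import statistics


def build_baseline(history, k=3.0):
    """Return (baseline, alarm_threshold) from historical traffic samples."""
    baseline = statistics.mean(history)
    spread = statistics.pstdev(history)
    # Assumption: the threshold sits k standard deviations above the baseline,
    # so it is higher than the baseline whenever the history is not constant.
    return baseline, baseline + k * spread


def check_sample(value, baseline, threshold, low_ratio=0.5):
    """Return an alarm description, or None if the sample looks normal."""
    if value > threshold:
        return "above alarm threshold"
    # Assumption: "obviously lower" means below low_ratio * baseline.
    if value < low_ratio * baseline:
        return "below baseline"
    return None
```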
The second aspect of the present invention provides a traffic fault location apparatus, which is suitable for an IPv6 network, and includes:
the data acquisition module is used for acquiring the traffic monitoring data of the network to be detected, which are collected from a plurality of monitoring points; a plurality of the monitoring points are deployed on a container and/or a virtual machine and/or a server; the flow monitoring data comprises the IPv6 address of the container, the virtual machine or the server where the monitoring point is located;
the data analysis module is used for analyzing the flow monitoring data based on a preset analysis rule to obtain an analysis result;
and the fault alarm module is used for positioning a fault occurrence point existing in the network to be detected according to the analysis result and carrying out fault alarm according to the positioning result.
A third aspect of the present invention provides an apparatus, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the method for locating a traffic fault according to the first aspect of the present invention.
A fourth aspect of the present invention provides a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the method for locating a traffic fault according to the first aspect of the present invention.
The implementation of the invention has the following beneficial effects:
1. The embodiment of the invention uses a systematic method to analyze traffic in the container network and the virtual private network, and judges whether the network system has a fault by analyzing and processing data packets.
2. The embodiment of the invention collects, analyzes and prejudges the log information in the container network and the virtual private network, and judges whether the network system has faults or not according to the information in the log.
3. The abnormal result of the system analysis of the embodiment of the invention can inform the relevant system management personnel in the modes of warning, visual display and the like, thereby facilitating the system management personnel to process the fault in the system at the first time.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a traffic fault location method according to an embodiment of the present invention;
Fig. 2 is a design architecture diagram of a flow fault locating device provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a method for locating a traffic fault according to an embodiment of the present invention;
Fig. 4 is a flowchart of data preprocessing for traffic monitoring data according to an embodiment of the present invention;
Fig. 5 is a block diagram of a flow fault locating device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention. Examples of the embodiments are illustrated in the accompanying drawings, wherein like reference numerals refer throughout to the same or similar elements or to elements having the same or similar function.
The following explains some of the terms used in the embodiments of the present invention:
(1) IPv6
IPv6 is the abbreviation of Internet Protocol Version 6, the next-generation IP protocol designed by the Internet Engineering Task Force (IETF) to replace IPv4. Its address space is so large that it is often said an address could be assigned to every grain of sand in the world. The biggest problem of IPv4 is its shortage of network address resources, which severely restricts the application and development of the Internet. IPv6 not only solves the shortage of network address resources but also removes the obstacle to connecting a wide variety of access devices to the Internet.
An IPv6 address is 128 bits long, 4 times the length of an IPv4 address. The dotted-decimal format of IPv4 is therefore no longer applicable, and hexadecimal notation is used instead. IPv6 has 3 representation methods: colon-hexadecimal notation, zero-compression notation, and embedded IPv4 address notation.
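The three IPv6 representations can be illustrated with Python's standard `ipaddress` module. The addresses below come from the documentation ranges and are chosen purely for illustration; they do not appear in the source.

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:0000:0000:0001")

# Full form: eight groups of four hexadecimal digits separated by colons.
full_form = addr.exploded

# Zero-compressed form: leading zeros dropped and one run of zero groups
# replaced by "::".
short_form = addr.compressed

# Embedded IPv4 form: the low 32 bits carry an IPv4 address.
mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")
embedded_v4 = mapped.ipv4_mapped

print(full_form)    # 2001:0db8:0000:0000:0000:0000:0000:0001
print(short_form)   # 2001:db8::1
print(embedded_v4)  # 192.0.2.1
```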
(2) Container network
A Linux container is a set of processes that are isolated from the rest of the system. All files required to run these processes are provided by a separate image, and because containers are built from the same image version, container technology offers high portability and consistency. As a result, containers are much quicker to use than development pipelines that rely on replicating traditional test environments. Containers are also widely used and easy to work with.
The container network is a network interconnecting containers, and can connect a plurality of independent containers together to realize mutual scheduling and access among container resources.
Examples
Fig. 1 is a flowchart of a flow fault location method provided by an embodiment of the present invention. This specification provides the method's operational steps as described in the embodiment or the flowchart, but more or fewer steps may be included based on routine or non-inventive effort. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in Fig. 1, the flow fault location method of this embodiment includes the following steps:
S101: acquiring traffic monitoring data of a network to be detected, which is collected from a plurality of monitoring points;
Specifically, the network to be detected is an IPv6 network, which may be a container network, a virtual private network, or the like. The monitoring points are deployed on a container and/or a virtual machine and/or a server of the network to be detected, and the flow monitoring data comprises the IPv6 address of the container, virtual machine or server where the monitoring point is located. In one embodiment, the network to be detected comprises a container and a server, with monitoring points deployed on both the container and the server. In another embodiment, the network to be detected comprises a container, a virtual machine and a server, with monitoring points deployed on the container and the virtual machine. It should be noted that the deployment positions and the number of monitoring points may be set according to the actual fault location analysis requirements, which is not limited in this embodiment.
Specifically, the traffic monitoring data includes, but is not limited to, the IPv6 address of the container, virtual machine, or server where the monitoring point is located. Because IPv6 is used, the address of the container, virtual machine or server device where the monitoring point is located is guaranteed to be unique.
Fig. 2 is a design architecture diagram of a flow fault location device according to an embodiment of the present invention. As shown in Fig. 2, agents are deployed on containers, virtual machines and physical servers. An agent is a small program installed at monitoring points such as a CVM, a container, a host machine, a BMS bare metal server or an elastic network card, and has only limited analysis capability.
Specifically, the agent is responsible for monitoring and capturing network data packets, extracting packet headers, extracting system and application logs, and acquiring information such as task IDs in application interaction and system and network performance. The agent may periodically send this information to the data collection and aggregation component. Users can also view the collected information relevant to them through the agents on the containers, virtual machines and physical servers, and judge whether an application or the system has a problem by comparing all the information collected at the same point in time.
Preferably, in the status information that the agent routinely sends to the data collection and aggregation component, the network packet part contains only the packet header, to reduce the impact of captured packet content on the capacity of the production network. If necessary, this can be adjusted through the management component so that full data packets are sent instead.
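Header-only reporting can be sketched by parsing just the fixed 40-byte IPv6 header and ignoring the payload. The sketch assumes the captured bytes start at the IPv6 header (no link-layer framing) and ignores extension headers; the function name `extract_ipv6_header` and the returned dict layout are hypothetical.

```python
import ipaddress
import struct

IPV6_HEADER_LEN = 40  # size of the fixed IPv6 header in bytes


def extract_ipv6_header(packet: bytes) -> dict:
    """Parse the fixed IPv6 header and leave the payload untouched."""
    if len(packet) < IPV6_HEADER_LEN:
        raise ValueError("packet shorter than an IPv6 header")
    # First 8 bytes: version/traffic class/flow label (32 bits),
    # payload length (16 bits), next header (8 bits), hop limit (8 bits).
    first_word, payload_len, next_header, hop_limit = struct.unpack(
        "!IHBB", packet[:8]
    )
    return {
        "version": first_word >> 28,
        "payload_length": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(packet[8:24])),
        "dst": str(ipaddress.IPv6Address(packet[24:40])),
    }
```

Only these few fields would be forwarded, rather than the whole packet, which keeps the monitoring traffic small.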
S102: analyzing the flow monitoring data based on a preset analysis rule to obtain an analysis result;
Specifically, the preset analysis rule may be determined according to the actual analysis and monitoring requirements; different analysis and monitoring requirements correspond to different analysis rules.
Optionally, big data analysis technology is used to analyze the traffic monitoring data based on the preset analysis rule, mainly to solve the problem of locating north-south faults in the network (that is, traffic faults between clients and the data center).
Optionally, the traffic monitoring data is analyzed based on a preset analysis rule to obtain intermediate data and an analysis result. The intermediate data includes, but is not limited to, a change value of network traffic, a change value of interface capacity, a change value of delay, and a task flow statistic result.
In a specific scenario, the flow of the data analysis component performing intelligent analysis on the received traffic monitoring data is as follows:
a. Inductive statistics are performed on the changes in network traffic, interface capacity and delay in the flow monitoring data, and a baseline standard is established through intelligent analysis to form an alarm threshold.
Analyzing the traffic monitoring data further comprises: predictions are made about future capacity changes, and the prediction results can be used for further determining whether other undetected fault occurrence points exist in the current state, or the prediction results can also be used for predicting future impending fault alarms.
And sending the intermediate data and the analysis result obtained by analyzing the flow monitoring data to the visual display component in real time.
b. And aiming at the tracking of the task ID, establishing a baseline standard through intelligent analysis, forming an alarm threshold value and sending a task flow statistical result to a visual display component.
c. For log analysis, display and alarm can be performed according to predefined log keyword levels, and relevant information is sent to an external alarm system in the form of short messages and the like.
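Keyword-level log analysis might look like the sketch below. The source only states that keyword levels are predefined, so the keyword table, the numeric levels and the function names are all assumptions made for illustration.

```python
# Hypothetical keyword-to-level table; higher numbers are more severe.
KEYWORD_LEVELS = {"fatal": 3, "error": 2, "timeout": 2, "warn": 1}


def classify_log_line(line):
    """Return the highest level of any predefined keyword in the line (0 if none)."""
    lowered = line.lower()
    return max(
        (level for keyword, level in KEYWORD_LEVELS.items() if keyword in lowered),
        default=0,
    )


def lines_to_alarm(lines, min_level=2):
    """Select the log lines severe enough to be forwarded as alarms."""
    return [line for line in lines if classify_log_line(line) >= min_level]
```

Lines below `min_level` could still be shown on the display component without being sent to the external alarm system.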
The data analysis component also has the functions of alarm suppression, duplicate removal, maintenance setting and the like, can perform short-time alarm suppression on a large number of repeated alarms, and can set the system not to send alarm short messages and report alarm information during system maintenance. In one embodiment, after performing fault warning according to the positioning result, the method further includes:
judging whether the fault alarm meets an alarm suppression condition;
if yes, executing alarm suppression operation on the fault alarm according to a preset rule;
judging whether the fault alarm in the alarm suppression state meets an alarm suppression cancellation condition or not;
and if so, executing alarm suppression cancelling operation on the fault alarm.
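The suppression logic can be sketched as a small class. The source does not define the suppression or cancellation conditions, so this example assumes a simple time window: a repeat of the same alarm inside the window is suppressed, and suppression is lifted once the window has passed. The class name, the window length and the maintenance set are hypothetical.

```python
class AlarmSuppressor:
    """Suppress repeated alarms inside a time window (assumed rule)."""

    def __init__(self, window_seconds=300.0):
        self.window = window_seconds
        self.last_sent = {}       # alarm key -> time the alarm was last forwarded
        self.maintenance = set()  # alarm keys currently under maintenance

    def should_send(self, key, now):
        if key in self.maintenance:
            return False  # no alarms are sent during maintenance
        last = self.last_sent.get(key)
        if last is not None and now - last < self.window:
            return False  # suppression condition met: repeat inside the window
        # First occurrence, or the window has passed: suppression is cancelled.
        self.last_sent[key] = now
        return True
```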
S103: and positioning a fault occurrence point in the network to be detected according to the analysis result, and performing fault alarm according to the positioning result.
The fault alarm can be performed by sending alarm results to the user side or other control components in various forms such as short messages, audio, video, characters, pictures and the like.
To reduce the volume of alarm information and make it easier for the user to analyze and locate the problem, alarm suppression can be applied according to the analysis result and the preset alarm suppression rules.
Further, while step S102 is being executed, the method also includes the following step:
and displaying the flow monitoring data and/or the intermediate data and/or the analysis result.
The visualization display system is responsible for the page display of monitoring, alarming and forecasting data, and the display form comprises a pie chart, a curve, a topology and the like. The display information can help an administrator to visually check the operation condition of the system, and the fault node can be displayed on the visual display system at the first time.
In order to meet personal preferences, use habits and the like of different tenants, the visual display interface can be adjusted according to interface adjustment requirement information input by the tenants, wherein the interface adjustment requirement information comprises but is not limited to changing the style of the display interface, changing the position of the functional component and the like.
Specifically, step S102 includes the steps of:
carrying out inductive statistics on the flow monitoring data;
generating a baseline standard and an alarm threshold based on the statistical result; wherein the alarm threshold is higher than the baseline standard;
Optionally, the baseline standard includes, but is not limited to, total traffic, maximum instantaneous traffic, minimum instantaneous traffic, data flow direction and duration; the traffic data used for baseline analysis should be the traffic data of each service module of the automation system for a specific IP address, port and protocol.
Alarm information is generated when the flow monitoring data suddenly exceeds the alarm threshold or is significantly below the baseline standard.
When the flow monitoring data is the task ID in the application interaction, a sudden change in the task flow path automatically generates alarm information.
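Detecting a task flow path change can be sketched as a comparison against a recorded reference path. The source does not describe how paths are stored or compared, so the dictionary of known paths, the use of the first observed path as the reference, and the function name `path_changed` are assumptions.

```python
def path_changed(flow_name, observed_path, known_paths):
    """Return True when a flow's hop sequence differs from the recorded one."""
    expected = known_paths.get(flow_name)
    if expected is None:
        # Assumption: the first observed path becomes the reference path.
        known_paths[flow_name] = list(observed_path)
        return False
    return list(observed_path) != expected
```

A `True` result would be the trigger for the alarm information mentioned in the text.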
For example, in a fault scenario where the application response times out, the flow fault location procedure is as follows:
a. The agent sends routine flow monitoring data to the data analysis component through the data collection and aggregation component.
b. After receiving the data, the data analysis component performs intelligent analysis. It can cross-analyze indicators such as application transaction chain breakpoints, application processing delay exceeding a threshold, and unchanged interface bandwidth, and finally gives a fault point location result. The result is sent to the relevant responsible person by alarm short message, and the alarm result is synchronously sent to the management and control component.
c. The management and control component receives the alarm and checks whether an automatic emergency operation is defined for it; if so, it executes the automatic emergency operation according to the emergency operation script, such as deleting the container or restarting the application at the fault point.
d. The management and control component synchronously returns the processing state to the data analysis component, the data analysis component carries out maintenance operation on related alarms, and alarm short messages of emergency processing operation are synchronously sent to related management personnel.
e. After the emergency operation is finished, the management and control component synchronously returns the processing result to the data analysis component, the data analysis component sends a short message to a related manager for the related processing result, and the maintenance state of related alarm is synchronously cancelled.
For example, in a fault scenario caused by an application bug, the flow fault location procedure is as follows:
a. The agent sends routine flow monitoring data to the data analysis component through the data collection and aggregation component. After receiving the data, the data analysis component performs intelligent analysis; it can cross-analyze indicators such as application transaction chain breakpoints, application processing delay and application log keywords, and finally gives a fault point location result. The result is sent to the relevant responsible person by alarm short message.
b. Before the software upgrade, the application manager can, through the management and control component, have the data analysis component place the related alarms in maintenance state.
e. After the software upgrade is finished, the management and control component has the data analysis component cancel the maintenance state of the related alarms.
Fig. 3 is a flowchart of a flow fault location method provided in an embodiment of the present invention. In order to improve the efficiency of fault analysis and save time, this embodiment provides a flow fault location method which, as shown in fig. 3, includes the following steps:
S201: acquiring traffic monitoring data of a network to be detected, which is collected from a plurality of monitoring points; the plurality of monitoring points are deployed on a container and/or a virtual machine and/or a server of the network to be detected;
S202: carrying out data preprocessing on the traffic monitoring data to obtain preprocessed traffic monitoring data;
The data collection and summarization component collects and summarizes, in a unified manner, the data sent by the agents of the whole network. In order to improve the efficiency of fault analysis and save time, during summarization the component performs deduplication, merges identical data packets, and retains only necessary information such as timestamps, traffic status codes, process information and IP addresses. Meanwhile, the data collection and summarization component can send summary information of the data to the data analysis component in real time.
S203: analyzing the preprocessed traffic monitoring data based on a preset analysis rule to obtain an analysis result;
Optionally, the traffic monitoring data is analyzed based on the preset analysis rule to obtain intermediate data and an analysis result.
S204: positioning a fault occurrence point in the network to be detected according to the analysis result, and performing fault alarm according to the positioning result.
Fig. 4 is a flowchart of the data preprocessing of traffic monitoring data according to an embodiment of the present invention. As shown in fig. 4, the data preprocessing of the traffic monitoring data includes the following steps:
S301: judging whether a first data packet and a second data packet in the traffic monitoring data are the same data packet; the second data packet is any data packet in the traffic monitoring data other than the first data packet;
S302: if the first data packet and the second data packet are the same data packet, merging the first data packet and the second data packet to obtain a merged first data packet; the merged first data packet comprises the data information of the first data packet, the timestamp of the first data packet and the timestamp of the second data packet;
S303: repeating the step of judging whether the first data packet and the second data packet in the traffic monitoring data are the same data packet until no data packet identical to the first data packet remains in the traffic monitoring data.
The data collection and summarization component sends the processed data to the data storage component for subsequent query of the original data.
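The merging of steps S301 to S303 can be sketched as below. The description does not define when two data packets count as "the same", so this sketch assumes they are the same when all retained fields other than the timestamp are equal; the `PacketRecord` type, its field names and `merge_duplicates` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class PacketRecord:
    """Hypothetical summary of one captured packet (header fields only)."""
    src_ip: str
    dst_ip: str
    status_code: int
    process: str
    timestamps: List[float] = field(default_factory=list)

    def identity(self) -> Tuple[str, str, int, str]:
        # Assumption: packets are "the same" when every non-timestamp field matches.
        return (self.src_ip, self.dst_ip, self.status_code, self.process)


def merge_duplicates(records: Iterable[PacketRecord]) -> List[PacketRecord]:
    """Merge identical packets, keeping one record with all their timestamps."""
    merged: Dict[Tuple[str, str, int, str], PacketRecord] = {}
    for rec in records:
        key = rec.identity()
        if key in merged:
            # S302: keep the first packet's data, append the duplicate's timestamps.
            merged[key].timestamps.extend(rec.timestamps)
        else:
            # Copy so that the caller's input records are left unmodified.
            merged[key] = PacketRecord(
                rec.src_ip, rec.dst_ip, rec.status_code, rec.process,
                list(rec.timestamps),
            )
    # S303: every record has been compared, so no duplicates remain.
    return list(merged.values())
```

Grouping by a dictionary key visits each record once, instead of comparing every pair of packets as a literal reading of S301 to S303 would; the result is the same set of merged packets under the stated equality assumption.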
In one embodiment, step S301 is followed by the following step: if the first data packet and the second data packet are not the same data packet, storing the first data packet. Preferably, the data collection and summarization component supports multi-tenancy, and the data can be stored in a designated data storage system according to the tenant's selection.
In one embodiment, step S301 is followed by the following step: if the first data packet and the second data packet are not the same data packet, synchronously storing the first data packet and the related data corresponding to the first data packet. For example, the related data can be synchronously stored in a designated data storage system, such as COS, according to the tenant's selection.
Optionally, the data preprocessing of the traffic monitoring data includes the following steps:
extracting the length of an IP address in the traffic monitoring data;
when the length of the IP address is 128 bits, triggering the next step;
and when the length of the IP address is 32 bits, discarding the traffic monitoring data corresponding to the IP address. Screening the traffic monitoring data in this way removes data that does not meet the requirements, which reduces the workload of traffic monitoring data analysis and speeds up obtaining the analysis result to a certain extent.
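The address-length screening above can be sketched with Python's standard `ipaddress` module, whose address objects expose the bit width as `max_prefixlen` (32 for IPv4, 128 for IPv6). The record layout (a dict with an `"ip"` key) and the function names are assumptions made for this illustration.

```python
import ipaddress
from typing import Dict, Iterable, List


def address_bits(addr: str) -> int:
    """Return the bit length of an IP address: 32 for IPv4, 128 for IPv6."""
    return ipaddress.ip_address(addr).max_prefixlen


def keep_ipv6_only(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep records whose address is 128 bits long; discard 32-bit ones."""
    kept = []
    for rec in records:
        if address_bits(rec["ip"]) == 128:
            kept.append(rec)  # passed on to the next step (analysis)
        # 32-bit (IPv4) records are dropped and never reach the analysis step
    return kept
```

For instance, `keep_ipv6_only([{"ip": "2001:db8::1"}, {"ip": "192.0.2.10"}])` keeps only the first record.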
The embodiment of the invention provides a flow fault positioning method, a device, equipment and a storage medium suitable for a container network and a virtual private network.
It is to be understood that the invention is not limited by the illustrated ordering of acts, as some steps may occur in other orders or concurrently with other steps, in accordance with the invention.
Fig. 5 is a block diagram of a flow fault location device according to an embodiment of the present invention, and specifically, as shown in fig. 5, the device includes the following modules:
a data obtaining module 401, configured to obtain traffic monitoring data of a network to be detected collected from a plurality of monitoring points;
Specifically, the network to be detected may be a container network, a virtual private network, or the like; the container network may be an IPv4 container network or an IPv6 container network; and the plurality of monitoring points are deployed on a container and/or a virtual machine and/or a server of the network to be detected.
Specifically, the traffic monitoring data includes, but is not limited to, the IPv6 address of the container, virtual machine or server where the monitoring point is located. Because IPv6 technology is adopted, the uniqueness of the address of the container, virtual machine or server device where the monitoring point is located can be ensured.
A data analysis module 402, configured to analyze the traffic monitoring data based on a preset analysis rule to obtain an analysis result;
and a fault alarm module 403, configured to locate a fault occurrence point existing in the network to be detected according to the analysis result, and perform fault alarm according to the location result.
In one embodiment, the device further includes a data preprocessing module, configured to perform data preprocessing on the traffic monitoring data before the traffic monitoring data is analyzed based on the preset analysis rule, so as to obtain preprocessed traffic monitoring data.
Specifically, the data preprocessing module comprises:
a judging module, configured to judge whether a first data packet and a second data packet in the traffic monitoring data are the same data packet; the second data packet is any data packet in the traffic monitoring data other than the first data packet;
a merging module, configured to merge the first data packet and the second data packet when they are the same data packet, to obtain a merged first data packet; the merged first data packet comprises the data information of the first data packet, the timestamp of the first data packet and the timestamp of the second data packet;
and a repeating module, configured to repeat the step of judging whether the first data packet and the second data packet in the traffic monitoring data are the same data packet until no data packet identical to the first data packet remains in the traffic monitoring data, and then to proceed to the next step.
Optionally, the data preprocessing module includes:
the extraction module is used for extracting the length of the IP address in the flow monitoring data;
the triggering module is used for triggering the data analysis module when the length of the IP address is 128 bits;
and the discarding module is used for discarding the flow monitoring data corresponding to the IP address when the length of the IP address is 32 bits.
Optionally, the apparatus further includes a data storage module, configured to store a first data packet in the flow monitoring data when the first data packet is not the same as a second data packet;
optionally, the apparatus further includes a data storage module, configured to store the first data packet and the related data corresponding to the first data packet synchronously when the first data packet and the second data packet in the flow monitoring data are not the same data packet.
Optionally, the device further includes a visualization display module, and the data analysis module 402 is configured to analyze the flow monitoring data based on a preset analysis rule to obtain intermediate data and an analysis result; the visual display module is used for displaying the flow monitoring data and/or the intermediate data and/or the analysis result in real time when analyzing the flow monitoring data based on the preset analysis rule.
Specifically, the data analysis module 402 includes:
a statistical module, configured to carry out inductive statistics on the traffic monitoring data;
a standard and threshold generating module, configured to generate a baseline standard and an alarm threshold based on the statistical result, wherein the alarm threshold is higher than the baseline standard;
and an alarm information generating module, configured to generate alarm information when the traffic monitoring data suddenly exceeds the alarm threshold or falls significantly below the baseline standard.
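One possible reading of the three modules above is sketched below. The description does not say how the baseline standard or the alarm threshold is computed, nor what counts as "significantly below", so everything numeric here is an assumption: the baseline is taken as the mean of historical samples, the alarm threshold as the mean plus `k` population standard deviations (with a minimum margin so that it stays above the baseline), and "significantly below" as less than `low_ratio` times the baseline. All names and parameters are hypothetical.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def build_limits(
    history: Sequence[float], k: float = 3.0, min_margin: float = 1.0
) -> Tuple[float, float]:
    """Derive (baseline, alarm_threshold) from historical traffic samples."""
    baseline = mean(history)
    # The margin keeps the threshold strictly above the baseline even when
    # the history is perfectly flat (standard deviation of zero).
    threshold = baseline + max(k * pstdev(history), min_margin)
    return baseline, threshold


def check_sample(
    value: float, baseline: float, threshold: float, low_ratio: float = 0.5
) -> Optional[str]:
    """Return an alarm message for an abnormal sample, or None if normal."""
    if value > threshold:
        return "traffic above alarm threshold"
    if value < baseline * low_ratio:
        return "traffic far below baseline"
    return None
```

With the history `[100, 110, 90, 100]` the baseline is 100 and the threshold is a little above 121, so a sample of 130 or of 40 would raise an alarm while a sample of 105 would not.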
Embodiments of the present invention also provide an apparatus comprising a processor and a memory, the memory having at least one instruction, at least one program, set of codes, or set of instructions stored therein, the at least one instruction, the at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement a method of traffic fault localization as in the method embodiments.
Embodiments of the present invention further provide a storage medium, where the storage medium may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing the traffic fault location method in the method embodiment, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the traffic fault location method provided in the above method embodiment.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
As can be seen from the above embodiments of the traffic fault locating method, apparatus, device and storage medium provided by the present invention, the embodiments adopt a systematic approach: traffic in the container network and the virtual private network is monitored and collected, and data packets are analyzed and processed to determine whether a fault exists in the container network system; log information in the container network and the virtual private network is collected, analyzed and pre-judged to determine, from the information in the logs, whether a network system fault exists in the container network or the virtual private network; and abnormal results of the system analysis can be notified to the relevant system managers by alarm, visual display and the like, so that they can handle faults in the system promptly.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (11)

1. A traffic fault positioning method is applicable to an IPv6 network, and comprises the following steps:
acquiring traffic monitoring data of a network to be detected, which is collected from a plurality of monitoring points; the monitoring points are deployed on a container and/or a virtual machine and/or a server of the network to be detected; the flow monitoring data comprises the IPv6 address of the container, the virtual machine or the server where the monitoring point is located;
analyzing the flow monitoring data based on a preset analysis rule to obtain an analysis result;
and positioning a fault occurrence point in the network to be detected according to the analysis result, and performing fault alarm according to the positioning result.
2. The method according to claim 1, wherein the obtaining traffic monitoring data of the network to be detected collected from a plurality of monitoring points comprises:
capturing a network data packet and extracting a packet header;
and/or extracting system and application logs;
and/or acquiring a task ID in application interaction;
and/or, obtaining system and network performance information.
3. The method according to claim 1, wherein before analyzing the traffic monitoring data based on the preset analysis rule, the method comprises:
and carrying out data preprocessing on the flow monitoring data.
4. The method of claim 3, wherein the data preprocessing of the traffic monitoring data comprises:
judging whether a first data packet and a second data packet in the flow monitoring data are the same data packet or not; the second data packet is other data packets except the first data packet in the flow monitoring data;
if so, merging the first data packet and the second data packet to obtain a merged first data packet; the merged first data packet comprises data information of the first data packet, a timestamp of the first data packet and a timestamp of the second data packet;
and executing the step of judging whether the first data packet and the second data packet in the traffic monitoring data are the same data packet or not until the traffic monitoring data do not have the data packet which is the same as the first data packet.
5. The method of claim 4, wherein after determining whether the first packet and the second packet in the traffic monitoring data are the same packet, the method further comprises:
if not, storing the first data packet; or,
if not, the first data packet and the related data corresponding to the first data packet are synchronously stored.
6. The method of claim 3, wherein the data preprocessing of the traffic monitoring data comprises:
extracting the length of an IP address in the flow monitoring data;
when the length of the IP address is 128 bits, triggering the next step;
and when the length of the IP address is 32 bits, discarding the flow monitoring data corresponding to the IP address.
7. The method of claim 1, further comprising:
analyzing the flow monitoring data based on a preset analysis rule to obtain intermediate data;
and displaying the flow monitoring data and/or the intermediate data and/or the analysis result while executing the step of analyzing the flow monitoring data based on the preset analysis rule.
8. The method according to claim 1, wherein the analyzing the traffic monitoring data based on the preset analysis rule comprises:
carrying out inductive statistics on the flow monitoring data;
generating a baseline standard and an alarm threshold based on the statistical result; wherein the alarm threshold is higher than the baseline standard;
and generating alarm information when the traffic monitoring data suddenly exceeds an alarm threshold or is obviously lower than the baseline standard.
9. A traffic fault location device, wherein the device is suitable for IPv6 network, the device includes:
the data acquisition module is used for acquiring the traffic monitoring data of the network to be detected, which are collected from a plurality of monitoring points; the monitoring points are deployed on a container and/or a virtual machine and/or a server of the network to be detected; the flow monitoring data comprises the IPv6 address of the container, the virtual machine or the server where the monitoring point is located;
the data analysis module is used for analyzing the flow monitoring data based on a preset analysis rule to obtain an analysis result;
and the fault alarm module is used for positioning a fault occurrence point existing in the network to be detected according to the analysis result and carrying out fault alarm according to the positioning result.
10. An apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the traffic fault localization method according to any one of claims 1-8.
11. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of flow fault location according to any one of claims 1 to 8.
CN202011142411.6A 2020-10-22 2020-10-22 Flow fault positioning method, device, equipment and storage medium Active CN112350854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011142411.6A CN112350854B (en) 2020-10-22 2020-10-22 Flow fault positioning method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112350854A true CN112350854A (en) 2021-02-09
CN112350854B CN112350854B (en) 2022-11-18

Family

ID=74359829

Country Status (1)

Country Link
CN (1) CN112350854B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105635035A (en) * 2014-10-27 2016-06-01 青岛金讯网络工程有限公司 Method for monitoring flow of virtual machine
CN107111509A (en) * 2014-10-26 2017-08-29 微软技术许可有限责任公司 Method for the virtual machine (vm) migration in computer network
WO2018001326A1 (en) * 2016-06-29 2018-01-04 中兴通讯股份有限公司 Method and device for acquiring fault information
US20180351782A1 (en) * 2017-05-31 2018-12-06 Cisco Technology, Inc. Associating network policy objects with specific faults corresponding to fault localizations in large-scale network deployment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113179182A (en) * 2021-04-27 2021-07-27 中国联合网络通信集团有限公司 Network supervision method, device, equipment and storage medium
CN113179182B (en) * 2021-04-27 2022-11-22 中国联合网络通信集团有限公司 Network supervision method, device, equipment and storage medium
CN114422337A (en) * 2022-03-09 2022-04-29 中国建设银行股份有限公司 Method and related device for network packet capturing and fault positioning
CN115190001A (en) * 2022-07-22 2022-10-14 天翼云科技有限公司 Network abnormal state analysis method and device
CN115190001B (en) * 2022-07-22 2024-03-08 天翼云科技有限公司 Network abnormal state analysis method and device
CN116886517A (en) * 2023-09-04 2023-10-13 江苏点石乐投科技有限公司 Alarm system and method based on flow data
CN116886517B (en) * 2023-09-04 2023-11-24 江苏点石乐投科技有限公司 Alarm system and method based on flow data

Also Published As

Publication number Publication date
CN112350854B (en) 2022-11-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant