CN117596133B - Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data - Google Patents

Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Info

Publication number
CN117596133B
Authority
CN
China
Prior art keywords
data
time
service
value
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410069535.8A
Other languages
Chinese (zh)
Other versions
CN117596133A (en)
Inventor
吴博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Zhongce Information Technology Co ltd
Original Assignee
Shandong Zhongce Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Zhongce Information Technology Co ltd filed Critical Shandong Zhongce Information Technology Co ltd
Priority to CN202410069535.8A priority Critical patent/CN117596133B/en
Publication of CN117596133A publication Critical patent/CN117596133A/en
Application granted granted Critical
Publication of CN117596133B publication Critical patent/CN117596133B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Abstract

The invention discloses a business portrait and anomaly monitoring system and monitoring method based on multidimensional data, belonging to the technical field of data anomaly monitoring. The system comprises an active detection system, a passive data analysis system, and a multidimensional data analysis platform. The monitoring method comprises: data acquisition; data preprocessing; feature extraction; business portrait construction; and setting a step length parameter, stepping the current time attribute forward and backward to obtain new time attributes, and repeating the analysis to dynamically monitor the basic states of the business portraits for different time attributes, thereby obtaining the business portrait over a dynamically monitored time interval. By collecting multidimensional network traffic data, the invention reflects user behavior comprehensively and accurately and improves the accuracy of the business portrait.

Description

Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data
Technical Field
The invention relates to a business portrait and anomaly monitoring system and monitoring method based on multidimensional data, and belongs to the technical field of data anomaly monitoring.
Background
With the development of network technology, network services are becoming increasingly numerous and varied, and network traffic data increasingly complex. Traditional network management can no longer meet the need to identify the behavioral characteristics of network services quickly and accurately and to manage them intelligently.
Existing network technologies fall mainly into the following types:
(1) Network device management techniques. Network devices are managed and monitored through network traffic and the SNMP network management protocol to obtain data such as device status and performance.
(2) Network security analysis techniques. Network traffic data are captured and compared against a virus library or an attack library to determine whether the traffic constitutes an attack or virus propagation, which is then blocked.
(3) Server agent software monitoring techniques. An agent installed on the server collects the service running state, service traffic delay, and similar information on that server. Alternatively, network management software connects to network devices over SNMP to check their port states and the condition of the internal LAN links connected to the switch.
In summary, the prior art leaves the following technical problems unsolved:
(1) Single data dimension: the prior art focuses mainly on a single dimension of network traffic, such as traffic volume or transmission speed, and ignores other dimensions such as source IP, destination IP, and port, which limits comprehensive and accurate analysis of user behavior.
(2) Insufficiently accurate conclusions: the prior art analyzes only a few data indicators or a single one. Such indicators reflect only the data at a given point in time, and interference from other factors in the network data leads to a high false alarm rate in the analysis results.
(3) Services and service operation indicators cannot be determined: existing network management and network security analysis technologies use the results of analyzing a few data items or a single one only for network management or attack detection, and cannot identify different information services or service behaviors.
(4) The full-chain state of service access cannot be identified: existing server agent technology reflects only how the service runs on the server side, not how it actually runs as seen from the client side. Network management software reflects only the condition of the local area network and cannot truly reflect how the service is running. The prior art therefore cannot accurately reflect the state of the full chain over which a service runs from client to server, or of the nodes on that chain.
(5) Lack of active simulated user access: the prior art cannot actively simulate user behavior, and so cannot predict or simulate it. Network administrators therefore cannot know the actual usage state of users and cannot discover and resolve network problems in advance.
(6) No real-time monitoring and early warning: the prior art cannot monitor network traffic in real time and issue early warnings, so administrators cannot discover and handle network problems in time, which may lead to network faults or service interruption.
Thus, given the need to obtain a business portrait from multidimensional data and to monitor anomalies continuously, how to track the entire service access process from the user side, taking the service access flow as the viewpoint, and how to give true feedback on service operation as perceived on the user side have become technical problems urgently to be solved in this field.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention discloses a business portrait and anomaly monitoring system based on multidimensional data. The invention combines multidimensional data, takes the service access flow as its viewpoint, and, based on active detection and passive data analysis, comprehensively analyzes the service operation condition, the network operation condition, and the operation condition of key nodes along the full service chain, forming a service operation monitoring system with accurate perception of service operation and full-chain service monitoring.
The invention also discloses a monitoring method for implementing the monitoring system.
The detailed technical solution of the invention is as follows:
a business portrait and anomaly monitoring system based on multidimensional data, comprising: an active detection system (detection probe), a passive data analysis system (flow probe), and a multidimensional data analysis platform;
the active detection system deploys active detection probes in the network on the side close to the client and accesses the service system by realistically simulating user access, so as to obtain service access data, namely the service active detection data;
the passive data analysis system deploys flow probes at core network nodes to capture specified traffic in the network, so as to obtain the relevant raw traffic data present in the network, namely the passive data analysis data;
the multidimensional data analysis platform collects multidimensional data as a multidimensional data source, the multidimensional data comprising: service active detection data; passive data analysis data; log data of the operating systems, middleware, databases, and service platforms in the service system; network device configuration information and logs; and security device configuration and logs;
in the multidimensional data analysis platform:
forming single time slice data sets: the user sets, as required, a basic time unit as the time attribute of the business portrait, the time unit being a day, a week, a month, a year, or a designated interval; the user also sets, as required, an interval within the basic time unit as the time slice, which is used to calculate the business portrait index values within that slice, the time slice unit being a second, an hour, a day, or a designated interval; the data of the multidimensional data source are divided into different time slice data sets according to the time vector of the time slices, and the data within a single time slice are taken as the data set of that time slice;
performing big data analysis on the data of a single time slice data set: the service data in the single time slice data set are divided into different clusters by the K-means clustering algorithm, each cluster representing one service running state index, and the running state index of the service for that time slice is obtained from the highest value, the lowest value, and the middle value in the data of a single cluster; the indexes of all clusters are aggregated into one data set as the index set of the single time slice; the index sets of all time slices within a basic time unit are aggregated to form a state parameter set of service operation, which serves as the normal-form reference for service operation under the current time attribute, namely the basic state of the business portrait for the current time attribute; a new time attribute is obtained by stepping the current time attribute forward or backward;
the basic states of the business portraits for different time attributes are dynamically monitored by repeating the big data analysis process;
according to the basic states of the business portraits for different time attributes, the arithmetic mean and the incremental deviation value of all basic states within a set time are calculated respectively; the arithmetic mean and the incremental deviation value for the N time intervals of the time interval to be predicted are taken as input, and the final high boundary value, the final low boundary value, and the centering value of the basic state of the business portrait for the next time attribute are predicted by a Support Vector Machine (SVM) algorithm, the centering value being the median between the final high boundary value and the final low boundary value;
all final high boundary values are connected in time order to form a predicted trend of the final high boundary value, which is displayed in the platform;
all final low boundary values are connected in time order to form a predicted trend of the final low boundary value, which is displayed in the platform;
all centering values are connected in time order to form a predicted trend of the centering value, which is displayed in the platform;
the running state indexes of corresponding time slices of the service are analyzed on the basis of the same comparison time: decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and indexes with abnormal values are identified from the classification result:
when a single-dimension running state index is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, by comparison with the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located and the abnormal alarm position of the service is displayed through the platform.
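The time slicing and per-slice clustering described above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each record is a hypothetical `(timestamp_seconds, value)` pair carrying a single scalar measurement, uses a plain one-dimensional K-means written from scratch, and the function names `slice_records`, `kmeans_1d`, and `summarize` are invented for illustration.

```python
import random
from collections import defaultdict
from statistics import median


def slice_records(records, slice_seconds):
    """Group (timestamp_seconds, value) records by time slice index."""
    slices = defaultdict(list)
    for ts, value in records:
        slices[int(ts // slice_seconds)].append(value)
    return dict(slices)


def kmeans_1d(values, k, iterations=50, seed=0):
    """Plain Lloyd's K-means on scalars; returns the non-empty clusters.

    Assumes k does not exceed the number of distinct values.
    """
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(values)), k)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return [c for c in clusters if c]


def summarize(cluster):
    """Highest, lowest, and middle value of one cluster (one state index)."""
    return {"high": max(cluster), "low": min(cluster), "middle": median(cluster)}


# Illustrative data only: six measurements, sliced into 60-second slices.
records = [(0, 10.0), (5, 11.0), (12, 200.0), (30, 12.0), (61, 9.0), (65, 210.0)]
slices = slice_records(records, slice_seconds=60)

# Index set of the first time slice: one summary per cluster.
index_set = sorted((summarize(c) for c in kmeans_1d(slices[0], k=2)),
                   key=lambda s: s["low"])
print(index_set)
```

In this toy data the first slice separates into a low-valued cluster and a single high value, and each cluster is reduced to its highest, lowest, and middle value, mirroring the index set of a single time slice. Real service data would be multidimensional and the choice of k is left open by the text.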
A monitoring method for implementing the above monitoring system, characterized by comprising:
(1) Data acquisition: raw network traffic data are collected by the passive data analysis system (flow probe) and transmitted into a raw data warehouse through the physical network card of the flow probe; the data returned by active access, obtained by the active detection system (detection probe), are likewise transmitted into the raw data warehouse; the various configuration and log data obtained through the log interface of the multidimensional data analysis platform are also transmitted into the raw data warehouse; together these form the raw data warehouse of multidimensional data, in preparation for data preprocessing;
(2) Data preprocessing: in the passive data analysis system (flow probe), the collected multidimensional data are cleaned, de-duplicated, and normalized to obtain preprocessed data;
(3) Feature extraction: in the multidimensional data analysis platform, data simulating the real service access characteristics of users are obtained from the preprocessed data of step (2), and various configuration and log data reflecting user access data and product information data are obtained through the log interface of the multidimensional data analysis platform;
at the same time, feature index items are extracted from the various configuration and log data and used as input for the subsequent business portrait construction; the feature index items comprise: a network packet class, an application system access feature class, an application system circulated content class, and a software and hardware product information class;
(4) Business portrait construction, comprising:
(4-1) determining service assets using a clustering algorithm: the raw data are clustered by source address, destination address, protocol, and port; data whose addresses overlap with the network addresses of a service system and lie in the same network segment are marked; and the clustering result over source address, destination address, protocol, and port is taken as the server side of the service system, for determining the assets of different service systems;
(4-2) setting the time slice and collecting time slice data sets: the user sets, as required, a basic time unit as the time attribute of the business portrait, where the time unit can be a day, a week, a month, a year, or a designated interval; the user also sets, as required, an interval within the basic time unit as the time slice, where the time slice unit can be a second, an hour, a day, or a designated interval; the data of the multidimensional data source are divided into different time slice data sets according to the time vector of the time slices, and the data within a single time slice are taken as the data set of that time slice;
(4-3) clustering to obtain the index items of a single time slice: the service system server addresses obtained by clustering are taken as access destinations, accesses to the same service system server address are classified as the same service, and, with the data of a single time slice data set as input, the service data are divided into different clusters by the K-means clustering algorithm, each cluster representing one service running state index;
(4-4) analyzing the baseline of a single index: service running state index analysis is performed on a single cluster of a single time slice; that is, based on a concentration percentage parameter X set by the user, an interval [a, b] is obtained for the service running state index data of the single cluster of the single time slice, where a and b are respectively the initial low boundary value and the initial high boundary value of the region in which the index data are concentrated, and the data lying outside the concentration percentage parameter X are removed, the remainder being the concentrated data region of the index;
a service fault tolerance coefficient k, defaulting to 1, is set manually on the basis of experience with past data jitter thresholds, and is applied to a and b to obtain the final high boundary value and the final low boundary value of a single service running state index for a given time slice, calculated as follows:
$$\text{final low boundary value} = k \times \bigl(a + (b - a) \times X\bigr) \quad \text{(I)}$$
$$\text{final high boundary value} = k \times \bigl(b - (b - a) \times X\bigr) \quad \text{(II)}$$
the running state index of the service for a given time slice is obtained from the final high boundary value, the final low boundary value, and the centering value in the data of a single cluster, where the centering value is the median between the final high boundary value and the final low boundary value;
all final high boundary values are connected in time order to form a predicted trend of the final high boundary value, which is displayed in the platform;
all final low boundary values are connected in time order to form a predicted trend of the final low boundary value, which is displayed in the platform;
all centering values are connected in time order to form a predicted trend of the centering value, which is displayed in the platform;
(4-5) raising the dimension of the data set and drawing the basic state of the business portrait: the service running state indexes of all clusters are aggregated into one index data set as the index set of a single time slice; the index sets of all time slices within a basic time unit are aggregated to form a state parameter set of service operation, and this parameter set serves as the normal-form reference for service operation under the current time attribute, namely the basic state of the business portrait for the current time attribute;
(5) Setting a step length parameter, stepping the current time attribute forward and backward to obtain new time attributes, and repeating the analysis of steps (4-3) to (4-5) to dynamically monitor the basic states of the business portraits for different time attributes, thereby obtaining the business portrait over the dynamically monitored time interval.
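As a worked illustration of formulas (I) and (II) in step (4-4) and of the stepping in step (5), the following sketch applies the formulas exactly as printed and generates stepped time windows. Several points are assumptions rather than statements from the text: X is treated as a fraction (0.1 here), the centering value is taken as the midpoint of the two final boundaries, and the names `final_boundaries` and `stepped_windows` and all sample numbers are hypothetical.

```python
from datetime import datetime, timedelta


def final_boundaries(a, b, x, k=1.0):
    """Apply formulas (I) and (II) as printed; x is assumed to be a fraction."""
    low = k * (a + (b - a) * x)    # formula (I): final low boundary value
    high = k * (b - (b - a) * x)   # formula (II): final high boundary value
    centering = (low + high) / 2   # assumed: midpoint of the two boundaries
    return low, high, centering


def stepped_windows(start, unit, step, steps_back, steps_forward):
    """Return (window_start, window_end) pairs for the current time attribute
    and for the attributes obtained by stepping backward and forward."""
    windows = []
    for i in range(-steps_back, steps_forward + 1):
        window_start = start + i * step
        windows.append((window_start, window_start + unit))
    return windows


# Illustrative values: concentrated region [10, 20], X = 0.1, default k = 1.
low, high, centering = final_boundaries(a=10.0, b=20.0, x=0.1)
print(low, high, centering)

# A one-week time attribute stepped by one day, once back and once forward.
windows = stepped_windows(
    start=datetime(2024, 1, 8),
    unit=timedelta(days=7),
    step=timedelta(days=1),
    steps_back=1,
    steps_forward=1,
)
for window_start, window_end in windows:
    print(window_start.date(), "to", window_end.date())
```

Each stepped window would then be passed through steps (4-3) to (4-5) again to obtain the basic state for that time attribute.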
Preferably according to the invention, the monitoring method further comprises predicting the values for the next time attribute and forming a trend analysis:
according to the basic states of the business portraits for different time attributes, the arithmetic mean and the incremental deviation value of the highest value, the lowest value, and the middle value of the basic states of all business portraits within a set time are calculated respectively; the arithmetic mean and the incremental deviation value for the N time intervals of the time interval to be predicted are taken as input, and the final high boundary value, the final low boundary value, and the centering value of the basic state of the business portrait for the next time attribute are predicted by a Support Vector Machine (SVM) algorithm, where the centering value is the median between the final high boundary value and the final low boundary value;
all final high boundary values are connected in time order to form a predicted trend of the final high boundary value, which is displayed in the platform;
all final low boundary values are connected in time order to form a predicted trend of the final low boundary value, which is displayed in the platform;
all centering values are connected in time order to form a predicted trend of the centering value, which is displayed in the platform.
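The inputs to the SVM prediction step might be assembled as in the sketch below. The text does not define the "incremental deviation value", so it is assumed here to be the mean of successive increments within the window; the N intervals are assumed to be those immediately preceding the interval to be predicted; and `window_features` and `build_training_rows` are hypothetical names. The SVM regressor itself is not shown: the resulting rows and targets are simply in the shape that a regressor would typically be fitted on.

```python
from statistics import fmean


def window_features(values):
    """Arithmetic mean and (assumed) incremental deviation of one window."""
    mean = fmean(values)
    increments = [later - earlier for earlier, later in zip(values, values[1:])]
    # Assumption: incremental deviation = mean of successive increments.
    incremental_deviation = fmean(increments) if increments else 0.0
    return [mean, incremental_deviation]


def build_training_rows(series, n):
    """Pair the features of each window of n intervals with the value that
    follows it, giving (rows, targets) for a regressor."""
    rows, targets = [], []
    for t in range(n, len(series)):
        rows.append(window_features(series[t - n:t]))
        targets.append(series[t])
    return rows, targets


# Illustrative series of final high boundary values for successive time attributes.
high_boundaries = [19.0, 20.0, 21.0, 23.0, 24.0]
rows, targets = build_training_rows(high_boundaries, n=3)
for row, target in zip(rows, targets):
    print(row, "->", target)
```

The same construction would be repeated for the final low boundary values and the centering values, giving one regression problem per boundary series.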
Preferably according to the invention, the monitoring method further comprises abnormal data analysis: the running state indexes of time slices are analyzed on the basis of the same time, decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and indexes with abnormal values are identified from the classification result:
when a single-dimension running state index is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, by comparison with the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located and the abnormal alarm position of the service is displayed through the platform.
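To make the Isolation Forest step concrete, here is a compact from-scratch sketch of the general technique, not the platform's own code. It scores each running state index by how quickly random splits isolate it, with higher scores meaning more anomalous. The sample indexes (a hypothetical delay in milliseconds and an error rate), the tree count, and the function names are all assumptions for illustration.

```python
import math
import random


def build_tree(points, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant dimension
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def avg_path(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaf size."""
    if "split" not in node:
        return depth + avg_path(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Isolation Forest score in (0, 1]; requires at least two points."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = avg_path(size)
    return [2 ** (-sum(path_length(p, t) for t in trees) / (n_trees * norm))
            for p in points]


# Hypothetical indexes for the same time slice on different days:
# (delay in ms, error rate). The last entry is deliberately abnormal.
indexes = [(20.0, 0.010), (21.0, 0.012), (19.5, 0.011), (22.0, 0.009),
           (20.5, 0.010), (21.5, 0.013), (19.0, 0.008), (20.8, 0.011),
           (400.0, 0.300)]
scores = anomaly_scores(indexes)
most_abnormal = scores.index(max(scores))
print(most_abnormal, round(scores[most_abnormal], 3))
```

An index whose score exceeds a chosen threshold would be the one displayed and alarmed on; the threshold, like the sample data, is left to the operator and is not specified in the text.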
Preferably according to the invention, the monitoring method further comprises service query and display: all data of the services as a whole or of a specific service are queried, displayed on a large screen, counted, and analyzed in a targeted manner through the multidimensional data analysis platform.
The technical advantages of the invention are:
1. By collecting multidimensional network traffic data, the invention reflects user behavior comprehensively and accurately and improves the accuracy of the business portrait.
2. The invention actively simulates user access in order to predict user behavior and simulate users' access to the network, which helps discover and resolve network problems in advance.
3. Given the trend toward large-scale, tightly coupled service systems, installing agent software on a service system may degrade its performance and cause unknown compatibility faults that affect the service.
4. The invention uses simulated user access to obtain a true picture of how users perceive service access. The operation condition of the service system is obtained not only from bypass data such as network device operation data and server operation data, but also from real service access data produced by simulated user access.
5. Based on big data and artificial intelligence technologies, the invention forms a self-developed algorithm that automatically produces service operation portraits, predicts abnormal trends, and locates anomalies accurately.
6. The algorithm adopted in the invention can be widely applied in various network environments, and is of significance for scenarios such as improving network service quality, automatically identifying service systems, classifying services, and service compliance.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the system of the present invention;
FIG. 2 is a flow chart of the operation of the system of the present invention.
Detailed Description
The present invention will be described in detail with reference to examples and drawings, but is not limited thereto.
Example 1
As shown in FIG. 1, a business portrait and anomaly monitoring system based on multidimensional data includes: an active detection system (detection probe), a passive data analysis system (flow probe), and a multidimensional data analysis platform;
the active detection system deploys active detection probes in the network on the side close to the client and accesses the service system by realistically simulating user access, so as to obtain service access data, namely the service active detection data;
the passive data analysis system deploys flow probes at core network nodes to capture specified traffic in the network, so as to obtain the relevant raw traffic data present in the network, namely the passive data analysis data;
the multidimensional data analysis platform collects multidimensional data as a multidimensional data source, the multidimensional data comprising: service active detection data; passive data analysis data; log data of the operating systems, middleware, databases, and service platforms in the service system; network device configuration information and logs; and security device configuration and logs;
in the multidimensional data analysis platform:
forming single time slice data sets: the user sets, as required, a basic time unit as the time attribute of the business portrait, the time unit being a day, a week, a month, a year, or a designated interval; the user also sets, as required, an interval within the basic time unit as the time slice, which is used to calculate the business portrait index values within that slice, the time slice unit being a second, an hour, a day, or a designated interval; the data of the multidimensional data source are divided into different time slice data sets according to the time vector of the time slices, and the data within a single time slice are taken as the data set of that time slice;
performing big data analysis on the data of a single time slice data set: the service data in the single time slice data set are divided into different clusters by the K-means clustering algorithm, each cluster representing one service running state index, and the running state index of the service for that time slice is obtained from the highest value, the lowest value, and the middle value in the data of a single cluster; the indexes of all clusters are aggregated into one data set as the index set of the single time slice; the index sets of all time slices within a basic time unit are aggregated to form a state parameter set of service operation, which serves as the normal-form reference for service operation under the current time attribute, namely the basic state of the business portrait for the current time attribute; a new time attribute is obtained by stepping the current time attribute forward or backward;
the basic states of the business portraits for different time attributes are dynamically monitored by repeating the big data analysis process;
according to basic states of business images of different time attributes, respectively calculating a calculated number average value and an increment deviation value of all basic states in a set time, taking the calculated number average value and the increment deviation value in N time intervals of a time interval to be predicted as input, and predicting a final high boundary value, a final low boundary value and a centering value of the basic states of the business images of the next time attribute through a Support Vector Machine (SVM) algorithm, wherein the centering value refers to the median between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form prediction of centering value operation trend and are displayed in a platform;
analyzing the running state indexes of the corresponding time slices of the service on a same-period comparison basis: decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and the indexes with abnormal values are identified from the classification result:
when a running state index of a single dimension is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, using the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located by comparison, and the abnormal alarm position of the service is displayed through the platform.
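The time-slicing step of Example 1 can be illustrated with a minimal sketch. It assumes a basic time unit of one day and a time slice of one hour, and it summarizes each slice directly by its highest, lowest and middle value; the K-means clustering that the text applies inside each slice is omitted. The names `slice_records` and `slice_summary`, the sample records, and the choice of the statistical median as the "middle value" are illustrative assumptions, not taken from the patent.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def slice_records(records, unit_start, slice_length):
    """Group (timestamp, value) records into consecutive time slices."""
    slices = defaultdict(list)
    for timestamp, value in records:
        # Integer index of the slice this record falls into.
        index = (timestamp - unit_start) // slice_length
        slices[index].append(value)
    return dict(slices)


def slice_summary(values):
    """Highest, lowest and middle value of one slice (median is an assumption)."""
    return {"high": max(values), "low": min(values), "mid": median(values)}


# Hypothetical sample: response times in milliseconds within one day.
unit_start = datetime(2024, 1, 18)      # start of the basic time unit
slice_length = timedelta(hours=1)       # user-chosen time slice
records = [
    (datetime(2024, 1, 18, 0, 5), 120.0),
    (datetime(2024, 1, 18, 0, 40), 180.0),
    (datetime(2024, 1, 18, 0, 55), 150.0),
    (datetime(2024, 1, 18, 1, 10), 90.0),
]

slices = slice_records(records, unit_start, slice_length)
# Index set per time slice; together these form the state parameter set.
portrait = {index: slice_summary(values) for index, values in sorted(slices.items())}
```

In a fuller implementation each slice would first be split into clusters, and the summary would be computed per cluster rather than per slice.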
Example 2,
As shown in Fig. 2, a monitoring method implemented with the monitoring system described in Example 1 includes:
(1) Data acquisition: raw network traffic data are collected by the passive data analysis system (flow probe) and transmitted to the raw data warehouse through the physical network card of the flow probe; the return data of active access acquired by the active detection system (detection probe) are also transmitted to the raw data warehouse; the various configuration and log data obtained through the log interface of the multidimensional data analysis platform are transmitted to the raw data warehouse; together these form the raw data warehouse of the multidimensional data, in preparation for data preprocessing;
(2) Data preprocessing: in the passive data analysis system (flow probe), the collected multidimensional data are cleaned, de-duplicated and normalized to obtain preprocessed data;
(3) Feature extraction: in the multidimensional data analysis platform, data simulating the real service access characteristics of users are obtained from the preprocessed data of step (2), and various configuration and log data reflecting user access data and product information data are obtained through the log interface of the multidimensional data analysis platform;
at the same time, characteristic index items are extracted from the various configuration and log data and used as the input for the subsequent business portrait; the characteristic index items comprise: a network data packet class, an application system access feature class, an application system circulation content class, and a software and hardware product information class;
(4) Business portrait construction, comprising the following steps:
(4-1) determining the business assets using a clustering algorithm: the raw data are clustered by source address, destination address, protocol and port; data that overlap with the network addresses of the service system and lie in the same network segment are marked; and the clustering result of source address, destination address, protocol and port is taken as the server side of the service system, for identifying the assets of different service systems;
(4-2) setting the time slice and collecting the time slice data sets: the user sets, as needed, a basic time unit as the time attribute of the business portrait, the basic time unit being a day, a week, a month, a year or a designated interval; the user sets, as needed, an interval within the basic time unit as a time slice, the time slice unit being a second, an hour, a day or a designated interval; the data of the multidimensional data source are divided into different time slice data sets according to the time vectors of the time slices, and the data within a single time slice are taken as the data set of that time slice;
(4-3) clustering to obtain the index items of a single time slice: the service system server addresses obtained by clustering are used as access destinations, and accesses to the same service system server address are classified as the same service; taking the data of a single-time-slice data set as input, the service data are divided into different clusters by a K-means clustering algorithm, each cluster representing one service running state index;
(4-4) analyzing the baseline of a single index: service running state index analysis is performed on a single cluster of a single time slice, that is, based on the concentration percentage parameter X set by the user and the single cluster of the single time slice, the service running state index data set [a, b] is obtained, where a and b are respectively the initial low boundary value and the initial high boundary value of the region in which the index data are concentrated; data falling outside the concentration percentage parameter X are removed, and the remainder is taken as the data concentration region of the index;
a service fault-tolerance coefficient k, 1 by default, is set manually based on past experience of data jitter thresholds, and is applied to a and b to obtain the final high boundary value and the final low boundary value of a single service running state index for a given time slice, calculated as follows:
final low boundary value = k × (a + (b - a) × X)   (I);
final high boundary value = k × (b - (b - a) × X)   (II);
the running state index of the service for a given time slice is obtained from the final high boundary value, the final low boundary value and the centering value of the data in a single cluster, wherein the centering value is the value midway between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form prediction of centering value operation trend and are displayed in a platform;
(4-5) raising the dimension of the data set and drawing the basic state of the business portrait: the service running state indexes of all clusters are aggregated into one index data set as the index set of a single time slice; the index sets of all time slices within the basic time unit are aggregated to form a state parameter set of service operation, which serves as the normal-form reference of service operation for the current time attribute, i.e., the basic state of the business portrait for the current time attribute;
(5) Setting a stepping duration parameter, obtaining a new time attribute by stepping the current time attribute forward or backward, and repeating the analysis of steps (4-3) to (4-5) to dynamically monitor the basic states of the business portraits of different time attributes, so as to obtain the business portrait of the dynamically monitored time interval.
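As a worked illustration of step (4-4), the sketch below applies formulas (I) and (II) to an already determined concentration region [a, b]. How a and b are derived from the raw cluster data is not spelled out in the text, so they are taken as inputs here. The function name `final_boundaries` and the sample numbers are illustrative assumptions; the centering value is computed as the value midway between the two final boundaries.

```python
def final_boundaries(a, b, x, k=1.0):
    """Apply formulas (I) and (II) to the concentration region [a, b].

    a, b: initial low and high boundary values of the concentrated data
    x:    concentration percentage parameter X, as a fraction (e.g. 0.25)
    k:    service fault-tolerance coefficient, 1 by default
    """
    if a > b:
        raise ValueError("a must not exceed b")
    low = k * (a + (b - a) * x)    # formula (I)
    high = k * (b - (b - a) * x)   # formula (II)
    center = (low + high) / 2      # value midway between the final boundaries
    return low, high, center


# Hypothetical index: response time concentrated between 100 ms and 200 ms.
low, high, center = final_boundaries(a=100.0, b=200.0, x=0.25)
print(low, high, center)  # 125.0 175.0 150.0
```

With k left at its default of 1 the boundaries depend only on a, b and X; a larger k scales both boundaries upward by the same factor.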
Example 3,
The monitoring method according to Example 2 further comprises: predicting the values of the next time attribute and forming a trend analysis:
based on the basic states of the business portraits of different time attributes, the arithmetic mean and the incremental deviation value of the highest value, the lowest value and the middle value of the basic states of all business portraits within a set time are calculated respectively; taking as input the arithmetic mean and the incremental deviation value over the N time intervals associated with the time interval to be predicted, the final high boundary value, the final low boundary value and the centering value of the basic state of the business portrait of the next time attribute are predicted by a Support Vector Machine (SVM) algorithm, wherein the centering value is the value midway between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form a prediction of centering value operation trend and are displayed in a platform.
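The inputs to the prediction step can be sketched as follows. The text does not define the "incremental deviation value", so this sketch assumes it is the mean of the successive increments within the window of the last N intervals; that definition, the function name `window_features` and the sample history are all assumptions. The SVM regression that consumes these features is not shown.

```python
from statistics import fmean


def window_features(series, n):
    """Arithmetic mean and mean increment over the last n values of a series."""
    window = series[-n:]
    if len(window) < 2:
        raise ValueError("need at least two values to compute increments")
    mean_value = fmean(window)
    # Differences between consecutive values inside the window.
    increments = [later - earlier for earlier, later in zip(window, window[1:])]
    incremental_deviation = fmean(increments)  # assumed definition
    return mean_value, incremental_deviation


# Hypothetical history of final high boundary values, one per time attribute.
high_history = [170.0, 172.0, 175.0, 179.0]
mean_value, incremental_deviation = window_features(high_history, n=3)
```

The same features would be computed separately for the low boundary and centering value series, giving one feature vector per predicted quantity.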
Example 4,
The monitoring method according to Example 3 further comprises: abnormal data analysis: the running state indexes of the time slices are analyzed on a same-period comparison basis, decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and the indexes with abnormal values are identified from the classification result:
when a running state index of a single dimension is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, using the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located by comparison, and the abnormal alarm position of the service is displayed through the platform.
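The anomaly step can be illustrated with a small, self-contained Isolation Forest for a single-dimensional running state index. This is a simplified sketch of the general algorithm, not the patent's implementation: the tree count, the sample size, the alarm threshold of 0.6 and the sample values are all assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split 1-D points at random cut values until they are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    low, high = min(points), max(points)
    if low == high:
        return {"size": len(points)}
    cut = rng.uniform(low, high)
    return {
        "cut": cut,
        "left": build_tree([p for p in points if p < cut], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p >= cut], depth + 1, max_depth, rng),
    }


def path_length(tree, value, depth=0):
    """Depth at which a value lands, plus a correction for unsplit leaf points."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    branch = tree["left"] if value < tree["cut"] else tree["right"]
    return path_length(branch, value, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=256, seed=0):
    """Score each value in (0, 1); scores near 1 indicate easily isolated points."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    rng = random.Random(seed)
    size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(values, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = average_path_length(size)
    scores = []
    for value in values:
        mean_path = sum(path_length(tree, value) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores


# Hypothetical index values for the same time slice across comparison periods.
values = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.0, 250.0]
scores = anomaly_scores(values)
THRESHOLD = 0.6  # assumed alarm threshold
anomalies = [v for v, s in zip(values, scores) if s > THRESHOLD]
```

Values that are isolated after very few random cuts receive short average paths and therefore high scores, which is what makes the outlying 250.0 stand out from the tightly grouped values around 100.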
Example 5,
The monitoring method according to Examples 2 to 4 further comprises: service query and display: through the multidimensional data analysis platform, all data of the overall service or of a specific service are queried, displayed on a large screen, counted and analyzed in a targeted manner.

Claims (5)

1. A business portrait and anomaly monitoring system based on multidimensional data, comprising: an active detection system, a passive data analysis system and a multidimensional data analysis platform;
the active detection system deploys active detection probes in the network, on the side close to the client, and accesses the service system by simulating real access, so as to acquire service access data, i.e., service active detection data;
the passive data analysis system deploys flow probes at core network nodes to capture specified traffic in the network, so as to obtain the related raw traffic data present in the network, i.e., passive data analysis data;
the multidimensional data analysis platform collects multidimensional data as a multidimensional data source, the multidimensional data comprising: service active detection data; passive data analysis data; log data of the operating systems, middleware, databases and service platforms in the service system; network device configuration information and logs; and security device configuration information and logs;
in the multidimensional data analysis platform:
forming a single-time-slice data set: the user sets, as needed, a basic time unit as the time attribute of the business portrait, the basic time unit being a day, a week, a month, a year or a designated interval; the user sets, as needed, an interval within the basic time unit as a time slice, which is used for calculating the business portrait index values within that time slice, the time slice unit being a second, an hour, a day or a designated interval; the data of the multidimensional data source are divided into different time slice data sets according to the time vectors of the time slices, and the data within a single time slice are taken as the data set of that time slice;
performing big data analysis on the data of the single-time-slice data set: the service data in the single-time-slice data set are divided into different clusters by a K-means clustering algorithm, each cluster representing one service running state index, and the running state index of the service for a given time slice is obtained from the highest value, the lowest value and the middle value of the data in a single cluster; the indexes of all clusters are aggregated into one data set as the index set of the single time slice; the index sets of all time slices within the basic time unit are aggregated to form a state parameter set of service operation, which serves as the normal-form reference of service operation for the current time attribute, i.e., the basic state of the business portrait for the current time attribute; a new time attribute is obtained by stepping the current time attribute forward or backward;
the basic states of the business portraits of different time attributes are dynamically monitored by repeating the big data analysis on the data of the single-time-slice data sets;
based on the basic states of the business portraits of different time attributes, the arithmetic mean and the incremental deviation value of all basic states within a set time are calculated respectively; taking as input the arithmetic mean and the incremental deviation value over the N time intervals associated with the time interval to be predicted, the final high boundary value, the final low boundary value and the centering value of the basic state of the business portrait of the next time attribute are predicted by a support vector machine algorithm, wherein the centering value is the value midway between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form prediction of centering value operation trend and are displayed in a platform;
analyzing the running state indexes of the corresponding time slices of the service on a same-period comparison basis: decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and the indexes with abnormal values are identified from the classification result:
when a running state index of a single dimension is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, using the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located by comparison, and the abnormal alarm position of the service is displayed through the platform.
2. A monitoring method for implementing the monitoring system of claim 1, comprising:
(1) Data acquisition: raw network traffic data are collected by the passive data analysis system and transmitted to the raw data warehouse through the physical network card of the flow probe; the return data of active access acquired by the active detection system are also transmitted to the raw data warehouse; the various configuration and log data obtained through the log interface of the multidimensional data analysis platform are transmitted to the raw data warehouse; together these form the raw data warehouse of the multidimensional data, in preparation for data preprocessing;
(2) Data preprocessing: in the passive data analysis system, the collected multidimensional data are cleaned, de-duplicated and normalized to obtain preprocessed data;
(3) Feature extraction: in the multidimensional data analysis platform, data simulating the real service access characteristics of users are obtained from the preprocessed data of step (2), and various configuration and log data reflecting user access data and product information data are obtained through the log interface of the multidimensional data analysis platform;
at the same time, characteristic index items are extracted from the various configuration and log data; the characteristic index items comprise: a network data packet class, an application system access feature class, an application system circulation content class, and a software and hardware product information class;
(4) Business portrait construction, comprising the following steps:
(4-1) determining the business assets using a clustering algorithm: the raw data are clustered by source address, destination address, protocol and port; data that overlap with the network addresses of the service system and lie in the same network segment are marked; and the clustering result of source address, destination address, protocol and port is taken as the server side of the service system, for identifying the assets of different service systems;
(4-2) setting the time slice and collecting the time slice data sets: the data of the multidimensional data source are divided into different time slice data sets according to the time vectors of the time slices, and the data within a single time slice are taken as the data set of that time slice;
(4-3) clustering to obtain the index items of a single time slice: the service system server addresses obtained by clustering are used as access destinations, and accesses to the same service system server address are classified as the same service; taking the data of a single-time-slice data set as input, the service data are divided into different clusters by a K-means clustering algorithm, each cluster representing one service running state index;
(4-4) analyzing the baseline of a single index: service running state index analysis is performed on a single cluster of a single time slice, that is, based on the concentration percentage parameter X set by the user and the single cluster of the single time slice, the service running state index data set [a, b] is obtained, where a and b are respectively the initial low boundary value and the initial high boundary value of the region in which the index data are concentrated; data falling outside the concentration percentage parameter X are removed, and the remainder is taken as the data concentration region of the index;
a service fault-tolerance coefficient k is set and applied to a and b to obtain the final high boundary value and the final low boundary value of a single service running state index for a given time slice, calculated as follows:
final low boundary value = k × (a + (b - a) × X)   (I);
final high boundary value = k × (b - (b - a) × X)   (II);
the running state index of the service for a given time slice is obtained from the final high boundary value, the final low boundary value and the centering value of the data in a single cluster, wherein the centering value is the value midway between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form prediction of centering value operation trend and are displayed in a platform;
(4-5) raising the dimension of the data set and drawing the basic state of the business portrait: the service running state indexes of all clusters are aggregated into one index data set as the index set of a single time slice; the index sets of all time slices within the basic time unit are aggregated to form a state parameter set of service operation, which serves as the normal-form reference of service operation for the current time attribute, i.e., the basic state of the business portrait for the current time attribute;
(5) Setting a stepping duration parameter, obtaining a new time attribute by stepping the current time attribute forward or backward, and repeating the analysis of steps (4-3) to (4-5) to dynamically monitor the basic states of the business portraits of different time attributes, so as to obtain the business portrait of the dynamically monitored time interval.
3. The monitoring method of claim 2, further comprising: predicting the value of the next time attribute and forming a trend analysis:
based on the basic states of the business portraits of different time attributes, the arithmetic mean and the incremental deviation value of the highest value, the lowest value and the middle value of the basic states of all business portraits within a set time are calculated respectively; taking as input the arithmetic mean and the incremental deviation value over the N time intervals associated with the time interval to be predicted, the final high boundary value, the final low boundary value and the centering value of the basic state of the business portrait of the next time attribute are predicted by a support vector machine (SVM) algorithm, wherein the centering value is the value midway between the final high boundary value and the final low boundary value;
all final high boundary values are connected according to time to form a prediction of the running trend of the final high boundary values, and the prediction is displayed in a platform;
all final low boundary values are connected according to time to form a prediction of the running trend of the final low boundary values, and the prediction is displayed in a platform;
all the centering values are connected according to time to form a prediction of centering value operation trend and are displayed in a platform.
4. The monitoring method of claim 2, further comprising: abnormal data analysis: the running state indexes of the time slices are analyzed on a same-period comparison basis, decision trees are constructed using the Isolation Forest algorithm, each running state index is classified, and the indexes with abnormal values are identified from the classification result:
when a running state index of a single dimension is abnormal, the multidimensional data analysis platform displays the abnormal running state index and raises an alarm; the host asset is determined by checking the network address of the abnormal running state index against the information asset database of the multidimensional data analysis platform, and, using the data retained in the platform for each node of the service access chain, the specific position in the service access chain at which the anomaly occurred is located by comparison, and the abnormal alarm position of the service is displayed through the platform.
5. The monitoring method of claim 2 or 3, further comprising: service query and display: through the multidimensional data analysis platform, all data of the overall service or of a specific service are queried, displayed on a large screen, counted and analyzed in a targeted manner.
CN202410069535.8A 2024-01-18 2024-01-18 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data Active CN117596133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410069535.8A CN117596133B (en) 2024-01-18 2024-01-18 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410069535.8A CN117596133B (en) 2024-01-18 2024-01-18 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Publications (2)

Publication Number Publication Date
CN117596133A CN117596133A (en) 2024-02-23
CN117596133B true CN117596133B (en) 2024-04-05

Family

ID=89918618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410069535.8A Active CN117596133B (en) 2024-01-18 2024-01-18 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Country Status (1)

Country Link
CN (1) CN117596133B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018014674A1 (en) * 2016-07-20 2018-01-25 中兴通讯股份有限公司 Method, apparatus, and system for determining degree of association of input and output of black box system
JP2018073389A (en) * 2016-10-26 2018-05-10 株式会社デンソー Data processing device and data processing method
CN108777643A (en) * 2018-06-08 2018-11-09 武汉思普崚技术有限公司 A kind of traffic visualization plateform system
CN108833397A (en) * 2018-06-08 2018-11-16 武汉思普崚技术有限公司 A kind of big data safety analysis plateform system based on network security
CN110069732A (en) * 2019-03-29 2019-07-30 腾讯科技(深圳)有限公司 A kind of method, device and equipment that information is shown
CN110457193A (en) * 2019-07-30 2019-11-15 深圳供电局有限公司 Health portrait methods of exhibiting and its system based on power information system operation/maintenance data
CN112749181A (en) * 2021-01-20 2021-05-04 丁同梅 Big data processing method aiming at authenticity verification and credible traceability and cloud server
WO2021196097A1 (en) * 2020-04-01 2021-10-07 深圳市欢太科技有限公司 User portrait list construction method and apparatus, server, and storage medium
CN114462834A (en) * 2022-01-13 2022-05-10 河北航天信息技术有限公司 Regional portrait construction method and system based on multi-channel data fusion
CN115935237A (en) * 2022-12-16 2023-04-07 北京神州新桥科技有限公司 Method and device for detecting abnormity of service flow data and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
冯阳 ; 晏清洪 ; 夏照华 ; 许永利 ; .水土保持核心业务管理系统研究.科技创新与应用.2017,(21),全文. *
韩伟红 ; 隋品波 ; 贾焰 ; .大规模网络安全态势分析与预测系统YHSAS.信息网络安全.2012,(08),全文. *

Also Published As

Publication number Publication date
CN117596133A (en) 2024-02-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant