CN112396135B - Method and system for detecting abnormal traffic of converged communication network - Google Patents

Info

Publication number
CN112396135B
Authority
CN
China
Prior art keywords
individual
student
greedy
network
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110078910.1A
Other languages
Chinese (zh)
Other versions
CN112396135A (en
Inventor
焦显伟
陶子元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Telecom Easiness Information Technology Co Ltd
Original Assignee
Beijing Telecom Easiness Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Telecom Easiness Information Technology Co Ltd
Priority to CN202110078910.1A
Publication of CN112396135A
Application granted
Publication of CN112396135B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method and a system for detecting abnormal traffic of a converged communication network. The method comprises the following steps: acquiring traffic data of a converged communication network; determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; and determining the network traffic state of the converged communication network to be detected by applying the network traffic anomaly detection model to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected. The invention improves the accuracy of abnormal traffic detection in the converged communication network.

Description

Method and system for detecting abnormal traffic of converged communication network
Technical Field
The invention relates to the field of converged communication networks and machine learning, in particular to a method and a system for detecting abnormal traffic of a converged communication network.
Background
With the development of communication technology and information technology, communication networks are becoming more diverse and larger in scale, and converged communication services based on IP networks keep growing. Converged communication, which merges communication technology with information technology, has become one of the important development directions for telecommunication operators. It is based on IP communication, takes VoIP, video communication, multimedia conferencing, collaborative office, instant messaging and the like as its core service capabilities, and gives users unified communication services with network access anytime and anywhere. A converged communication network carries many services; the stable operation of each service is very important, and the corresponding network security problems are increasingly prominent. Research on security technology for converged communication networks is therefore significant for their development.
Among current network security technologies, network traffic anomaly detection provides an effective means of intercepting network attacks and is one of the primary techniques for addressing network security problems. Traffic anomaly detection for the converged communication network therefore provides an important guarantee for the availability and security of converged communication services. Existing traffic anomaly detection methods fall mainly into three categories: methods based on feature rules, methods based on statistical analysis, and methods based on machine learning. With the rapid development of machine learning, its flexibility and intelligence have led to wide use in traffic anomaly detection.
For anomaly detection methods based on machine learning, the parameters of the machine learning algorithm are key factors influencing detection performance. For example, the Support Vector Machine (SVM) plays a very important role in machine learning because it has a solid theoretical basis and is suitable for processing high-dimensional nonlinear data. In practical applications, the kernel function and the free parameters are key factors determining the performance of the SVM. Generally, a Radial Basis Function (RBF) is selected as the kernel function, and the two key parameters of an RBF-kernel SVM are the penalty coefficient and the kernel width. The penalty coefficient controls how heavily misclassified samples are penalized and thus balances model complexity against fitting error: the larger the penalty coefficient, the heavier the penalty on misclassified samples, so that overfitting may occur due to too many support vectors and generalization is poor; conversely, the smaller the penalty coefficient, the lighter the penalty on misclassified samples, which may make the model too simple and cause under-fitting, with larger structural risk. The kernel width determines the feature space into which the data are mapped, thereby affecting the performance of the SVM. The penalty coefficient and the kernel width thus affect SVM performance in different ways. More generally, the parameters of a machine learning algorithm strongly affect its performance, so optimizing their selection is very important for anomaly detection methods based on machine learning.
At present, the commonly used methods for optimizing the parameters of a machine learning algorithm are experimental selection, grid search and gradient descent. Experimental selection is purely empirical and lacks theoretical guidance, so the selected parameters are not necessarily optimal. Grid search first sets search interval limits and a step size to divide the search interval into a grid, exhaustively validates all grid points, and then selects the group of parameters with the best validation result as the optimal parameters. Although this method can find a reasonably good parameter combination, its search efficiency is low and it is difficult to set reasonable interval limits and step sizes. Gradient descent is very sensitive to the initial value, and it is difficult to obtain the optimal parameters if the initial value is set unreasonably. Therefore, an efficient optimization algorithm needs to be designed for parameter selection.
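To make the cost of grid search concrete, the following Python sketch evaluates every grid point exhaustively. The function `grid_search`, the stand-in `toy_score` and the grid values are illustrative assumptions and do not come from the patent; in practice the score would be a validation result for a trained model.

```python
import itertools


def grid_search(score, c_values, width_values):
    """Exhaustively score every (penalty, width) grid point and keep the best."""
    best_params, best_score = None, float("-inf")
    evaluations = 0
    for c, width in itertools.product(c_values, width_values):
        evaluations += 1
        s = score(c, width)
        if s > best_score:
            best_params, best_score = (c, width), s
    return best_params, best_score, evaluations


# Hypothetical validation score with a single peak at penalty=10, width=0.1.
def toy_score(c, width):
    return -((c - 10.0) ** 2) - ((width - 0.1) ** 2)


c_grid = [0.1, 1.0, 10.0, 100.0]
width_grid = [0.01, 0.1, 1.0]
best, best_val, n_evals = grid_search(toy_score, c_grid, width_grid)
```

The number of evaluations is the product of the grid sizes, so refining the step size in both dimensions multiplies the cost, and a peak lying between grid points or outside the chosen limits is missed entirely.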
In recent years, swarm intelligence optimization algorithms have attracted extensive attention in academia and industry for their good global optimization capability and efficiency, and have been applied to parameter selection for machine learning methods. The teaching and learning optimization algorithm simulates the learning process of students in a class and has been applied in various fields owing to its strong global optimization capability and high precision. However, when facing a complex optimization problem, the algorithm may become trapped in a local optimum. A new optimization algorithm is therefore needed to reduce the possibility of being trapped in local optima, so that optimal parameters can be obtained when optimizing a machine learning algorithm, the performance of the algorithm is effectively improved, and the accuracy of traffic anomaly detection in the converged communication network is improved.
Disclosure of Invention
The invention aims to provide a method and a system for detecting the traffic anomaly of a converged communication network, which can improve the accuracy of detecting the traffic anomaly of the converged communication network.
In order to achieve the purpose, the invention provides the following scheme:
a method for detecting abnormal traffic of a converged communication network comprises the following steps:
acquiring flow data of a converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and corresponding network flow state; the network status data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network flow state comprises abnormal network flow or normal network flow;
determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network traffic anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output;
and determining the network flow state of the converged communication network to be detected by adopting the network anomaly detection model according to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected.
Optionally, the acquiring traffic data of the converged communication network further includes:
and carrying out standardization processing on the flow data.
Optionally, the determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm specifically includes:
acquiring a machine learning algorithm; the machine learning algorithm includes: a support vector machine, a decision tree and a neural network;
optimizing, based on K-fold cross-validation, parameters of the machine learning algorithm by adopting a non-greedy teaching and learning optimization algorithm; the parameters comprise a penalty coefficient and a kernel width;
and determining a network flow abnormity detection model according to the flow data and the optimized parameters of the machine learning algorithm.
Optionally, the optimizing, based on K-fold cross-validation, parameters of the machine learning algorithm by adopting a non-greedy teaching and learning optimization algorithm specifically includes:
initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width;
randomly initializing N student individuals and calculating the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents a set of parameters of the machine learning algorithm;
determining an optimal student individual and a worst student individual according to the fitness, and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individuals are the student individuals with the maximum fitness; the worst individual student is the individual student with the minimum fitness;
calculating the individual fitness of the student after the learning in the first stage;
updating individual students based on a non-greedy strategy;
performing second-stage learning according to the historical optimal student individuals and random student individuals;
calculating the individual fitness of the student after the second-stage learning;
secondarily updating the individual students based on a non-greedy strategy;
judging whether the maximum iteration times is reached;
if so, determining the optimal student individual as the optimized parameter of the machine learning algorithm;
and if not, returning to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual.
A converged communication network traffic anomaly detection system, comprising:
the traffic data acquisition module is used for acquiring traffic data of the converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and corresponding network flow state; the network status data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network flow state comprises abnormal network flow or normal network flow;
a network traffic anomaly detection model determination module, used for determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network traffic anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output;
and the network flow state determining module is used for determining the network flow state of the converged communication network to be detected by adopting the network anomaly detection model according to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected.
Optionally, the method further includes:
and the flow data standardization module is used for carrying out standardization processing on the flow data.
Optionally, the module for determining a network traffic anomaly detection model specifically includes:
a machine learning algorithm acquisition unit for acquiring a machine learning algorithm; the machine learning algorithm includes: a support vector machine, a decision tree and a neural network;
a machine learning algorithm parameter optimization unit, used for optimizing, based on K-fold cross-validation, parameters of the machine learning algorithm by adopting a non-greedy teaching and learning optimization algorithm; the parameters comprise a penalty coefficient and a kernel width;
and the network flow abnormity detection model determining unit is used for determining a network flow abnormity detection model according to the flow data and the optimized parameters of the machine learning algorithm.
Optionally, the parameter optimization unit of the machine learning algorithm specifically includes:
the non-greedy teaching and learning optimization algorithm parameter initialization subunit is used for initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width;
the first fitness calculation subunit is used for randomly initializing N student individuals and calculating the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents a set of parameters of the machine learning algorithm;
the first-stage learning subunit is used for determining an optimal student individual and a worst student individual according to the fitness and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individuals are the student individuals with the maximum fitness; the worst individual student is the individual student with the minimum fitness;
the fitness second calculating subunit is used for calculating the individual fitness of the student after the first-stage learning;
a first updating subunit, configured to update the student individuals based on a non-greedy policy;
the second-stage learning subunit is used for performing second-stage learning according to the historical optimal student individuals and the random student individuals;
the fitness third calculating subunit is used for calculating the individual fitness of the student after the second-stage learning;
the second updating subunit is used for updating the student individuals secondarily based on a non-greedy strategy;
the judging subunit is used for judging whether the maximum iteration number is reached;
the optimized parameter determining subunit is used for determining the optimal student individual as the optimized parameter of the machine learning algorithm if the maximum number of iterations is reached;
and the iteration subunit is used for returning to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual if the maximum number of iterations is not reached.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the method and the system for detecting the traffic anomaly of the converged communication network, provided by the invention, the key parameters influencing the performance of the anomaly detection method based on machine learning are optimized and selected through a non-greedy teaching and learning optimization algorithm, and the possibility of the key parameters falling into local optimum is reduced, so that the optimal parameters can be obtained in the parameter optimization of the machine learning algorithm, the performance of the algorithm is effectively improved, and the accuracy of the traffic anomaly detection of the converged communication network is improved. The method is simple, efficient and easy to implement, can provide accurate and reasonable prediction for the abnormal flow of the converged communication network, assists in guiding the formulation of relevant decisions, and promotes the intelligent and scientific development of the converged communication network, so that the method has very important application value.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for detecting abnormal traffic in a converged communication network according to the present invention;
Fig. 2 is a schematic diagram of parameter optimization of a machine learning algorithm provided by the present invention;
Fig. 3 is a schematic structural diagram of a system for detecting abnormal traffic in a converged communication network according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a method and a system for detecting the traffic anomaly of a converged communication network, which can improve the accuracy of detecting the traffic anomaly of the converged communication network.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a schematic flow chart of the method for detecting abnormal traffic of a converged communication network provided by the present invention. As shown in Fig. 1, the method includes:
S101, acquiring traffic data of the converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and the corresponding network traffic state; the network state data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network traffic state is either abnormal or normal. For the network traffic state label, "1" indicates that the network traffic is abnormal, and "-1" indicates that the network traffic is normal.
After S101, the method further includes:
standardizing the traffic data.
The formula
$$P_i' = \frac{P_i - P_{min}}{P_{max} - P_{min}}$$
is used to standardize the traffic data so that it lies in the range [0, 1],
where $P_i$ is the original value of the $i$-th data item on a given feature, $P_{min}$ is the minimum value of all data on that feature, $P_{max}$ is the maximum value of all data on that feature, and $P_i'$ is the normalized value of the $i$-th data item on that feature.
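The scaling of one feature to the range [0, 1] described above can be sketched in Python as follows. This is a minimal illustration that assumes standard min-max scaling; the function name `min_max_normalize`, the sample `throughput` values and the handling of a constant feature are assumptions that the patent does not specify.

```python
def min_max_normalize(values):
    """Scale one feature column to [0, 1] using its minimum and maximum."""
    p_min, p_max = min(values), max(values)
    if p_max == p_min:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    span = p_max - p_min
    return [(p - p_min) / span for p in values]


# Hypothetical throughput samples for one feature.
throughput = [120.0, 80.0, 200.0, 80.0]
normalized = min_max_normalize(throughput)
```

Each feature is scaled independently, so the minimum and maximum would normally be taken from the training data and reused when scaling the data to be detected.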
S102, determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network traffic anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output.
The network traffic anomaly detection model is $f(x)$, where
$$f(x) = \operatorname{sgn}\left(\sum_{i} \alpha_i y_i \, k(x_i, x) + b\right)$$
$\operatorname{sgn}(\cdot)$ is the sign function, $x$ denotes the converged communication network data to be detected, $x_i$ denotes the training sample corresponding to the $i$-th support vector, $\alpha_i$ is the Lagrange multiplier corresponding to training sample $x_i$, $y_i$ is the label of training sample $x_i$, $b$ is the threshold, and $k(\cdot)$ is the radial basis kernel function
$$k(x_i, x) = \exp\left(-\frac{\lVert x_i - x \rVert^2}{2\sigma^2}\right)$$
where $\sigma$ is the kernel width.
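The decision function described above (sign of a weighted kernel sum plus a threshold) can be sketched in Python as follows. This is an illustration only: the Gaussian kernel with `2 * width ** 2` in the denominator is one common parameterization and is assumed here, and the support vectors, multipliers and the choice to return +1 at exactly zero are hypothetical.

```python
import math


def rbf_kernel(x_i, x, width):
    """Gaussian RBF kernel (assumed form: exp(-||x_i - x||^2 / (2 * width^2)))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def decide(x, support_vectors, alphas, labels, b, width):
    """Return 1 (abnormal traffic) or -1 (normal traffic) for sample x."""
    total = sum(a * y * rbf_kernel(sv, x, width)
                for sv, a, y in zip(support_vectors, alphas, labels))
    # Assumption: a score of exactly zero is reported as +1.
    return 1 if total + b >= 0 else -1


# Hypothetical trained model: one abnormal and one normal support vector.
svs = [(0.0, 0.0), (3.0, 3.0)]
alphas = [1.0, 1.0]
labels = [1, -1]
near_abnormal = decide((0.1, 0.1), svs, alphas, labels, b=0.0, width=1.0)
near_normal = decide((3.0, 2.9), svs, alphas, labels, b=0.0, width=1.0)
```

A smaller width makes each support vector influence only its immediate neighbourhood, while a larger width smooths the decision boundary, which is why the width is tuned together with the penalty coefficient.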
As shown in fig. 2, S102 specifically includes:
acquiring a machine learning algorithm; the machine learning algorithm includes: support vector machines, decision trees, and neural networks.
Based on K-fold cross-validation, the parameters of the machine learning algorithm are optimized by adopting a non-greedy teaching and learning optimization algorithm; the parameters include, but are not limited to, the penalty coefficient and the kernel width of the support vector machine.
And determining a network flow abnormity detection model according to the flow data and the optimized parameters of the machine learning algorithm.
Based on the K-fold cross validation, a non-greedy teaching and learning optimization algorithm is adopted to optimize parameters of the machine learning algorithm, and the method specifically comprises the following steps:
Initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width.
Randomly initializing N student individuals and calculating the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents a set of parameters of the machine learning algorithm.
Determining an optimal student individual and a worst student individual according to the fitness, and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individual is the student individual with the maximum fitness; the worst student individual is the student individual with the minimum fitness.
Calculating the fitness of each student individual after the first-stage learning.
Updating the student individuals based on a non-greedy strategy.
Performing second-stage learning according to the historical optimal student individual and random student individuals.
Calculating the fitness of each student individual after the second-stage learning.
Updating the student individuals a second time based on the non-greedy strategy.
Judging whether the maximum number of iterations is reached.
If so, determining the optimal student individual as the optimized parameters of the machine learning algorithm.
If not, returning to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual.
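The fitness evaluation by K-fold cross-validation named in the steps above can be sketched in Python as follows, using the F1-score as the accuracy index as in the specific embodiment. The helper names `f1_score`, `k_fold_f1` and `train_threshold`, the interleaved fold split and the toy one-dimensional classifier are assumptions for illustration. The sketch stops at the per-fold mean and variance and makes no assumption about how the patent combines them into a single fitness value.

```python
import statistics


def f1_score(y_true, y_pred, positive=1):
    """F1-score of the positive (abnormal) class; 0.0 if nothing is hit."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_f1(train, samples, labels, k):
    """Return (mean, variance) of F1 over k folds.

    `train(train_x, train_y)` must return a `predict(x)` function.
    """
    n = len(samples)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # interleaved folds, no shuffling
        train_x = [samples[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        predict = train(train_x, train_y)
        y_true = [labels[i] for i in sorted(test_idx)]
        y_pred = [predict(samples[i]) for i in sorted(test_idx)]
        scores.append(f1_score(y_true, y_pred))
    return statistics.mean(scores), statistics.pvariance(scores)


# Hypothetical stand-in model: threshold halfway between the class means.
def train_threshold(train_x, train_y):
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == -1]
    threshold = (statistics.mean(pos) + statistics.mean(neg)) / 2
    return lambda x: 1 if x > threshold else -1


samples = [0.9, 0.1, 0.8, 0.2, 0.95, 0.15, 0.85, 0.05]
labels = [1, -1, 1, -1, 1, -1, 1, -1]
f1_avg, f1_var = k_fold_f1(train_threshold, samples, labels, k=3)
```

In the setting of this document, `train` would fit the chosen machine learning algorithm with the parameter values carried by one student individual, so each fitness evaluation costs K model trainings.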
As a specific embodiment, a specific process for performing parameter optimization by using a non-greedy teaching and learning optimization algorithm is as follows:
Step 1: Initialize the non-greedy teaching and learning optimization algorithm, including the number of students $N$, the maximum number of iterations $T$, the non-greedy coefficient, and the search space of the parameters of the anomaly detection algorithm $M$;
step 2: individual student initialization, random generationNIndividual studentS i =[S i,1 ,…,S i,D ](i=1,2,..,N),S i,1 S i,D 1 st and 1 st respectively representing abnormality detection algorithms represented by individual studentsDAnd (3) mapping the continuous value into the discrete value by adopting a rounding value-taking strategy aiming at the discrete value parameter.
Step 3: Calculate the fitness $f_i$ of each student $i$. The fitness is computed from the anomaly detection algorithm parameter values represented by student $i$: a $K$-fold cross-validation strategy is used to calculate the mean ($F1_{avg}$) and the variance ($F1_{sd}$) of the F1-score accuracy index of the anomaly detection model trained with those parameters, and the fitness $f_i$ of student $i$ is then calculated according to the formula
[equation image: fitness $f_i$ as a function of $F1_{avg}$ and $F1_{sd}$; not recoverable from the source]
Step 4: Perform the first-stage learning based on the formula
[equation image: first-stage learning update; not recoverable from the source]
and update the student individuals with a non-greedy strategy, where $rand$ is a random number in [0, 1], $S_i$ is the original individual from which the new individual is derived, $S_{kbest}$ is the best individual among the current student individuals, $S_{kworst}$ is the worst individual among the current student individuals, $M_i$ is the mean of all current individuals, $T_{F1}$ is the lead factor of the best individual,
[equation image: definition of $T_{F1}$; not recoverable from the source]
$T_{F2}$ is the lead factor of the worst individual,
[equation image: definition of $T_{F2}$; not recoverable from the source]
and $random(a, b)$ is a random function that generates a value at random from $[a, b]$. The non-greedy strategy update is given by the formula
[equation image: non-greedy update rule; not recoverable from the source]
where $random\_select(\cdot)$ selects at random and $greedy\_select(\cdot)$ selects greedily, always choosing the best individual. The updated student individuals then enter the next stage of the learning process;
and 5: based on the formula
Figure DEST_PATH_IMAGE013
Updating the individual student by a non-greedy strategy for a second stage learningS k,j Is selected randomly differently fromS i,j The individual(s) of (a),S gbest,j is composed ofS i,j The history of the individual is the best one,randis [0, 1 ]]The random number in (1) is also expressed by the formula
Figure 4664DEST_PATH_IMAGE014
Updating the student individuals by the non-greedy strategy, and entering the next learning process by the updated individuals;
step 6: judging whether the maximum iteration number is reachedTIf yes, go to step 7, otherwise go to step 4;
and 7: and acquiring the individual student with the maximum fitness, namely the optimal abnormal detection algorithm parameter value.
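The seven steps can be sketched in Python as follows. Because the update formulas appear only as images in the original, the exact forms used here are assumptions: the first stage moves a student toward the best individual and away from the worst, the second stage learns from a random peer and the student's own historical best, and the non-greedy update picks at random with probability `eps` and greedily otherwise. The names `ng_tlbo`, `non_greedy_select`, `clip`, and `eps` are hypothetical and do not come from the patent.

```python
import random


def non_greedy_select(old, new, fit, eps, rng):
    """Assumed non-greedy update: random pick with probability eps, else greedy."""
    if rng.random() < eps:
        return rng.choice([old, new])   # random_select(.)
    return max(old, new, key=fit)       # greedy_select(.): keep the fitter one


def clip(vec, bounds):
    """Keep every coordinate inside its search-space interval."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(vec, bounds)]


def ng_tlbo(fit, bounds, n_students=10, max_iter=30, eps=0.1, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Steps 1-2: random students inside the search space.
    students = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_students)]
    gbest = [s[:] for s in students]    # historical best of each student

    def accept(i, candidate):
        students[i] = non_greedy_select(students[i], clip(candidate, bounds), fit, eps, rng)
        if fit(students[i]) > fit(gbest[i]):
            gbest[i] = students[i][:]

    for _ in range(max_iter):
        # Step 3: rank by fitness to find the best and worst students.
        ranked = sorted(students, key=fit)
        worst, best = ranked[0][:], ranked[-1][:]
        mean = [sum(s[j] for s in students) / n_students for j in range(dim)]
        # Step 4 (assumed form): move toward the best, away from the worst.
        for i in range(n_students):
            s = students[i]
            tf1, tf2 = rng.randint(1, 2), rng.randint(1, 2)
            accept(i, [s[j] + rng.random() * (best[j] - tf1 * mean[j])
                       - rng.random() * (worst[j] - tf2 * mean[j]) for j in range(dim)])
        # Step 5 (assumed form): learn from a random peer and the own historical best.
        for i in range(n_students):
            s = students[i]
            k = rng.choice([x for x in range(n_students) if x != i])
            peer = students[k]
            sign = 1.0 if fit(s) > fit(peer) else -1.0
            accept(i, [s[j] + sign * rng.random() * (s[j] - peer[j])
                       + rng.random() * (gbest[i][j] - s[j]) for j in range(dim)])
    # Step 7: report the fittest individual seen (taken from the historical bests).
    return max(gbest, key=fit)
```

In this sketch `fit` is called repeatedly on the same individuals; when the fitness is an expensive K-fold evaluation, caching the values per individual would be the natural refinement.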
S103, determining the network traffic state of the converged communication network to be detected by adopting the network anomaly detection model according to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected.
The key innovations and effects of the invention are as follows:
1. Based on a non-greedy teaching and learning optimization algorithm, the key parameters that affect the performance of the converged communication network traffic anomaly detection algorithm are optimized automatically, yielding an efficient and accurate prediction model.
2. For the selection process of the teaching and learning optimization algorithm, a non-greedy selection strategy is proposed to avoid becoming trapped in local optima.
3. For the learning process of the teaching and learning optimization algorithm, the worst student individual and each student's historical best information are incorporated, improving learning efficiency.
4. To address the prediction accuracy of the anomaly detection algorithm, a method that considers both prediction accuracy and stability is proposed.
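A minimal Python sketch of how accuracy and stability might be combined in one fitness value is shown below. The patent's actual fitness formula is given only as an image, so the form used here (mean of the per-fold F1-scores minus a weighted variance) is an assumption, as are the names `kfold_f1`, `fitness`, `fit_predict`, and `variance_weight` and the interleaved fold assignment.

```python
from statistics import fmean, pvariance


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive (abnormal) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def kfold_f1(xs, ys, k, fit_predict):
    """Return one F1-score per fold; fit_predict(train_x, train_y, test_x) -> labels."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))  # every k-th sample forms one fold
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        test_x = [x for i, x in enumerate(xs) if i in test_idx]
        test_y = [y for i, y in enumerate(ys) if i in test_idx]
        scores.append(f1_score(test_y, fit_predict(train_x, train_y, test_x)))
    return scores


def fitness(scores, variance_weight=1.0):
    """Assumed form: reward a high mean F1 and penalize instability across folds."""
    return fmean(scores) - variance_weight * pvariance(scores)
```

Under this assumed form, two parameter sets with the same mean F1-score are ranked by how consistent they are across folds, which is one way to read "considering prediction accuracy and stability".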
Fig. 3 is a schematic structural diagram of a converged communication network traffic anomaly detection system provided by the present invention, and as shown in fig. 3, the converged communication network traffic anomaly detection system provided by the present invention includes:
a traffic data acquiring module 301, configured to acquire traffic data of the converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and corresponding network flow state; the network status data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network traffic state comprises network traffic abnormity or network traffic normality.
A network traffic anomaly detection model determination module 302, configured to determine, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output.
The network traffic state determining module 303 is configured to determine the network traffic state of the converged communication network to be detected by using the network anomaly detection model according to the network state data, the protocol analysis data, and the service operation state data of the converged communication network to be detected.
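The data handled by modules 301 to 303 can be pictured as one labelled record per observation. The Python sketch below is illustrative only: the field names, their types, the `PROTOCOLS` vocabulary, and the choice to leave the IP information out of the numeric features are all assumptions, not details given in the patent.

```python
from dataclasses import dataclass

# Hypothetical protocol vocabulary, used only to turn the protocol type into a number.
PROTOCOLS = ("SIP", "RTP", "TCP", "UDP")


@dataclass
class TrafficRecord:
    # Network state data
    throughput: float
    packet_traffic: float
    delay_jitter: float
    call_traffic: float
    # Protocol analysis data
    protocol_type: str
    packet_length: int
    connection_duration: float
    port: int
    ip: str
    # Service operation state data
    service_fault: bool
    # Network traffic state (the label): True means abnormal
    abnormal: bool


def to_features(r):
    """Turn a record into the numeric input vector of the detection model."""
    proto = PROTOCOLS.index(r.protocol_type) if r.protocol_type in PROTOCOLS else -1
    return [r.throughput, r.packet_traffic, r.delay_jitter, r.call_traffic,
            float(proto), float(r.packet_length), r.connection_duration,
            float(r.port), 1.0 if r.service_fault else 0.0]
```

The `abnormal` field is the training label and is deliberately excluded from the feature vector.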
The invention provides a system for detecting the abnormal traffic of a converged communication network, which further comprises:
a traffic data standardization module, configured to standardize the traffic data.
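The patent does not say which standardization is applied. As one common choice, the sketch below assumes z-score scaling per feature (subtract the mean, divide by the standard deviation); the function names `fit_scaler` and `transform` are hypothetical.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from the training rows."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # constant feature: avoid division by zero
    return means, stds


def transform(rows, means, stds):
    """Apply z-score scaling with previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Fitting the statistics on training data only and reusing them for the traffic to be detected keeps the two on the same scale.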
The module 302 for determining a network traffic anomaly detection model specifically includes:
a machine learning algorithm acquisition unit for acquiring a machine learning algorithm; the machine learning algorithm includes: support vector machines, decision trees, and neural networks.
A parameter optimization unit of the machine learning algorithm, configured to optimize parameters of the machine learning algorithm by adopting the non-greedy teaching and learning optimization algorithm based on K-fold cross-validation; the parameters include, but are not limited to, the penalty coefficient and the kernel width of the support vector machine.
And the network flow abnormity detection model determining unit is used for determining a network flow abnormity detection model according to the flow data and the optimized parameters of the machine learning algorithm.
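Since each student individual is a vector of algorithm parameters, it has to be decoded before a model can be trained. The Python sketch below shows one way to do that, including the rounding strategy that step 2 of the embodiment describes for discrete-valued parameters. The search-space layout, the bounds, and the parameter names (`penalty`, `kernel_width`, `max_depth`) are assumptions made for illustration.

```python
def decode(student, space):
    """Map a student vector to named parameters, clamped to the search space.

    space is a list of (name, low, high, is_discrete) tuples, one per dimension.
    """
    params = {}
    for value, (name, low, high, is_discrete) in zip(student, space):
        value = min(max(value, low), high)           # stay inside the search space
        params[name] = int(round(value)) if is_discrete else value
    return params


# Hypothetical search space: two continuous SVM parameters and one discrete
# parameter such as a decision-tree depth.
SPACE = [
    ("penalty", 0.1, 100.0, False),
    ("kernel_width", 0.01, 10.0, False),
    ("max_depth", 1, 20, True),
]
```

For example, `decode([150.0, 0.25, 7.6], SPACE)` clamps the first value to the upper bound and rounds the third to an integer.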
The parameter optimization unit of the machine learning algorithm specifically comprises:
the non-greedy teaching and learning optimization algorithm parameter initialization subunit is used for initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width.
A fitness first calculation subunit, configured to randomly initialize N student individuals and calculate the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents the parameters of a machine learning algorithm.
The first-stage learning subunit is used for determining an optimal student individual and a worst student individual according to the fitness and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individuals are the student individuals with the maximum fitness; the worst individual student is the individual student with the minimum fitness;
and the fitness second calculating subunit is used for calculating the individual fitness of the student after the first-stage learning.
And the first updating subunit is used for updating the student individuals based on a non-greedy strategy.
And the second-stage learning subunit is used for performing second-stage learning according to the historical optimal student individuals and the random student individuals.
And the fitness third calculating subunit is used for calculating the individual fitness of the student after the second-stage learning.
And the second updating subunit is used for updating the student individuals secondarily based on a non-greedy strategy.
And the judging subunit is used for judging whether the maximum iteration number is reached.
And the parameter determining subunit is used for determining the optimal student individual as the optimized parameters of the machine learning algorithm if the maximum number of iterations is reached.
And the iteration subunit is used for returning, if the maximum number of iterations is not reached, to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (4)

1. A method for detecting abnormal traffic of a converged communication network is characterized by comprising the following steps:
acquiring flow data of a converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and corresponding network flow state; the network status data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network flow state comprises abnormal network flow or normal network flow;
determining, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output;
determining the network flow state of the converged communication network to be detected by adopting the network anomaly detection model according to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected;
the determining, according to the traffic data and based on K-fold cross-validation, of a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm specifically comprises:
acquiring a machine learning algorithm; the machine learning algorithm includes: a support vector machine, a decision tree and a neural network;
optimizing parameters of the machine learning algorithm by adopting the non-greedy teaching and learning optimization algorithm based on K-fold cross-validation; the parameters comprise a penalty coefficient and a kernel width;
determining a network traffic anomaly detection model according to the traffic data and the optimized parameters of the machine learning algorithm;
the optimizing of parameters of the machine learning algorithm by adopting the non-greedy teaching and learning optimization algorithm based on K-fold cross-validation specifically comprises:
initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width;
randomly initializing N student individuals and calculating the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents the parameters of a machine learning algorithm;
determining an optimal student individual and a worst student individual according to the fitness, and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individuals are the student individuals with the maximum fitness; the worst individual student is the individual student with the minimum fitness;
calculating the individual fitness of the student after the learning in the first stage;
updating individual students based on a non-greedy strategy;
performing second-stage learning according to the historical optimal student individuals and random student individuals;
calculating the individual fitness of the student after the second-stage learning;
secondarily updating the individual students based on a non-greedy strategy;
judging whether the maximum iteration times is reached;
if so, determining the optimal student individual as the optimized parameter of the machine learning algorithm;
if not, returning to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual;
the specific process of performing parameter optimization by adopting the non-greedy teaching and learning optimization algorithm comprises the following steps:
step 1: initializing the non-greedy teaching and learning optimization algorithm, including the number of students N, the maximum number of iterations T, the non-greedy coefficient
[symbol image]
and the search space of the parameters of the anomaly detection algorithm M;
step 2: initializing the student individuals by randomly generating N student individuals S_i = [S_{i,1}, …, S_{i,D}] (i = 1, 2, …, N), where S_{i,1} and S_{i,D} respectively denote the 1st and the D-th parameter of the anomaly detection algorithm represented by the student individual; for discrete-valued parameters, a rounding strategy maps the continuous value to a discrete value;
step 3: calculating the fitness f_i of each student i; using the anomaly detection algorithm parameter values represented by student i, an internal K-fold cross-validation strategy computes the mean (F1avg) and the variance (F1sd) of the accuracy index F1-score of the anomaly detection models trained with those parameters, and the fitness f_i of student i is then calculated according to the formula
[formula image]
step 4: performing the first-stage learning based on the formula
[formula image]
and updating the student individuals with the non-greedy strategy, where rand is a random number in [0, 1], S_i is the original individual from which the formula produces the new individual, S_kbest is the best individual among the current student individuals, S_kworst is the worst individual among the current student individuals, M_i is the mean of all current individuals, T_F1 is the leading factor of the best individual,
[formula image]
T_F2 is the leading factor of the worst individual,
[formula image]
and random(a, b) is a random function that generates a value at random from [a, b]; the non-greedy strategy update is given by the formula
[formula image]
where random_select(.) denotes random selection and greedy_select(.) denotes greedy selection, which always selects the best individual; the updated student individual
[symbol image]
enters the next stage of the learning process;
step 5: performing the second-stage learning based on the formula
[formula image]
where S_{k,j} is a randomly selected individual different from S_{i,j}, S_{gbest,j} is the historical best of individual S_{i,j}, and rand is a random number in [0, 1]; the student individuals are again updated with the non-greedy strategy according to the formula
[formula image]
and the updated individuals enter the next learning process;
step 6: judging whether the maximum number of iterations T has been reached; if so, going to step 7; otherwise, returning to step 4;
step 7: obtaining the student individual with the maximum fitness, which gives the optimal anomaly detection algorithm parameter values.
2. The method for detecting the abnormal traffic of the converged communication network according to claim 1, wherein the acquiring traffic data of the converged communication network further comprises:
and carrying out standardization processing on the flow data.
3. A converged communication network traffic anomaly detection system is characterized by comprising:
the traffic data acquisition module is used for acquiring traffic data of the converged communication network; the traffic data includes: network state data, protocol analysis data, service operation state data and corresponding network flow state; the network status data includes: throughput, packet traffic, delay jitter, and call traffic; the protocol analysis data includes: protocol type, protocol packet length, connection duration, port information, and IP information; the service operation state data comprises service fault information; the network flow state comprises abnormal network flow or normal network flow;
a network traffic anomaly detection model determination module, configured to determine, according to the traffic data and based on K-fold cross-validation, a network traffic anomaly detection model by adopting a non-greedy teaching and learning optimization algorithm; the network anomaly detection model takes network state data, protocol analysis data and service operation state data as input and takes the network traffic state as output;
the network traffic state determining module is used for determining the network traffic state of the converged communication network to be detected by adopting the network anomaly detection model according to the network state data, the protocol analysis data and the service operation state data of the converged communication network to be detected;
the network traffic anomaly detection model determining module specifically includes:
a machine learning algorithm acquisition unit for acquiring a machine learning algorithm; the machine learning algorithm includes: a support vector machine, a decision tree and a neural network;
a parameter optimization unit of the machine learning algorithm, configured to optimize parameters of the machine learning algorithm by adopting the non-greedy teaching and learning optimization algorithm based on K-fold cross-validation; the parameters comprise a penalty coefficient and a kernel width;
the network flow abnormity detection model determining unit is used for determining a network flow abnormity detection model according to the flow data and the optimized parameters of the machine learning algorithm;
the parameter optimization unit of the machine learning algorithm specifically comprises:
the non-greedy teaching and learning optimization algorithm parameter initialization subunit is used for initializing non-greedy teaching and learning optimization algorithm parameters; the non-greedy teaching and learning optimization algorithm parameters comprise the number of students, the maximum number of iterations, a non-greedy coefficient, a search space of the penalty coefficient and a search space of the kernel width;
a fitness first calculation subunit, configured to randomly initialize N student individuals and calculate the fitness of each student individual based on a K-fold cross-validation strategy; each student individual represents the parameters of a machine learning algorithm;
the first-stage learning subunit is used for determining an optimal student individual and a worst student individual according to the fitness and performing first-stage learning based on the optimal student individual and the worst student individual; the optimal student individuals are the student individuals with the maximum fitness; the worst individual student is the individual student with the minimum fitness;
the fitness second calculating subunit is used for calculating the individual fitness of the student after the first-stage learning;
a first updating subunit, configured to update the student individuals based on a non-greedy policy;
the second-stage learning subunit is used for performing second-stage learning according to the historical optimal student individuals and the random student individuals;
the fitness third calculating subunit is used for calculating the individual fitness of the student after the second-stage learning;
the second updating subunit is used for updating the student individuals secondarily based on a non-greedy strategy;
the judging subunit is used for judging whether the maximum iteration number is reached;
the parameter determining subunit is used for determining the optimal student individual as the optimized parameters of the machine learning algorithm if the maximum number of iterations is reached;
the iteration subunit is used for returning, if the maximum number of iterations is not reached, to the step of determining the optimal student individual and the worst student individual according to the fitness and performing the first-stage learning based on the optimal student individual and the worst student individual;
the specific process of performing parameter optimization by adopting the non-greedy teaching and learning optimization algorithm comprises the following steps:
step 1: initializing the non-greedy teaching and learning optimization algorithm, including the number of students N, the maximum number of iterations T, the non-greedy coefficient
[symbol image]
and the search space of the parameters of the anomaly detection algorithm M;
step 2: initializing the student individuals by randomly generating N student individuals S_i = [S_{i,1}, …, S_{i,D}] (i = 1, 2, …, N), where S_{i,1} and S_{i,D} respectively denote the 1st and the D-th parameter of the anomaly detection algorithm represented by the student individual; for discrete-valued parameters, a rounding strategy maps the continuous value to a discrete value;
step 3: calculating the fitness f_i of each student i; using the anomaly detection algorithm parameter values represented by student i, an internal K-fold cross-validation strategy computes the mean (F1avg) and the variance (F1sd) of the accuracy index F1-score of the anomaly detection models trained with those parameters, and the fitness f_i of student i is then calculated according to the formula
[formula image]
step 4: performing the first-stage learning based on the formula
[formula image]
and updating the student individuals with the non-greedy strategy, where rand is a random number in [0, 1], S_i is the original individual from which the formula produces the new individual, S_kbest is the best individual among the current student individuals, S_kworst is the worst individual among the current student individuals, M_i is the mean of all current individuals, T_F1 is the leading factor of the best individual,
[formula image]
T_F2 is the leading factor of the worst individual,
[formula image]
and random(a, b) is a random function that generates a value at random from [a, b]; the non-greedy strategy update is given by the formula
[formula image]
where random_select(.) denotes random selection and greedy_select(.) denotes greedy selection, which always selects the best individual; the updated student individual
[symbol image]
enters the next stage of the learning process;
step 5: performing the second-stage learning based on the formula
[formula image]
where S_{k,j} is a randomly selected individual different from S_{i,j}, S_{gbest,j} is the historical best of individual S_{i,j}, and rand is a random number in [0, 1]; the student individuals are again updated with the non-greedy strategy according to the formula
[formula image]
and the updated individuals enter the next learning process;
step 6: judging whether the maximum number of iterations T has been reached; if so, going to step 7; otherwise, returning to step 4;
step 7: obtaining the student individual with the maximum fitness, which gives the optimal anomaly detection algorithm parameter values.
4. The converged communication network traffic anomaly detection system according to claim 3, further comprising:
and the flow data standardization module is used for carrying out standardization processing on the flow data.
CN202110078910.1A 2021-01-21 2021-01-21 Method and system for detecting abnormal traffic of converged communication network Active CN112396135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110078910.1A CN112396135B (en) 2021-01-21 2021-01-21 Method and system for detecting abnormal traffic of converged communication network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110078910.1A CN112396135B (en) 2021-01-21 2021-01-21 Method and system for detecting abnormal traffic of converged communication network

Publications (2)

Publication Number Publication Date
CN112396135A CN112396135A (en) 2021-02-23
CN112396135B true CN112396135B (en) 2021-06-15

Family

ID=74625633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110078910.1A Active CN112396135B (en) 2021-01-21 2021-01-21 Method and system for detecting abnormal traffic of converged communication network

Country Status (1)

Country Link
CN (1) CN112396135B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113347384B (en) * 2021-08-06 2021-11-05 北京电信易通信息技术股份有限公司 Video conference flow prediction method and system based on time sequence representation learning
CN113627382B (en) * 2021-08-24 2022-02-22 北京电信易通信息技术股份有限公司 User behavior identification method and system for video conference system and storage medium
CN114928555B (en) * 2022-05-12 2024-03-26 浙江上创智能科技有限公司 Fully-mechanized coal mining face display method, device and medium
CN116776148A (en) * 2023-06-15 2023-09-19 江西师范大学 QUIC network abnormal behavior detection method, system and equipment
CN117354066A (en) * 2023-12-06 2024-01-05 吉林省吉能电力通信有限公司 Abnormal data processing system for power communication flow prediction

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104539482A (en) * 2014-12-29 2015-04-22 南京邮电大学 Converged communication network monitoring managing system
CN105491013B (en) * 2015-11-20 2018-11-16 电子科技大学 A kind of multiple-domain network Security Situation Awareness Systems and method based on SDN
EP3324346A1 (en) * 2016-11-19 2018-05-23 Tata Consultancy Services Limited Parallelization approaches of modified teaching learning based search optimization technique for variable selection
CN111757365A (en) * 2020-06-03 2020-10-09 湃方科技(北京)有限责任公司 Abnormal equipment identification method and device in wireless network

Also Published As

Publication number Publication date
CN112396135A (en) 2021-02-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: An anomaly detection method and system for converged communication network traffic

Effective date of registration: 20220712

Granted publication date: 20210615

Pledgee: Beijing technology intellectual property financing Company limited by guarantee

Pledgor: BEIJING TELECOMMUNICATION YITONG INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022990000449

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230627

Granted publication date: 20210615

Pledgee: Beijing technology intellectual property financing Company limited by guarantee

Pledgor: BEIJING TELECOMMUNICATION YITONG INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022990000449

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Method and System of Traffic Anomaly Detection in Converged Communication Network

Effective date of registration: 20230627

Granted publication date: 20210615

Pledgee: Beijing technology intellectual property financing Company limited by guarantee

Pledgor: BEIJING TELECOMMUNICATION YITONG INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2023990000317