CN117118810B

CN117118810B - Network communication abnormity early warning method and system

Info

Publication number: CN117118810B
Application number: CN202311384466.1A
Authority: CN
Inventors: 赵岩; 苏涛; 孙晓娜
Original assignee: Liguo Intelligent Technology Kunshan Co ltd
Current assignee: Liguo Intelligent Technology Kunshan Co ltd
Priority date: 2023-10-25
Filing date: 2023-10-25
Publication date: 2023-12-29
Anticipated expiration: 2043-10-25
Also published as: CN117118810A

Abstract

The invention provides a network communication abnormity early warning method and system, relating to the technical field of data processing. The method comprises: establishing a communication database; carrying out communication node association and generating an abnormal accumulation constraint; executing feature cluster training and optimizing the feature cluster parameters; configuring a master constraint according to the communication tasks, configuring slave constraints by the communication nodes, and setting identification termination conditions; executing cluster iteration training, finishing the clustering when the identification termination conditions are met, and recording the cluster parameters; receiving real-time communication data, executing cluster analysis, and generating cluster danger values; and carrying out continuous communication accumulation evaluation and completing the network communication abnormality early warning according to the evaluation result. The invention addresses the technical problems that traditional methods are inefficient when processing large-scale data, cannot meet real-time and high-efficiency requirements, have limited accuracy in judging abnormalities, and are prone to false alarms or missed alarms, which threatens system safety and reliability.

Description

Network communication abnormity early warning method and system

Technical Field

The invention relates to the technical field of data processing, in particular to a network communication abnormity early warning method and system.

Background

In conventional network communication systems, communication anomalies are monitored and identified mainly by rule-based methods and by methods based on statistical analysis. Rule-based methods rely on a set of predefined rules to determine whether communication is normal; however, the rules usually have to be set manually and may not cover all anomalies in complex communication scenarios. Methods based on statistical analysis detect anomalies by extracting statistical features from the communication data and building models; however, they cannot meet the requirements of large-scale communication data processing and have limited effect on nonlinear and dynamically changing anomalies.

Therefore, traditional methods are inefficient when processing large-scale data, cannot meet real-time and high-efficiency requirements, have limited accuracy in judging anomalies, and are prone to false alarms or missed alarms, which threatens the safety and reliability of the system.

Disclosure of Invention

The application aims to solve the technical problems that traditional methods are inefficient when processing large-scale data, cannot meet real-time and high-efficiency requirements, have limited accuracy in judging anomalies, and are prone to false alarms or missed alarms, which threatens system safety and reliability.

In view of the above problems, the present application provides a network communication anomaly early warning method and system.

In a first aspect of the present disclosure, a method for early warning of network communication abnormality is provided, where the method includes: establishing a communication database, wherein the communication database is used for analyzing communication tasks, setting communication nodes according to analysis results, and matching constructed data sets by taking the communication tasks as matching features and the communication nodes as classification features; carrying out communication node association according to the analysis result to generate abnormal accumulated constraint; respectively executing feature clustering training on databases corresponding to the classified features, optimizing feature clustering parameters, wherein feature clustering is realized by calculating feature distances, and each data in the communication databases is provided with a data state identifier; configuring master constraints according to the communication tasks, configuring slave constraints by the communication nodes, and setting identification termination conditions of each communication node through the master constraints and the slave constraints; performing clustering iterative training on each communication node, and when the clustering result meets the recognition termination condition, ending the clustering and recording the clustering parameters; receiving real-time communication data, and performing cluster analysis on the real-time communication data through cluster parameters to generate a cluster danger value, wherein the cluster parameters are control parameters of nodes corresponding to the real-time communication data; and carrying out continuous communication accumulation evaluation on the clustering dangerous values by using the abnormal accumulation constraint, and completing network communication abnormal early warning according to an evaluation result.

In another aspect of the disclosure, a network communication anomaly early warning system is provided, where the system is used to carry out the above method, and the system includes: a database establishment module, configured to establish a communication database, wherein the communication database is a data set constructed by analyzing communication tasks, setting communication nodes according to the analysis result, and matching with the communication tasks as matching features and the communication nodes as classification features; a communication node association module, configured to carry out communication node association according to the analysis result and generate an abnormal accumulation constraint; a feature cluster training module, configured to execute feature cluster training on the databases corresponding to the classification features respectively and optimize the feature cluster parameters, wherein feature clustering is realized by calculating feature distances, and each data item in the communication database has a data state identifier; a termination condition setting module, configured to configure master constraints according to the communication tasks, configure slave constraints by the communication nodes, and set the identification termination condition of each communication node through the master constraints and the slave constraints; a cluster iteration training module, configured to execute cluster iteration training on each communication node and, when the clustering result meets the identification termination condition, finish the clustering and record the cluster parameters; a cluster analysis module, configured to receive real-time communication data and perform cluster analysis on the real-time communication data through the cluster parameters to generate a cluster danger value, wherein the cluster parameters are the control parameters of the node corresponding to the real-time communication data; and an abnormality early warning module, configured to carry out continuous communication accumulation evaluation on the cluster danger values by using the abnormal accumulation constraint and complete the network communication abnormality early warning according to the evaluation result.

One or more technical solutions provided in the present application have at least the following technical effects or advantages:

by establishing a communication database and carrying out communication node association according to the analysis result, large-scale communication data can be efficiently organized and managed, so that subsequent analysis and processing are more convenient and efficient; the accuracy and stability of anomaly detection can be improved through feature clustering training and parameter optimization, and the association and anomaly between communication nodes can be better identified through feature clustering realized by calculating feature distances; through executing clustering iterative training on each communication node and evaluating according to the set recognition termination condition, accurate anomaly detection and early warning of real-time communication data can be realized, and the recording and the use of clustering parameters enhance the flexibility and the adaptability of the method; by using the abnormal accumulation constraint to carry out continuous communication accumulation evaluation, the potential abnormal situation in the network communication can be tracked and judged more accurately, and the reliability and timeliness of early warning are improved. In summary, the network communication abnormality early warning method solves the technical problems of large-scale data processing, abnormality detection accuracy, stability and the like, so that effective early warning of network communication abnormality is realized, and the safety and reliability of a communication system are improved.

The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are described in detail below.

Drawings

Fig. 1 is a schematic flow chart of a network communication anomaly early warning method provided in an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a network communication anomaly early warning system according to an embodiment of the present application.

Reference numerals illustrate: the system comprises a database establishment module 10, a communication node association module 20, a feature cluster training module 30, a termination condition setting module 40, a cluster iteration training module 50, a cluster analysis module 60 and an abnormality early warning module 70.

Detailed Description

The embodiment of the application addresses the technical problems that traditional methods are inefficient when processing large-scale data, cannot meet real-time and high-efficiency requirements, have limited accuracy in judging anomalies, and are prone to false alarms or missed alarms, which threatens system safety and reliability.

Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.

Example 1

As shown in fig. 1, an embodiment of the present application provides a network communication anomaly early warning method, where the method includes:

establishing a communication database, wherein the communication database is used for analyzing communication tasks, setting communication nodes according to analysis results, and matching constructed data sets by taking the communication tasks as matching features and the communication nodes as classification features;

The communication tasks are analyzed to extract relevant information such as the sender, receiver, communication content and timestamp, and a communication node is set for each communication task according to the analysis result; the communication node may be a communication device, a network node, or the like. With the communication tasks as matching features and the communication nodes as classification features, each communication task is combined with its communication node to form a data item, and the resulting data set is added to the communication database.
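One way to picture the data set described above is a two-level index keyed first by communication node (the classification feature) and then by communication task (the matching feature). The following is a minimal sketch under that assumption; the record fields, the `CommRecord` and `build_database` names, and the sample values are hypothetical and are not taken from the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CommRecord:
    task: str            # matching feature: the communication task
    node: str            # classification feature: the communication node
    sender: str
    receiver: str
    timestamp: float
    payload_bytes: int
    is_abnormal: bool = False  # data state identifier (assumed boolean here)


def build_database(records):
    """Group records by node, then by task (hypothetical layout)."""
    db = defaultdict(lambda: defaultdict(list))
    for rec in records:
        db[rec.node][rec.task].append(rec)
    return db


records = [
    CommRecord("heartbeat", "gateway-1", "a", "b", 0.0, 64),
    CommRecord("heartbeat", "gateway-1", "a", "b", 1.0, 64),
    CommRecord("file-sync", "gateway-2", "c", "d", 2.0, 4096),
]
db = build_database(records)
```

With this layout, `db["gateway-1"]` plays the role of the per-node database on which feature cluster training is later executed.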

Carrying out communication node association according to the analysis result to generate abnormal accumulated constraint;

According to the analysis result, each communication task is associated with its corresponding communication node, for example by matching on the sender and receiver information, communication content and timestamp of the task. After the communication node association is completed, an abnormality threshold is set according to the task requirements. For each communication node, statistical analysis is carried out on the associated communication tasks, for example by calculating indicators such as the frequency, data volume and time delay of the communication tasks. The indicators are compared with their expected values under normal conditions, and the deviation is compared with the set threshold to judge whether an abnormality exists; a statistical indicator that exceeds its threshold can be regarded as abnormal. The abnormal accumulation constraint generated in this way is used for monitoring and evaluating abnormal conditions of the network communication.
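The comparison of per-node indicators against expected values and thresholds can be sketched as follows. The text does not specify how the deviation is measured, so the relative deviation used here is an assumption, as are the function name `deviation_flags`, the indicator names and all numeric values.

```python
def deviation_flags(observed, expected, thresholds):
    """Flag each indicator whose relative deviation from its expected
    value exceeds its threshold (relative deviation is an assumption)."""
    flags = {}
    for name, value in observed.items():
        rel_dev = abs(value - expected[name]) / expected[name]
        flags[name] = rel_dev > thresholds[name]
    return flags


# Hypothetical indicators for one communication node.
observed = {"frequency_hz": 12.0, "bytes_per_msg": 510.0, "delay_ms": 95.0}
expected = {"frequency_hz": 10.0, "bytes_per_msg": 500.0, "delay_ms": 40.0}
thresholds = {"frequency_hz": 0.5, "bytes_per_msg": 0.2, "delay_ms": 0.5}

flags = deviation_flags(observed, expected, thresholds)
```

In this example only the delay indicator deviates by more than its threshold, so only `delay_ms` is flagged.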

Respectively executing feature clustering training on databases corresponding to the classified features, optimizing feature clustering parameters, wherein feature clustering is realized by calculating feature distances, and each data in the communication databases is provided with a data state identifier;

Clustering training is carried out on the databases corresponding to the classification features by using a clustering algorithm such as K-means. The purpose of clustering is to group data items with similar features into one class to form feature clusters, and the similarity between data items is judged during clustering by calculating the distance between their features. By analyzing the clustering results and evaluation indexes, the parameters of feature clustering are optimized; for example, the number of cluster centers, the distance threshold or other related parameters of the clustering algorithm can be adjusted to improve the clustering effect and accuracy.

Each data item in the communication database has a data state identifier indicating whether it is normal data or abnormal data. The identifier can be determined according to predefined rules or by manual labeling, and is used as part of feature cluster training.

Further, the method further comprises the following steps:

analyzing the communication database according to the classification characteristics, and generating the number of characteristic clusters according to the analysis result, wherein the number of characteristic clusters is provided with floating space identifiers;

distributing cluster centers according to the number of the feature clusters, and performing center authentication;

if the center authentication is passed, cluster search is executed through the corresponding cluster center;

and executing convergence adjustment of clustering according to the state identification to complete feature clustering training.

According to the classification features, namely the communication nodes, attributes related to the classification task are extracted from each communication record in the communication database, for example the communication task, timestamp, sender and receiver, and these attributes are analyzed to obtain a plurality of features. The number of feature clusters used for clustering is generated according to the number of features; it represents the number of independent clusters into which the data are grouped.

Meanwhile, in order to cope with the variability of the data, the number of the feature clusters should also have a certain floating space, which means that when the number of the feature clusters is generated, a certain margin is given to the number of the feature clusters in consideration of the uncertainty and variability of the data, so that the method can adapt to the variability of the data under different conditions and ensure the stability and reliability of the clustering result.

According to the generated number of feature clusters, the data are distributed evenly among the clusters: if the number of feature clusters is k, each cluster is initially assigned 1/k of the data. The cluster center is then obtained as the center-point coordinates of each cluster, calculated as the average of the data points in the cluster. Center authentication is performed by calculating the distances between the cluster center and the data points in the cluster, checking that the center point is as close as possible to those data points, and thereby determining whether the cluster center represents the data characteristics of the cluster.
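The even distribution, center calculation and center authentication described above might look like the following sketch. The authentication criterion is not given in the text; a tolerance on the mean member-to-center distance is assumed here, and the names `even_split`, `mean_point`, `authenticate` and `max_mean_dist` are hypothetical.

```python
import math


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def even_split(points, k):
    """Split points into k consecutive, near-equal chunks.
    Assumes len(points) >= k so that no chunk is empty."""
    n = len(points)
    return [points[i * n // k:(i + 1) * n // k] for i in range(k)]


def authenticate(center, members, max_mean_dist):
    """Assumed criterion: the mean distance from the members to the
    center must not exceed a tolerance."""
    mean_dist = sum(math.dist(center, p) for p in members) / len(members)
    return mean_dist <= max_mean_dist


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
chunks = even_split(points, 2)
centers = [mean_point(c) for c in chunks]
passed = [authenticate(c, m, max_mean_dist=1.5) for c, m in zip(centers, chunks)]
```

Centers that fail such a check would not be used as starting points for the subsequent cluster search.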

If the authentication result shows that the center can represent the data characteristics of the corresponding cluster, that is, center authentication is passed, a cluster search is carried out from the authenticated cluster center using the K-means algorithm: the center point is taken as the initial cluster center, each data point is assigned to the nearest cluster center according to its distance from the centers, the center positions are updated, and the iteration is repeated until convergence. In this way the cluster structure of the data can be further organized and identified, and the data points are divided into appropriate clusters.
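The cluster search step is standard K-means started from given centers. A minimal pure-Python sketch is shown here, assuming Euclidean distance as the feature distance; the function name `kmeans` and the sample points are illustrative only.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Plain K-means: assign each point to its nearest center, recompute
    each center as the mean of its members, stop when centers are stable."""
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        new_centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centers[i]  # keep the old center of an empty cluster
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
final_centers, final_clusters = kmeans(points, [(0.0, 1.0), (9.0, 9.0)])
```

Here the second starting center moves from (9, 9) to the mean of its two members, after which the assignment no longer changes and the loop stops.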

According to the convergence condition of the clusters and the indication of the state identification, the clustering result is adjusted and optimized, wherein the adjustment of the clustering parameters is included, namely, according to the feedback of the state identification, the parameter setting of a clustering algorithm is adjusted, for example, the parameters such as the cluster number, the iteration number, the initial clustering center and the like of the K-means algorithm are adjusted so as to optimize the clustering result; the method also comprises the step of correcting the clustering result, namely correcting the clustering result according to the state identification, such as merging or splitting the clustering clusters, adjusting the clustering boundary or the attribution of the data points and the like, so that the clustering result is more reasonable and accurate.

Further, the method further comprises the following steps:

configuring initial clustering gravitation and generating a cluster by using the cluster center;

executing cluster gravitation updating according to the quantity in the clusters, wherein the gravitation updating is obtained through calculating a preset gravitation decreasing coefficient;

when the clustering of any step is completed, the position of the center of the corresponding cluster is updated, and the updated cluster center is used as a new clustering reference;

when any cluster is phagocytized, newly added cluster centers are regenerated in the non-clustered areas;

when the clustering iteration times meet a preset value or no new data exist in all the clustering clusters, ending the clustering;

and evaluating the current clustering result according to the state identifier, and completing convergence adjustment according to the evaluation result.

The initial clustering gravitation determines the gravitation strength of the cluster center, and is configured according to the application requirements and data characteristics so that the clustering process can reasonably distribute data points into the corresponding clusters. Based on the configured initial clustering gravitation, a cluster is generated with the cluster center as the gravitation source. Specifically, the distance between each data point and each cluster center is calculated, each data point is assigned to the nearest cluster center according to the result, and the center position of each cluster is recalculated from the assigned data points; these steps are repeated until the cluster centers no longer change significantly or the preset number of iterations is reached, which completes the generation of the clusters.

For each cluster, the data points assigned to it are counted to obtain the number of data points the cluster contains. A gravitation decreasing coefficient is predefined to measure the rate at which the cluster gravitation changes with the number of data points; this coefficient controls the degree to which the gravitation strength decreases.

The update value of the cluster gravitation is calculated according to the number of data points in the cluster. Specifically, a cluster with more data points has a stronger gravitation, which should not be greatly affected by the decrease; this can be represented by using a smaller gravitation decreasing coefficient or by not decreasing at all. A cluster with fewer data points has a weaker gravitation, and its gravitation strength is decreased to a greater degree; this is represented by using a larger gravitation decreasing coefficient, ensuring that the gravitation strength gradually decreases as the number of data points decreases.

According to the calculated gravitation update value, the gravitation of each cluster is updated by increasing or decreasing its original value, based on the gravitation update calculation and the preset gravitation decreasing coefficient. The gravitation strength of each cluster can thus be adjusted dynamically according to the distribution of the data points, so as to better reflect the importance of the different clusters.
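The text does not give a formula for the gravitation update, so the following is only one possible reading: the preset decreasing coefficient is scaled by the share of data points that a cluster does not hold, so that large clusters lose little gravitation and small clusters lose more. The function `update_gravitation` and its parameters are hypothetical.

```python
def update_gravitation(gravitation, count, total, decay_coeff):
    """Assumed update rule: the effective decrease is decay_coeff scaled by
    (1 - share), where share is the cluster's fraction of all data points."""
    share = count / total if total else 0.0
    return gravitation * (1.0 - decay_coeff * (1.0 - share))


# A cluster holding 80 of 100 points keeps most of its gravitation,
# while one holding 20 of 100 points is reduced more strongly.
large = update_gravitation(1.0, count=80, total=100, decay_coeff=0.5)
small = update_gravitation(1.0, count=20, total=100, decay_coeff=0.5)
```

Under this rule a cluster that holds every data point is not decreased at all, which matches the "not decreasing" case mentioned in the description.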

When the clustering of any step is finished, the cluster center position is updated; the new cluster center position can be calculated as the average of the assigned data points. The updated cluster center is taken as the new clustering reference and as the starting point of the next round of clustering, which means that in the next iteration the updated cluster center is used as the gravitation source to regenerate the cluster. The clustering result can thus be iterated and optimized continuously, so that the cluster centers better reflect the distribution of the data points.

At the end of each clustering step, the relative positions and boundaries of the clusters are analyzed to detect whether any cluster has been completely surrounded by, or engulfed (phagocytized) by, another cluster. Areas not covered by any cluster are determined to be non-clustered areas; these are areas far from the existing clusters or areas to which too few data points have been assigned. A new cluster center is regenerated in the non-clustered area, either at a random position within that area or at a position determined from the distances of the data points in that area. The newly generated cluster center is added to the clustering result and takes part in the subsequent cluster assignment and center updating, which prevents data points from being ignored or wrongly assigned to other clusters.
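As a rough illustration of regenerating a center after a cluster is engulfed (phagocytized), the sketch below treats a cluster that ends a step with no members as engulfed, and places its new center at the data point farthest from every surviving center. Both choices are assumptions standing in for the boundary analysis and non-clustered-area detection described above; `reseed_empty_clusters` is a hypothetical name.

```python
import math


def reseed_empty_clusters(points, centers, clusters):
    """Replace the center of each empty cluster with the data point that is
    farthest from its nearest surviving center (assumed heuristic).
    Assumes at least one cluster is non-empty."""
    centers = list(centers)
    live = [c for c, members in zip(centers, clusters) if members]
    for i, members in enumerate(clusters):
        if members:
            continue
        new_center = max(points, key=lambda p: min(math.dist(p, c) for c in live))
        centers[i] = new_center
        live.append(new_center)  # later reseeds avoid this position
    return centers


points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0)]
centers = [(0.5, 0.0), (100.0, 100.0)]
clusters = [[(0.0, 0.0), (1.0, 0.0), (9.0, 9.0)], []]  # second cluster lost all points
reseeded = reseed_empty_clusters(points, centers, clusters)
```

After reseeding, the next assignment step would attach the outlying point (9, 9) to the regenerated center instead of the distant first cluster.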

Before the clustering process starts, a preset number of clustering iterations, for example 500, is set according to the size of the data set and the application requirements; this value indicates how many rounds of iteration the algorithm performs, and the clustering process ends when the number of iterations reaches it. Alternatively, if in a clustering iteration no cluster receives any new data points, that is, all data points have already been assigned to clusters, the clustering process can be judged to have converged; no further data points need to be clustered, and the clustering process likewise ends.

When either condition is met, that is, when the preset number of clustering iterations is reached or no cluster has newly added data points, the clustering process is regarded as complete and can be terminated. This ensures that the clustering process reaches a certain stability and convergence, and avoids infinite iteration or iteration that yields no further improvement.

Evaluating the current clustering result according to the defined state identifier, and judging that the clustering has converged without further adjustment if the clustering result is evaluated to be high-quality and stable; if the clustering result is rated as low quality or unstable, the current clustering algorithm parameters, data preprocessing or other aspects need to be adjusted, and the clustering process is operated again until the preset evaluation standard is met. Therefore, the clustering result can be ensured to reach certain quality and stability, and the application requirement is met.

Further, the method further comprises the following steps:

setting a sealing variable, wherein the sealing variable is any three parameters of initial clustering gravitation, cluster center position updating constraint, cluster center number and preset gravitation decreasing coefficient;

according to the sealing variables, parameter sealing is carried out, direction testing is carried out on the activity parameters, all sealing possibilities are traversed, and initial clustering gravitation, cluster center position updating constraint, cluster center number and optimizing direction of preset gravitation decreasing coefficients are determined;

and carrying out optimizing adjustment on parameters according to the optimizing direction so as to finish convergence adjustment.

Any three parameters of initial clustering gravitation, cluster center position updating constraint, cluster center number and preset gravitation decreasing coefficient are selected as sealing variables, and the rest are used as active parameters, so that certain parameters can be selectively sealed, and tuning and convergence adjustment can be more flexibly carried out in the clustering process.

According to the defined sealing variables, the corresponding parameters are sealed, which means that their values are kept unchanged in the subsequent steps and are no longer adjusted. The remaining active parameters, namely the parameters that are not sealed, are then subjected to a direction test: their values are changed and the change in the clustering result is observed, so as to determine the optimizing direction of each parameter. In the direction test, all possible sealing combinations are traversed; for each sealing combination, the active parameter is varied within a predefined range and the quality of the clustering result is evaluated. Based on the results of the direction test and the evaluation of the clustering results, the optimizing directions of the initial clustering gravitation, the cluster center position update constraint, the number of cluster centers and the preset gravitation decreasing coefficient are determined according to the quality, stability or other preset standards of the clustering results. In this way the optimal parameter combination can be found, so that the clustering result achieves better quality and stability.
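Sealing any three of the four parameters leaves exactly one active parameter, so traversing all sealing combinations amounts to testing each parameter in turn while the others are held fixed. The sketch below assumes a caller-supplied `score` function in which a higher value means a better clustering result; the parameter names, step values and toy score are hypothetical.

```python
from itertools import combinations

PARAMS = ("initial_gravitation", "center_update_limit", "num_centers", "gravitation_decay")


def direction_test(params, steps, score):
    """For each way of sealing three parameters, nudge the one active
    parameter up and down and record which direction improves the score
    (+1 up, -1 down, 0 if neither beats the current value)."""
    directions = {}
    for sealed in combinations(PARAMS, 3):
        (active,) = set(PARAMS) - set(sealed)
        base = score(params)
        up = score({**params, active: params[active] + steps[active]})
        down = score({**params, active: params[active] - steps[active]})
        best = max(base, up, down)
        directions[active] = 0 if best == base else (1 if best == up else -1)
    return directions


def toy_score(p):
    """Stand-in for a clustering quality measure, peaked at known values."""
    return -((p["initial_gravitation"] - 2.0) ** 2 + (p["center_update_limit"] - 1.0) ** 2
             + (p["num_centers"] - 3) ** 2 + (p["gravitation_decay"] - 0.1) ** 2)


params = {"initial_gravitation": 1.0, "center_update_limit": 1.0,
          "num_centers": 5, "gravitation_decay": 0.1}
steps = {"initial_gravitation": 0.5, "center_update_limit": 0.25,
         "num_centers": 1, "gravitation_decay": 0.05}
directions = direction_test(params, steps, toy_score)
```

In a real run `score` would re-run the clustering with the candidate parameters and evaluate the result against the data state identifiers.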

And correspondingly adjusting the activity parameters according to the determined optimizing direction, which means changing the values of the parameters in the optimizing direction according to the result of the direction test. The adjusted parameters are applied to a clustering algorithm and the clustering process is restarted, including operations such as re-initializing clusters using new parameter settings, assigning data points to clusters, and updating a cluster center. And evaluating the new clustering result to determine whether better clustering quality or stability is obtained, if the clustering result does not meet the preset requirement, continuing the iterative adjustment process, further improving parameter setting according to the evaluation result, and repeating the steps until the expected convergence effect is achieved. Therefore, the parameter configuration can be optimized in the clustering process, so that the clustering result is more accurate and stable.

Further, the method further comprises the following steps:

three types of verification conversion step sizes of all the parameters are configured according to the initial clustering gravitation, the cluster center position updating constraint, the number of the cluster centers and the parameter characteristics of a preset gravitation decreasing coefficient;

and authenticating the direction test through three types of verification transformation step sizes, and determining the optimizing direction according to an authentication result.

And analyzing the parameters of initial clustering gravitation, cluster center position updating constraint, cluster center number and preset gravitation decreasing coefficient, and acquiring the characteristics and influence on a clustering result according to the value range and sensitivity of the parameters and the relation between the parameters and a clustering algorithm.

Three types of verification conversion step sizes are configured for each parameter according to the parameter characteristics, namely, the step sizes are divided into small, medium and large types, and the specific value of each type of step size is determined according to the range and the sensitivity of the parameter, wherein the small step size is used for carrying out tiny adjustment on the parameter, and tiny increase or decrease is usually carried out in the parameter value range; the middle step length is used for moderately adjusting parameters, and can be increased or reduced in a larger range without exceeding reasonable limit; the large step size is used for carrying out larger adjustment on parameters, searching wider parameter space and increasing and decreasing in a larger range.

The three configured verification transformation step sizes (small, medium and large) are applied to each parameter, so that the parameter is adjusted slightly, moderately and greatly in turn. For each parameter and each step size, the clustering algorithm is run, the change in the clustering result is observed, and the quality, stability or other evaluation indexes of the clustering result are recorded; the authentication result is obtained by analyzing and comparing these records. The optimizing direction of each parameter is determined according to the authentication result by selecting the step size that performed best in the direction test; for example, if a parameter obtains the best clustering quality and stability under a given step size, that step size is taken as the optimizing direction of the parameter. In this way the best adjustment mode of each parameter under the different step sizes can be found, so that better clustering quality and stability are obtained.
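The three-step-size authentication can be sketched as a small search over labeled step sizes for one parameter. As before, `score` is an assumed caller-supplied quality measure (higher is better), and the function name `pick_step`, the step values and the toy score are illustrative only.

```python
def pick_step(params, name, step_sizes, score):
    """Try each labeled step size in both directions for one parameter and
    return the label with the best score, plus the per-label scores."""
    results = {}
    for label, step in step_sizes.items():
        candidates = (
            {**params, name: params[name] + step},
            {**params, name: params[name] - step},
        )
        results[label] = max(score(c) for c in candidates)
    best = max(results, key=results.get)
    return best, results


def toy_score(p):
    """Stand-in quality measure with its optimum at x = 3."""
    return -((p["x"] - 3.0) ** 2)


step_sizes = {"small": 0.1, "medium": 0.5, "large": 2.0}
best_label, scores = pick_step({"x": 1.0}, "x", step_sizes, toy_score)
```

Because the starting value is far from the optimum in this toy case, the large step wins; near the optimum the small step would score best instead.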

Configuring master constraints according to the communication tasks, configuring slave constraints by the communication nodes, and setting identification termination conditions of each communication node through the master constraints and the slave constraints;

A master constraint is configured according to the importance, characteristics and other requirements of the communication task. The master constraint is a global requirement on the whole communication system and includes limits on data transmission rate, time delay, reliability and the like. For each communication node, a corresponding slave constraint is configured according to factors such as its function, position and task requirements. The slave constraint is a local requirement of the individual node, such as power consumption, bandwidth occupation or security.

When the identification termination condition of each communication node is set, the master constraint and the slave constraint are considered together, and a reasonable termination condition is determined from the overall performance requirement of the system and the local performance requirement of each node. These conditions may be thresholds on data transmission rate, latency or other performance metrics; when a communication node reaches or exceeds these thresholds, the identification termination condition is triggered. Setting constraints in this way ensures that the communication system can control and manage the performance of each node while meeting the task requirements.
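One possible way to combine the two constraint levels into per-node thresholds is sketched below. The text does not specify how the constraints are merged, so this example assumes that every constraint is an upper bound and that the stricter bound wins when both levels constrain the same metric. The metric names and the functions `merge_constraints` and `termination_triggered` are hypothetical.

```python
from typing import Dict


def merge_constraints(master: Dict[str, float], slave: Dict[str, float]) -> Dict[str, float]:
    """Combine global (master) and node-local (slave) upper bounds.

    Assumption: every constraint is an upper-bound threshold, and where both
    levels constrain the same metric the stricter (smaller) bound wins.
    """
    merged = dict(master)
    for metric, bound in slave.items():
        merged[metric] = min(bound, merged.get(metric, bound))
    return merged


def termination_triggered(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """Return True when any observed metric reaches or exceeds its threshold."""
    return any(
        metrics[name] >= bound for name, bound in thresholds.items() if name in metrics
    )


master = {"latency_ms": 50.0, "loss_rate": 0.02}       # system-wide limits (illustrative)
slave = {"latency_ms": 30.0, "bandwidth_mbps": 100.0}  # limits for one node (illustrative)
thresholds = merge_constraints(master, slave)
```

Here the node inherits the system-wide loss-rate limit, keeps its own bandwidth limit, and uses the tighter of the two latency limits.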

Performing clustering iterative training on each communication node, and when the clustering result meets the identification termination condition, ending the clustering and recording the clustering parameters;

For each communication node, the corresponding data set is acquired from the communication database; the data set includes the communication tasks, feature information and the like related to the node. Initial clustering parameters, such as the number of cluster centers and the distance measurement method, are set for each communication node. Clustering iterative training is then performed on the data set of each communication node using the clustering algorithm: in each iteration the data are assigned to clusters according to the current clustering parameters, and the cluster center positions are updated.

After each iteration, whether the current clustering result meets the defined identification termination condition is judged, and clustering ends once the condition is met. When clustering ends, the final clustering parameters, including the cluster center positions, the distance measurement method and the like, are recorded for use in subsequent anomaly detection.
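The iterative training loop can be illustrated with a small sketch. It assumes a k-means-style update (each center is moved to the arithmetic mean of its assigned points) with an unweighted Manhattan distance, and it represents the identification termination condition as a hypothetical callback `should_stop(iteration, shift)`; none of these choices is prescribed by the text.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def manhattan(p: Sequence[float], q: Sequence[float]) -> float:
    """Unweighted Manhattan distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def train_clusters(
    data: List[Point],
    centers: List[Point],
    should_stop: Callable[[int, float], bool],
    max_iter: int = 100,
) -> List[Point]:
    """Assign points to the nearest center, recompute centers, repeat until should_stop holds."""
    for iteration in range(1, max_iter + 1):
        groups: List[List[Point]] = [[] for _ in centers]
        for point in data:
            nearest = min(range(len(centers)), key=lambda k: manhattan(point, centers[k]))
            groups[nearest].append(point)
        # Move each center to the mean of its group; keep it in place if the group is empty.
        new_centers = [
            tuple(sum(dim) / len(group) for dim in zip(*group)) if group else center
            for group, center in zip(groups, centers)
        ]
        shift = max(manhattan(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers
        if should_stop(iteration, shift):
            break
    return centers


data = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
final_centers = train_clusters(
    data,
    centers=[(0.0, 0.0), (10.0, 10.0)],
    should_stop=lambda iteration, shift: shift == 0.0,
)
```

The returned centers, together with the chosen distance measure, correspond to the clustering parameters that are recorded for later use.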

Receiving real-time communication data, and performing cluster analysis on the real-time communication data through cluster parameters to generate a cluster danger value, wherein the cluster parameters are control parameters of nodes corresponding to the real-time communication data;

Real-time communication data is acquired from the communication system; the data includes information such as sender, receiver, data content and time stamp. The communication node to which the data belongs is determined from this information, and the clustering parameters recorded for that node in the previous step are retrieved to control the cluster analysis. Using these clustering parameters, the real-time data is clustered according to the recorded distance measurement method and cluster center positions.

Based on the result of the cluster analysis, a danger value is calculated for each cluster. The danger value can be determined from the distribution of the data in the cluster, its degree of abnormality and the like; a higher danger value indicates that the data in the cluster is more likely to be abnormal. Corresponding measures can then be taken according to the generated cluster danger values. For example, clusters whose danger value exceeds a threshold can be marked as potential anomalies for further anomaly detection, early warning or processing.
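The text leaves the exact definition of the danger value open. As one hedged illustration, the sketch below scores a real-time record by its Manhattan distance to the nearest recorded cluster center, divided by a per-cluster spread. The function `danger_value`, the `radii` values and the sample records are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple


def manhattan(p: Sequence[float], q: Sequence[float]) -> float:
    """Unweighted Manhattan distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def danger_value(
    record: Sequence[float],
    centers: List[Tuple[float, ...]],
    radii: List[float],
) -> Tuple[int, float]:
    """Return (index of nearest center, distance to it divided by that cluster's radius)."""
    nearest = min(range(len(centers)), key=lambda k: manhattan(record, centers[k]))
    return nearest, manhattan(record, centers[nearest]) / radii[nearest]


centers = [(1.0, 1.5), (8.5, 8.0)]  # cluster centers recorded during training
radii = [2.0, 2.0]                  # hypothetical per-cluster spread
normal = danger_value((1.0, 2.5), centers, radii)
suspect = danger_value((5.0, 4.0), centers, radii)
```

Under this definition a value well above 1 means the record lies far outside the usual spread of its nearest cluster, so `suspect` would be a candidate for marking as a potential anomaly.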

Carrying out continuous communication accumulation evaluation on the cluster danger values by using the abnormal accumulation constraint, and completing the network communication abnormality early warning according to the evaluation result.

To carry out continuous communication accumulation evaluation on the cluster danger values with the abnormal accumulation constraint, a time window is set to store the cluster danger values over a certain period; the size of the time window can be set according to actual requirements. The cluster danger values within the time window are accumulated, either by simple summation or by calculating an average, and the result is compared with the abnormal accumulation constraint. If the result exceeds the threshold, network communication is judged to be abnormal and a corresponding evaluation result is generated.

Network communication abnormality early warning is then carried out according to the generated evaluation result, for example by triggering an alarm, sending a notification or taking other appropriate action.
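The windowed accumulation can be sketched as below. For simplicity the window is counted in number of samples rather than elapsed time, and the abnormal accumulation constraint is represented by a single numeric `limit`; the class name `AccumulationEvaluator` and all values are illustrative assumptions.

```python
from collections import deque
from typing import Deque


class AccumulationEvaluator:
    """Keep the most recent danger values and compare their sum with a limit."""

    def __init__(self, window_size: int, limit: float) -> None:
        # deque(maxlen=...) silently drops the oldest value once the window is full.
        self.values: Deque[float] = deque(maxlen=window_size)
        self.limit = limit  # stands in for the abnormal accumulation constraint

    def update(self, danger: float) -> bool:
        """Add one danger value; return True when the windowed sum exceeds the limit."""
        self.values.append(danger)
        return sum(self.values) > self.limit


evaluator = AccumulationEvaluator(window_size=3, limit=5.0)
alerts = [evaluator.update(v) for v in (0.5, 1.0, 3.0, 3.25, 0.25)]
```

A `True` entry in `alerts` corresponds to an evaluation result that would trigger the early warning; replacing `sum` with an average would implement the other accumulation option mentioned in the text.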

Further, the method further comprises the following steps:

configuring the multidimensional feature weights of the data features, and carrying out feature distance calculation by the following formula:

$$d(x_a, x_b) = \sum_{i=1}^{n} w_i \left| x_{ai} - x_{bi} \right|$$

wherein $d(x_a, x_b)$ is the Manhattan distance between $x_a$ and $x_b$, $x_{ai}$ characterizes the a-th argument of the Manhattan distance under the i-th dimension feature, $x_{bi}$ characterizes the b-th argument of the Manhattan distance under the i-th dimension feature, $i$ is the feature dimension, $w_i$ is the feature weight of the i-th dimension feature, and $n$ is the total feature dimension.

The feature distance formula takes the weights of the different feature dimensions into account, so the similarity between data can be measured more accurately. By configuring the feature weights, the influence of each feature on the distance calculation can be adjusted according to actual requirements; for example, important features can be given higher weights so that they have a greater influence on the distance.

When the formula is used, the distance between data is calculated from the feature values of the data and the configured feature weights. This distance calculation can be used in the clustering process to better analyze the data, and the specific feature weights and constant values can be set and adjusted according to the application scenario and requirements.
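A minimal sketch of a weighted Manhattan distance follows. It assumes the distance is the weighted sum of per-dimension absolute differences, which is the form suggested by the symbol definitions above; the function name `weighted_manhattan` and the sample weights are hypothetical.

```python
from typing import Sequence


def weighted_manhattan(
    x_a: Sequence[float],
    x_b: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Sum over dimensions of weight * |x_a[i] - x_b[i]|."""
    if not (len(x_a) == len(x_b) == len(weights)):
        raise ValueError("all inputs must have the same number of dimensions")
    return sum(w * abs(a - b) for w, a, b in zip(weights, x_a, x_b))


# The first feature is treated as twice as important as each of the other two.
distance = weighted_manhattan([1.0, 4.0, 2.0], [3.0, 1.0, 2.5], weights=[0.5, 0.25, 0.25])
```

Raising the weight of a dimension makes differences in that feature contribute more to the distance, which is how important features gain influence during clustering.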

Further, the method further comprises the following steps:

triggering a danger marking instruction if the evaluation result meets a preset danger threshold;

and generating a shrinkage judgment association through the danger marking instruction, and completing monitoring and identification of subsequent continuous communication accumulated evaluation according to the shrinkage judgment association.

In the continuous communication accumulation evaluation process, each communication is evaluated. If the evaluation result exceeds the preset danger threshold, the communication has a potential abnormal condition; in this case the system triggers a danger marking instruction, which marks the potential abnormal condition of the current communication.

The danger marking instruction generates a shrinkage judgment association, which establishes the relation between the communication evaluation result and subsequent continuous communication; the relation includes information such as time stamps, communication characteristics and related events. Based on the generated shrinkage judgment association, the system monitors and identifies subsequent continuous communication to quickly check whether the communication continues to show the same danger characteristics or whether further abnormal conditions occur. By comparison and analysis against the shrinkage judgment association, possible dangerous conditions can be found and handled in time. When an abnormal condition exists in subsequent communication, the system takes corresponding measures, such as stopping the communication or adjusting communication parameters, notifying related personnel to intervene, or recording logs, to ensure the safety and reliability of the communication.
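How the shrinkage judgment association is represented is not specified. The sketch below is one hypothetical reading: a danger mark stores the time stamp, node and features of the flagged communication, and a later communication on the same node is treated as repeating the danger when its features lie within a tolerance of the marked ones. The names `DangerMark` and `repeats_danger`, the node identifiers and the tolerance are all assumptions.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class DangerMark:
    """Hypothetical record created when a danger marking instruction fires."""

    timestamp: float
    node_id: str
    features: Tuple[float, ...]


def repeats_danger(
    mark: DangerMark,
    node_id: str,
    features: Sequence[float],
    tolerance: float,
) -> bool:
    """True when a later communication on the same node is within `tolerance`
    (Manhattan distance) of the marked features."""
    if node_id != mark.node_id:
        return False
    return sum(abs(a - b) for a, b in zip(mark.features, features)) <= tolerance


mark = DangerMark(timestamp=1700000000.0, node_id="node-7", features=(5.0, 4.0))
same_pattern = repeats_danger(mark, "node-7", (5.5, 4.0), tolerance=1.0)
other_node = repeats_danger(mark, "node-2", (5.0, 4.0), tolerance=1.0)
```

A positive match would be the point at which the system applies the handling measures described above, such as adjusting communication parameters or notifying personnel.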

In summary, the network communication abnormality early warning method and system provided by the embodiments of the present application have the following technical effects:

1. By establishing a communication database and carrying out communication node association according to the analysis result, large-scale communication data can be organized and managed efficiently, making subsequent analysis and processing more convenient and efficient.

2. Feature clustering training and parameter optimization improve the accuracy and stability of anomaly detection, and feature clustering based on calculated feature distances makes it easier to identify associations and anomalies between communication nodes.

3. By executing clustering iterative training on each communication node and evaluating against the set identification termination condition, accurate anomaly detection and early warning of real-time communication data can be realized, and recording and reusing the clustering parameters enhances the flexibility and adaptability of the method.

4. By using the abnormal accumulation constraint to carry out continuous communication accumulation evaluation, potential abnormal situations in network communication can be tracked and judged more accurately, improving the reliability and timeliness of early warning.

Overall, the network communication abnormality early warning method addresses the technical problems of large-scale data processing efficiency and of abnormality detection accuracy and stability, thereby realizing effective early warning of network communication abnormality and improving the safety and reliability of the communication system.

Example two

Based on the same inventive concept as the network communication abnormality pre-warning method in the foregoing embodiment, as shown in fig. 2, the present application provides a network communication abnormality pre-warning system, which includes:

the database establishment module 10 is configured to establish a communication database, where the communication database is a data set constructed, after a communication task is analyzed and communication nodes are set according to the analysis result, by matching with the communication task as the matching feature and the communication node as the classification feature;

the communication node association module 20 is configured to carry out communication node association according to the analysis result and generate an abnormal accumulation constraint;

the feature clustering training module 30 is configured to execute feature clustering training on the databases corresponding to the respective classification features and optimize the feature clustering parameters, wherein feature clustering is realized by calculating feature distances, and each data item in the communication database has a data state identifier;

the termination condition setting module 40 is configured to configure a master constraint according to the communication task, configure slave constraints with the communication nodes, and set the identification termination condition of each communication node through the master constraint and the slave constraints;

the clustering iteration training module 50 is configured to execute clustering iterative training on each communication node, and when the clustering result meets the identification termination condition, end the clustering and record the clustering parameters;

the cluster analysis module 60 is configured to receive real-time communication data and perform cluster analysis on the real-time communication data through the clustering parameters to generate a cluster danger value, where the clustering parameters are control parameters of the node corresponding to the real-time communication data;

the abnormality early warning module 70 is configured to carry out continuous communication accumulation evaluation on the cluster danger values with the abnormal accumulation constraint, and complete network communication abnormality early warning according to the evaluation result.

Further, the system also comprises a characteristic distance calculation module for executing the following operation steps:

Further, the system also comprises a feature cluster training module for executing the following operation steps:

Further, the system also comprises a convergence adjusting module for executing the following operation steps:

Further, the system further comprises an optimizing direction determining module for executing the following operation steps:

Further, the system also comprises a monitoring and identifying module for executing the following operation steps:

From the foregoing description of the network communication abnormality early warning method, those skilled in the art can clearly understand the network communication abnormality early warning method and system of this embodiment. Because the apparatus disclosed in this embodiment corresponds to the method, its description is relatively brief; for relevant details, refer to the description of the method.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. The network communication abnormality early warning method is characterized by comprising the following steps:

2. The method of claim 1, wherein the method further comprises:

3. The method of claim 1, wherein the method further comprises:

4. A method as claimed in claim 3, wherein the method further comprises:

calculating the distance between each data point and the center of each cluster, distributing the data points to the nearest cluster center according to the calculation result, and recalculating the center point position of each cluster according to the distributed data points;

5. The method of claim 4, wherein the method further comprises:

6. The method of claim 5, wherein the method further comprises:

7. The method of claim 1, wherein the method further comprises:

8. A network communication anomaly early warning system, characterized by being configured to implement a network communication anomaly early warning method as claimed in any one of claims 1 to 7, comprising:

the database establishment module is used for establishing a communication database, wherein the communication database is a data set constructed, after a communication task is analyzed and communication nodes are set according to the analysis result, by matching with the communication task as the matching feature and the communication node as the classification feature;

the communication node association module is used for carrying out communication node association according to the analysis result and generating abnormal accumulated constraint;

the feature clustering training module is used for respectively executing feature clustering training on databases corresponding to the classified features, optimizing feature clustering parameters, realizing feature clustering by calculating feature distances, and enabling each data in the communication databases to have a data state identifier;

the termination condition setting module is used for configuring master constraints according to the communication tasks, configuring slave constraints by the communication nodes, and setting identification termination conditions of all the communication nodes through the master constraints and the slave constraints;

the clustering iteration training module is used for executing clustering iteration training on each communication node, and when the clustering result meets the identification termination condition, clustering is finished and clustering parameters are recorded;

the cluster analysis module is used for receiving the real-time communication data, performing cluster analysis on the real-time communication data through cluster parameters and generating a cluster danger value, wherein the cluster parameters are control parameters of nodes corresponding to the real-time communication data;

and the abnormality early warning module is used for carrying out continuous communication accumulation evaluation on the cluster danger values by using the abnormal accumulation constraint and completing network communication abnormality early warning according to the evaluation result.