CN118568426A - Event noise reduction convergence method and system based on attack load clustering - Google Patents

Publication number
CN118568426A
Authority
CN
China
Prior art keywords
model
event
noise reduction
data
clustering
Legal status
Pending
Application number
CN202410694711.7A
Other languages
Chinese (zh)
Inventor
宋宇宸
房玉东
龙成
李鑫泉
田小龙
马虹斌
陈保江
Current Assignee
Big Data Center Of Emergency Management Department
Original Assignee
Big Data Center Of Emergency Management Department

Abstract

The invention discloses an event noise reduction convergence method and system based on attack load clustering, relating to the technical field of data processing. The method comprises the following steps: collecting historical event load data and extracting a key feature word list by adopting a noise reduction convergence model; after noise reduction processing, obtaining a clustering output result by adopting a clustering algorithm; generating an event baseline; collecting real-time event load data and matching it against the event baseline by means of a feature matching model; if no match is found, outputting an abnormal risk mark; carrying out validity analysis on the security triage result based on a normal event behavior recognition module; generating a parameter adjustment instruction; and performing iterative correction through a URL baseline filtering model. The method solves the technical problem that, in existing event noise reduction convergence, inaccurate acquisition of load data features makes the noise reduction model inaccurate and the event noise reduction convergence result low in accuracy and reliability, and it achieves the technical effect of improving the accuracy and reliability of the event noise reduction convergence result.

Description

Event noise reduction convergence method and system based on attack load clustering
Technical Field
The application relates to the technical field of data processing, in particular to an event noise reduction convergence method and system based on attack load clustering.
Background
In modern society, as informatization deepens, people are confronted with an ever-growing volume of data and information. This data contains valuable information but is also mixed with a large amount of noise and redundancy. In the field of event monitoring and early warning in particular, extracting key information from massive data is important for improving the efficiency and accuracy of event processing. Attack load clustering groups data points with similar load features into the same category according to the intrinsic characteristics and similarity of the data, so that important information related to an event can be effectively identified and separated and noise data unrelated to the event can be eliminated. In the traditional event noise reduction convergence process, however, the data often contains a large amount of noise and many outliers for various reasons, so the key features of an event cannot be obtained accurately. The noise and outliers interfere with accurate judgment and analysis of the event and reduce the accuracy of the event noise reduction result.
Therefore, the prior art of event noise reduction convergence suffers from the technical problem that inaccurate acquisition of load data features makes the noise reduction model inaccurate and the event noise reduction convergence result low in accuracy and reliability.
Disclosure of Invention
The event noise reduction convergence method and system based on attack load clustering provided by the application adopt technical means such as feature extraction and matching and the construction of a noise reduction convergence model. They solve the technical problem that, in existing event noise reduction convergence, inaccurate acquisition of load data features makes the noise reduction model inaccurate and the event noise reduction convergence result low in accuracy and reliability, and they achieve the technical effect of improving the accuracy and reliability of the event noise reduction convergence result.
The application provides an event noise reduction convergence method based on attack load clustering, which comprises the following steps: connecting a security management center, collecting historical event load data, carrying out noise reduction processing on the historical event load data by adopting a noise reduction convergence model, and extracting a key feature word list; obtaining a clustering output result by applying a clustering algorithm to the historical event load data after the noise reduction processing; generating an event baseline from the clustering output result and the key feature word list; collecting real-time event load data, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model, extracting key feature words, and matching them against the event baseline by means of a feature matching model; if no match is found, outputting an abnormal risk mark and carrying out security triage through the abnormal risk mark; carrying out validity analysis on the security triage result based on a normal event behavior recognition module, and generating a data tag through validity analysis scoring, wherein the data tag comprises a normal tag and an abnormal tag; generating a parameter adjustment instruction based on the data tag; and adjusting parameters of the key feature word list and parameters of the clustering algorithm through the parameter adjustment instruction, and iteratively correcting the event baseline through a URL baseline filtering model.
In a possible implementation manner, carrying out security triage through the abnormal risk mark includes the following processing: establishing a vulnerability scanning detection model based on a machine learning algorithm, wherein the vulnerability scanning detection model comprises an event-type quantity feature identification scanning layer and a trigger frequency feature identification scanning layer; and carrying out security triage through the abnormal risk mark based on the vulnerability scanning detection model to determine high-confidence attack behaviors.
In a possible implementation manner, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model and extracting key feature words includes the following processing: configuring an add-delete-modify-query port based on an algorithm model management module, the port being used for managing add, delete, modify and query requests; configuring a start-stop port based on the algorithm model management module, the start-stop port being used for managing start-stop requests; and configuring a data source interface based on the algorithm model management module, the data source interface being used for managing data acquisition requests.
In a possible implementation manner, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model and extracting key feature words further includes the following processing: connecting a task management module, the task management module being used for task registration, task allocation and task execution; and providing, based on the task management module, request response management corresponding to start-stop task registration for the add-delete-modify-query port, the start-stop port and the data source interface.
In a possible implementation manner, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model and extracting key feature words further includes the following processing: communicatively connecting the task management module to a configuration management module, the configuration management module supporting calls to the add-delete-modify-query port, the start-stop port and the data source interface; and, in the configuration management module, supporting acquisition of a new model from the data source and synchronizing the new model to any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
In a possible implementation manner, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model and extracting key feature words further includes the following processing: acquiring basic management data of the vulnerability scanning detection model, wherein the basic management data comprises a model ID, a model category and a model name; obtaining a model creation time, a model update time and a model state; and generating a model management flow from the model creation time, the model update time, the model state and the basic management data.
In a possible implementation manner, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model and extracting key feature words further includes the following processing: calling the start-stop port in a model operation management platform to perform start-stop operation management; calling the add-delete-modify-query port in the model operation management platform to perform add, delete, modify and query operation management; and acquiring the data source in the model operation management platform in combination with the model management flow, wherein the data source interface is used for importing the data source into any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
The application also provides an event noise reduction convergence system based on attack load clustering, which comprises:
The historical key feature word extraction module is used for connecting a security management center, collecting historical event load data, carrying out noise reduction processing on the historical event load data by adopting a noise reduction convergence model, and extracting a key feature word list;
The event baseline generation module is used for obtaining a clustering output result by applying a clustering algorithm to the historical event load data after the noise reduction processing, and generating an event baseline from the clustering output result and the key feature word list;
The feature matching module is used for collecting real-time event load data, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model, extracting key feature words, and matching them against the event baseline by means of a feature matching model;
The security triage module is used for outputting an abnormal risk mark if no match is found, carrying out validity analysis on the security triage result based on the normal event behavior recognition module, and generating a data tag through validity analysis scoring, wherein the data tag comprises a normal tag and an abnormal tag;
The event baseline iteration correction module is used for adjusting the parameters of the key feature word list and the parameters of the clustering algorithm through the parameter adjustment instruction, and iteratively correcting the event baseline through a URL baseline filtering model.
The event noise reduction convergence method and system based on attack load clustering collect historical event load data and extract a key feature word list by adopting a noise reduction convergence model; after noise reduction processing, obtain a clustering output result by adopting a clustering algorithm; generate an event baseline; collect real-time event load data and match it against the event baseline by means of a feature matching model; output an abnormal risk mark if no match is found; carry out validity analysis on the security triage result based on the normal event behavior recognition module; generate a parameter adjustment instruction; and perform iterative correction through a URL baseline filtering model. This solves the technical problem that, in existing event noise reduction convergence, inaccurate acquisition of load data features makes the noise reduction model inaccurate and the event noise reduction convergence result low in accuracy and reliability, and achieves the technical effect of improving the accuracy and reliability of the event noise reduction convergence result.
Drawings
In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly described below. Flowcharts are used in these drawings to illustrate operations performed by a system according to embodiments of the present disclosure. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the steps may be processed in reverse order or simultaneously, as desired. Also, other operations may be added to or removed from these processes.
Fig. 1 is a schematic flow chart of an event noise reduction convergence method based on attack load clustering according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an event noise reduction convergence system based on attack load clustering according to an embodiment of the present application.
Description of reference numerals: historical key feature word extraction module 10, event baseline generation module 20, feature matching module 30, security triage module 40, event baseline iteration correction module 50.
Detailed Description
The foregoing is only an overview of the technical solution of the present application. So that the technical means of the application can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the application become more apparent, specific embodiments are set out below.
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments. It is to be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments and may be combined with each other where no conflict arises. The terms "first/second" merely distinguish between similar objects and do not represent a particular ordering of the objects. The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed, but may include other steps or modules that are not expressly listed or that are inherent to such process, method, article, or apparatus. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. The terminology used herein is for the purpose of describing embodiments of the application only.
The embodiment of the application provides an event noise reduction convergence method based on attack load clustering, as shown in fig. 1, comprising the following steps:
Step S100: connecting a security management center, collecting historical event load data, carrying out noise reduction processing on the historical event load data by adopting a noise reduction convergence model, and extracting a key feature word list. The data processing system or platform is connected to the security management center so that the required historical event load data can be obtained from it. The security management center is typically a system dedicated to organizing and managing security work; it monitors and manages security risks inside and outside the enterprise and stores the related historical data. Once the connection is established, the historical event load data is collected from the center's data store. This data may include information about various security events, such as the event type, time of occurrence, location, the personnel, equipment or systems involved, and a detailed description of the event. The collected historical event load data is then processed with the noise reduction convergence model, for example through signal analysis, filtering and selection of a noise reduction algorithm, to eliminate noise and outliers. The noise reduction convergence model is a model for handling data noise and extracting key information; applied to the historical event load data, it removes noise and redundant information while retaining the information related to key features. After the noise reduction processing is completed, a key feature word list is extracted from the data, for example by a keyword-based feature extraction algorithm; the list contains words that represent the key features of the events.
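As an illustration of step S100, the following is a minimal Python sketch under stated assumptions, not the noise reduction convergence model of the application. It assumes the load data are URL-encoded request strings, treats noise reduction as simple normalization (URL decoding, lowercasing, masking numbers), and selects key feature words by how many loads they appear in. The function names `denoise`, `tokenize` and `key_feature_words` and the thresholds are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import unquote_plus


def denoise(payload):
    """Normalize one raw load so that equivalent requests look alike."""
    text = unquote_plus(payload).lower()       # undo URL encoding, fold case
    text = re.sub(r"\d+", "0", text)           # mask volatile numbers (ids, timestamps)
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def tokenize(text):
    """Split a denoised load into candidate feature words (two or more letters)."""
    return re.findall(r"[a-z_]{2,}", text)


def key_feature_words(payloads, min_doc_freq=2, top_k=10):
    """Return the words that occur in at least min_doc_freq loads."""
    doc_freq = Counter()
    for payload in payloads:
        doc_freq.update(set(tokenize(denoise(payload))))
    frequent = [(word, n) for word, n in doc_freq.items() if n >= min_doc_freq]
    frequent.sort(key=lambda item: (-item[1], item[0]))  # most frequent first
    return [word for word, _ in frequent[:top_k]]


# Hypothetical historical loads, invented for this example.
history = [
    "/search?q=UNION%20SELECT%201",
    "/search?q=union+select+22",
    "/login?user=admin&id=7",
]
print(key_feature_words(history))
```

Masking numbers before counting is one possible design choice: it keeps request identifiers and timestamps from producing a separate feature for every event.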
Step S200: obtaining a clustering output result by applying a clustering algorithm to the historical event load data after the noise reduction processing. Specifically, a suitable clustering algorithm, such as K-means, hierarchical clustering or DBSCAN, is selected and applied to the noise-reduced historical event load data. The features are combined into vectors, and the distances between the vectors are calculated and clustered with the DBSCAN method to generate a baseline: the distance or similarity between data points is computed, and the data points are then grouped into different clusters according to the set conditions. Running the clustering algorithm yields a clustering result, which generally includes the center point of each cluster, the size of each cluster (that is, the number of data points it contains), and the shape or distribution of the clusters. Each cluster represents one class of similar or related data points in the historical event load data.
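The clustering in step S200 can be sketched with a small, self-contained DBSCAN written in plain Python. This is an illustrative sketch only: the two-dimensional feature vectors (for example, load length and number of key feature words), the `eps` and `min_pts` values, and the helper `summarize` are assumptions made for the example, not parameters given in the application.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0..k-1 for clusters, -1 for noise."""
    unseen, noise = None, -1
    labels = [unseen] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise              # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster        # border point reached from a core point
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def summarize(points, labels):
    """Size and center point of each cluster, ignoring noise."""
    members = {}
    for point, label in zip(points, labels):
        if label != -1:
            members.setdefault(label, []).append(point)
    return {
        label: {"size": len(ps), "center": tuple(sum(c) / len(ps) for c in zip(*ps))}
        for label, ps in members.items()
    }


# Hypothetical vectors: (load length, number of key feature words).
vectors = [(10, 1), (11, 1), (10, 2), (50, 5), (51, 5), (50, 6), (200, 0)]
labels = dbscan(vectors, eps=2.0, min_pts=2)
print(labels)
print(summarize(vectors, labels))
```

Points labeled -1 belong to no dense region; in this setting they correspond to loads that do not resemble any recurring pattern in the history.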
Step S300: generating an event baseline from the clustering output result and the key feature word list. The clustering result obtained by the clustering algorithm and the key feature word list extracted from the data are used together to construct a baseline for describing and identifying similar events, namely the event baseline. The event baseline is a standard for reference and comparison that captures the main features and patterns of a specific type of event. The clustering output result reveals the characteristics and distribution of the different categories in the historical event load data, from which the key feature word clusters related to specific events are identified. Specifically, based on feature extraction and statistical analysis, the target IP address and the request directory are extracted to describe the features of a service, and the historical baseline is read. A baseline is generated for a service once its request volume reaches a certain amount, with the load length allowed to float within a certain range of the baseline, and real-time alerting is performed according to baseline matching. The clustering output result is matched and associated with the key feature word list, and information that represents the main features of the events, such as the size, shape and distribution of the clusters and the frequency and relevance of the key feature words, is extracted from the matching result. The event baseline is constructed from this extracted feature information.
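The per-service part of step S300 can be sketched as follows. This is a minimal illustration under assumptions: a service is keyed by target IP address and request directory, a baseline is only generated once a hypothetical `min_requests` threshold is reached, and the permitted load length is the mean length plus or minus a hypothetical `tolerance`. The event dictionary layout and all names are invented for the example.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit


def service_key(dst_ip, url):
    """Describe a service by its target IP address and request directory."""
    directory = posixpath.dirname(urlsplit(url).path) or "/"
    return (dst_ip, directory)


def build_baselines(events, min_requests=3, tolerance=0.2):
    """Generate a baseline for every service with enough historical requests."""
    lengths = defaultdict(list)
    for event in events:
        key = service_key(event["dst_ip"], event["url"])
        lengths[key].append(len(event["payload"]))
    baselines = {}
    for key, values in lengths.items():
        if len(values) < min_requests:
            continue                          # too little traffic to form a baseline
        mean = sum(values) / len(values)
        baselines[key] = {
            "requests": len(values),
            "min_len": mean * (1 - tolerance),  # load length may float in this range
            "max_len": mean * (1 + tolerance),
        }
    return baselines


# Hypothetical historical events, invented for this example.
events = [
    {"dst_ip": "10.0.0.5", "url": "/api/v1/search?q=a", "payload": "q=a&page=1"},
    {"dst_ip": "10.0.0.5", "url": "/api/v1/list", "payload": "page=2&sz=9"},
    {"dst_ip": "10.0.0.5", "url": "/api/v1/item?id=3", "payload": "id=3&x=12"},
    {"dst_ip": "10.0.0.9", "url": "/admin/login", "payload": "user=root"},
]
baselines = build_baselines(events)
print(baselines)
```

In this example only the `/api/v1` service on 10.0.0.5 receives a baseline, because the second service has a single request and stays below the threshold.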
Step S400: collecting real-time event load data, carrying out noise reduction processing on the real-time event load data by adopting the noise reduction convergence model, extracting key feature words, and matching them against the event baseline by means of a feature matching model. Specifically, a feature library of normal internet communication is continuously accumulated and updated by an algorithm, rules describing normal internet access behavior are formed automatically by a string-based feature matching method, and normal internet communication is thereby filtered out of the alarms. The event load data is acquired in real time from the source of the events or from related systems; it may come from sensors, log files, monitoring equipment or other data sources. Because real-time event load data may contain a large amount of noise, redundant information and outliers, using it directly could give inaccurate or misleading results, so it is first processed with the noise reduction convergence model to eliminate noise, smooth the data and retain the key information. Key feature words are then extracted from the noise-reduced data. The event baseline, generated earlier from the clustering output result and the key feature word list, is the reference for describing and identifying similar events: during real-time processing, the extracted features are compared with the event baseline to determine whether they belong to a known baseline type.
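The matching in step S400 might look like the following Python sketch. It is an assumption-laden illustration, not the feature matching model of the application: feature words are taken to be the parameter names of a query-string load, a baseline is assumed to hold a length range and a set of known words, and the layout of `baselines`, the event dictionaries and the function names are hypothetical.

```python
from urllib.parse import parse_qsl


def feature_words(payload):
    """Use the parameter names of a query-string load as feature words."""
    return {name.lower() for name, _ in parse_qsl(payload, keep_blank_values=True)}


def match_event(event, baselines):
    """Return None when the event matches its baseline, else the reason it does not."""
    baseline = baselines.get((event["dst_ip"], event["directory"]))
    if baseline is None:
        return "no baseline for this service"
    if not baseline["min_len"] <= len(event["payload"]) <= baseline["max_len"]:
        return "load length outside baseline range"
    unknown = feature_words(event["payload"]) - baseline["words"]
    if unknown:
        return "unknown feature words: " + ", ".join(sorted(unknown))
    return None


def risk_mark(event, baselines):
    """Output an abnormal risk mark when the event does not match the baseline."""
    reason = match_event(event, baselines)
    return {"abnormal": reason is not None, "reason": reason}


# Hypothetical baseline and real-time events, invented for this example.
baselines = {
    ("10.0.0.5", "/api/v1"): {"min_len": 8, "max_len": 40, "words": {"q", "page", "id"}},
}
normal = {"dst_ip": "10.0.0.5", "directory": "/api/v1", "payload": "q=shoes&page=2"}
attack = {"dst_ip": "10.0.0.5", "directory": "/api/v1",
          "payload": "q=shoes&cmd=cat+%2Fetc%2Fpasswd"}
print(risk_mark(normal, baselines))
print(risk_mark(attack, baselines))
```

Events that match are filtered out as normal traffic; only events carrying an abnormal risk mark would proceed to security triage.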
In a possible implementation manner, step S400 further includes step S410: configuring an add-delete-modify-query port based on the algorithm model management module, the port being used for managing add, delete, modify and query requests. Through the algorithm model management module, a set of ports for handling add, delete, modify and query operations is set up and enabled. These ports are the interfaces through which the system or application interacts with the outside, and they receive and process the add, delete, modify and query requests related to the algorithm models.
Step S400 further includes step S420: configuring a start-stop port based on the algorithm model management module, the start-stop port being used for managing start-stop requests. Within the algorithm model management module, the start-stop port receives and processes requests to start and stop algorithm models. When an algorithm model needs to be started, an external system or a user sends a start request through the start-stop port; on receiving the request, the algorithm model management module performs the corresponding operations, such as loading the model and initializing its parameters, to ensure that the model starts correctly and enters the working state. Similarly, when an algorithm model needs to be stopped, a stop request is sent through the start-stop port; the algorithm model management module receives the request and performs the stop operation, such as releasing resources and saving the model state, to ensure that the model stops safely and releases the related resources.
Step S400 further includes step S430: configuring a data source interface based on the algorithm model management module, the data source interface being used for managing data acquisition requests. Within the algorithm model management module, the data source interface is responsible for interacting with external data sources to acquire the data required by the algorithm models.
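The three interfaces of steps S410 to S430 can be pictured as methods on a single manager object. The class below is a hypothetical in-memory sketch: the class name `AlgorithmModelManager`, its method names and the way a data source is represented (a callable that returns records) are all assumptions for illustration, and a real deployment would expose these operations over network ports.

```python
class AlgorithmModelManager:
    """In-memory stand-in for the algorithm model management module."""

    def __init__(self):
        self._models = {}    # model_id -> {"config": dict, "running": bool}
        self._sources = {}   # source name -> callable returning records

    # Add-delete-modify-query port.
    def add(self, model_id, config):
        if model_id in self._models:
            raise KeyError(f"model {model_id!r} already exists")
        self._models[model_id] = {"config": dict(config), "running": False}

    def delete(self, model_id):
        del self._models[model_id]

    def modify(self, model_id, **changes):
        self._models[model_id]["config"].update(changes)

    def query(self, model_id):
        return dict(self._models[model_id]["config"])

    # Start-stop port.
    def start(self, model_id):
        self._models[model_id]["running"] = True    # load model, initialize parameters

    def stop(self, model_id):
        self._models[model_id]["running"] = False   # release resources, save state

    def is_running(self, model_id):
        return self._models[model_id]["running"]

    # Data source interface.
    def register_source(self, name, fetch):
        self._sources[name] = fetch

    def acquire(self, name):
        return self._sources[name]()


manager = AlgorithmModelManager()
manager.add("noise-reduction", {"min_doc_freq": 2})
manager.modify("noise-reduction", min_doc_freq=3)
manager.start("noise-reduction")
manager.register_source("history", lambda: ["/search?q=a", "/search?q=b"])
print(manager.query("noise-reduction"), manager.is_running("noise-reduction"))
print(manager.acquire("history"))
```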
In a possible implementation manner, step S400 further includes step S440: connecting a task management module, the task management module being used for task registration, task allocation and task execution. A task management module is integrated into the system so that its task registration, task allocation and task execution functions can be used to manage, track and execute tasks more effectively. Specifically, the task management module allows users to register or create new tasks and supports allocating them; once a task is allocated, the module tracks its execution, recording its progress, its completion and any problems or obstructions encountered.
Step S400 further includes step S450: providing, based on the task management module, request response management corresponding to start-stop task registration for the add-delete-modify-query port, the start-stop port and the data source interface. Requests related to start-stop tasks, registration, and add, delete, modify and query operations are received and processed through the add-delete-modify-query port, the start-stop port and the data source interface, and the corresponding responses are returned.
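The register, allocate and execute cycle of step S440 can be sketched as below. This is a deliberately small illustration with assumptions of its own: tasks are plain Python callables, allocation is round-robin over a fixed list of worker names, execution is sequential, and the class name `TaskManager` and the result layout are hypothetical.

```python
from collections import deque


class TaskManager:
    """Registers tasks, allocates them to workers, and records execution results."""

    def __init__(self, workers):
        self._workers = list(workers)
        self._pending = deque()
        self._next_id = 1
        self.results = {}

    def register(self, func, *args):
        """Task registration: queue a callable and return its task id."""
        task_id = self._next_id
        self._next_id += 1
        self._pending.append((task_id, func, args))
        return task_id

    def allocate(self):
        """Task allocation: hand pending tasks to workers in round-robin order."""
        assignments = [
            (self._workers[n % len(self._workers)], task)
            for n, task in enumerate(self._pending)
        ]
        self._pending.clear()
        return assignments

    def execute(self, assignments):
        """Task execution: run each task and track completion or failure."""
        for worker, (task_id, func, args) in assignments:
            try:
                result = func(*args)
                self.results[task_id] = {"worker": worker, "status": "done", "result": result}
            except Exception as exc:
                self.results[task_id] = {"worker": worker, "status": "failed", "error": str(exc)}


tasks = TaskManager(workers=["worker-a", "worker-b"])
ok_id = tasks.register(sum, [1, 2, 3])
bad_id = tasks.register(lambda: 1 / 0)
tasks.execute(tasks.allocate())
print(tasks.results)
```

Recording failures instead of raising them mirrors the tracking role described above: a failed task is reported alongside completed ones rather than stopping the batch.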
In a possible implementation manner, step S450 further includes step S451: communicatively connecting the task management module to a configuration management module, the configuration management module supporting calls to the add-delete-modify-query port, the start-stop port and the data source interface. This realizes information interaction and function calls between the task management module and the configuration management module. Specifically, a data channel is established between the two modules so that they can exchange information, instructions and status updates; the channel can be implemented with a network protocol, an API (application programming interface) or another communication mechanism, and it ensures that data transmission between the two modules is reliable and efficient. By calling the add-delete-modify-query port, the configuration management module can manage the task registration information in the task management module, for example adding a new task registration, deleting an existing one or modifying its attributes. The configuration management module can also call the start-stop port to control the starting and stopping of tasks in the task management module, and it can call the data source interface to interact with a data source in the task management module, including retrieving the data required for task management from the data source or storing the status or results of task management in it.
Step S450 further includes step S452: in the configuration management module, supporting acquisition of a new model from the data source and synchronizing the new model to any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model. The configuration management module can acquire new model data from a data source and synchronize it to a designated algorithm model module. Specifically, the configuration management module first acquires the data of the new model from the designated data source. It then parses the data to ensure that its format and structure meet the requirements of the target algorithm model module, and synchronizes the data to that module. Synchronization may be unidirectional, with data pushed from the data source to the algorithm model module, or bidirectional, with the algorithm model module also able to pull data from the data source when needed. Once the model data has been synchronized into an algorithm model module, the new data can be used for operation, inference or analysis. For example, the noise reduction convergence model can use the new data for noise reduction processing, improving the purity of the data; the normal event behavior recognition module can use it to update its behavior pattern library and recognize normal behavior more accurately; the feature matching model can use it for feature learning and matching; the URL baseline filtering model can use it to update its baseline rules and improve filtering precision; and the vulnerability scanning detection model can use it for vulnerability scanning and risk assessment. If the data needs to be synchronized into several algorithm model modules, the configuration management module ensures the consistency and synchronism of the data so that every model operates and analyzes on the same data.
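The push-style synchronization of step S452 can be illustrated with the following sketch. It assumes, purely for the example, that model data is a dictionary with the fields `model_id`, `version` and `parameters`, that each algorithm model module is represented by a dictionary, and that the data source is a callable; none of these details come from the application.

```python
import copy

# Hypothetical schema that incoming model data must satisfy.
REQUIRED_FIELDS = {"model_id", "version", "parameters"}


class ConfigManager:
    """Fetches new model data and pushes identical copies to the chosen modules."""

    def __init__(self, modules):
        self.modules = modules   # module name -> dict holding that module's state

    def sync(self, fetch_new_model, targets):
        model = fetch_new_model()                    # acquire from the data source
        missing = REQUIRED_FIELDS - set(model)       # validate format and structure
        if missing:
            raise ValueError(f"model data is missing fields: {sorted(missing)}")
        unknown = [name for name in targets if name not in self.modules]
        if unknown:
            raise KeyError(f"unknown algorithm model modules: {unknown}")
        for name in targets:
            # Each module gets its own copy so that no module can alter another's data.
            self.modules[name]["model"] = copy.deepcopy(model)
        return model["version"]


modules = {
    "noise_reduction_convergence": {},
    "feature_matching": {},
    "url_baseline_filtering": {},
}
config = ConfigManager(modules)
version = config.sync(
    lambda: {"model_id": "m-001", "version": 2, "parameters": {"eps": 2.0}},
    targets=["noise_reduction_convergence", "feature_matching"],
)
print(version, modules["feature_matching"]["model"])
```

Validating before writing to any module means a malformed model leaves every module unchanged, which is one simple way to keep the synchronized modules consistent.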
In a possible implementation manner, step S450 further includes step S453: acquiring basic management data of the vulnerability scanning detection model, wherein the basic management data comprises a model ID, a model category and a model name. Basic information about the vulnerability scanning detection models, namely the model ID, model category and model name, is extracted from the relevant management system or database. This information is used to identify, classify and manage the different models and to ensure that they are used correctly and managed effectively. Specifically, each vulnerability scanning detection model has a unique identifier, the model ID. The model category is the type or class to which the vulnerability scanning detection model belongs; for example, it may belong to scan types such as network scan, web application scan or host scan. The model name is a descriptive name that conveys the characteristics or purpose of the model; model names may coincide, but model IDs are necessarily distinct.
Step S450 further includes step S454: obtaining the model creation time, the model update time and the model state.
Step S450 further includes step S455: generating a model management flow from the model creation time, the model update time, the model state and the basic management data.
The system can automatically record the creation time of the model. Because the vulnerability scanning detection model may need to be updated or optimized over time, the system can also record the update time of the model at each update and may generate a new model version. The model state is important information in the model management flow: the system can monitor the running state of the model (such as normal, abnormal, or paused) in real time or periodically and generate a corresponding state report. An automated model management flow can be constructed on the basis of this data together with the basic management data. For example, when the model needs to be updated, the system can automatically trigger the update operation and notify the relevant personnel to verify and audit it; when the model state is abnormal, the system can automatically send an alarm to remind an administrator to handle it.
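As a rough illustration of steps S453 to S455, the record and flow below combine the basic management data with the creation time, update time and state. The field names, the state strings, the action names and the 30-day update limit are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ModelRecord:
    model_id: str         # unique identifier
    category: str         # e.g. "network scan", "web application scan"
    name: str             # descriptive, not necessarily unique
    created_at: datetime
    updated_at: datetime
    state: str            # "normal", "abnormal" or "paused"


def management_actions(record, now, max_age=timedelta(days=30)):
    """Derive management actions from one record (30 days is an assumed limit)."""
    actions = []
    if record.state == "abnormal":
        # An abnormal state raises an alarm for the administrator.
        actions.append("alert_admin")
    if now - record.updated_at > max_age:
        # A stale model triggers an update plus verification by reviewers.
        actions.append("trigger_update")
        actions.append("notify_reviewers")
    return actions


record = ModelRecord("m-001", "web application scan", "demo scanner",
                     datetime(2024, 1, 1), datetime(2024, 1, 10), "abnormal")
actions = management_actions(record, now=datetime(2024, 3, 1))
```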
In a possible implementation manner, step S400 further includes step S460: on the model operation management platform, the start-stop port is called to perform start-stop operation management. That is, the vulnerability scanning detection model is started and stopped through a specific interface or function point. The start-stop port is a functional component of the task management module that provides control over the execution state of a model. Specifically, a start-stop request for the vulnerability scanning detection model is initiated through the front-end interface or API of the model operation management platform and sent to the platform's back-end server. The back-end server parses the request and identifies the start-stop port to be called. It then communicates with that start-stop port in the target task management module, sending a data packet or message that contains the operation instruction. After receiving the instruction, the start-stop port executes the corresponding operation: a start instruction triggers the start flow of the vulnerability scanning detection model, and a stop instruction triggers its stop flow. Step S470 is also included: on the model operation management platform, the add-delete-modify-query port is called to perform add, delete, modify and query operation management. That is, operations such as adding, deleting, modifying and querying the vulnerability scanning detection model are performed through a specific interface or function point.
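The start-stop handling of step S460 might look like the following sketch. The request shape (`model_id`, `action`) and the class name `StartStopPort` are hypothetical; the text does not specify a message format.

```python
class StartStopPort:
    """Hypothetical start-stop port that tracks which models are running."""

    def __init__(self):
        self.running = set()

    def handle(self, request: dict) -> str:
        model_id, action = request["model_id"], request["action"]
        if action == "start":
            # A start instruction triggers the start flow of the model.
            self.running.add(model_id)
            return "started"
        if action == "stop":
            # A stop instruction triggers the stop flow of the model.
            self.running.discard(model_id)
            return "stopped"
        raise ValueError(f"unsupported action: {action!r}")


port = StartStopPort()
result = port.handle({"model_id": "m-001", "action": "start"})
```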
Step S480 is also included: on the model operation management platform, the data source is acquired in combination with the model management flow, where the data source interface is used for importing the data source into any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model. The model operation management platform integrates the previously generated model management flow and acquires the relevant data from the designated data source. The platform interacts with the data source by calling the data source interface, which may include functions such as data reading, data conversion, and data format verification. Once the data has been acquired through the data source interface, the platform imports it into the designated algorithm model module as required by the model management flow, and that module can then use the data for updating or operation. For example, the noise reduction convergence model can use the new data for noise reduction to improve the purity of the data, and the vulnerability scanning detection model can use the new data for model training or scanning detection.
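The reading, conversion and format verification performed by the data source interface in step S480 can be sketched as below. The JSON-lines input format, the function name `import_from_source` and the use of plain lists as module stores are assumptions for illustration.

```python
import json


def import_from_source(raw_lines, required_keys, modules):
    """Parse JSON lines, keep well-formed records, and hand them to each module."""
    records, rejected = [], 0
    for line in raw_lines:
        try:
            record = json.loads(line)          # data reading and conversion
        except json.JSONDecodeError:
            rejected += 1
            continue
        # Format verification: a record must be an object with all required keys.
        if isinstance(record, dict) and required_keys.issubset(record):
            records.append(record)
        else:
            rejected += 1
    for module_store in modules.values():
        module_store.extend(records)           # import into each target module
    return len(records), rejected


lines = [
    '{"src": "1.1.1.1", "payload": "a"}',
    'not json',
    '{"src": "2.2.2.2"}',
]
stores = {"noise_reduction": [], "vuln_scan": []}
accepted, rejected = import_from_source(lines, {"src", "payload"}, stores)
```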
Step S500: if the matching fails, an abnormal risk mark is output, and security triage is performed through the abnormal risk mark. If the features of the real-time event load data do not match the event baseline, an abnormal risk mark is output to identify situations that may carry a security risk or that do not conform to a normal behavior pattern, and security triage is performed through the abnormal risk mark. Security triage generally means further classifying, analyzing and handling the detected security events or abnormal behaviors. With the abnormal risk mark, a security team or system can identify more efficiently which events need to be handled first, which events may need further investigation, and which events may be false positives or can be ignored.
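A minimal sketch of the flag-then-triage idea in step S500 follows. It assumes the event baseline is a list of keyword sets and uses Jaccard similarity with a 0.5 threshold as the matching rule; both are stand-ins chosen for illustration, since the text does not fix the matching metric.

```python
def flag_event(keywords, baseline, min_overlap=0.5):
    """Return None when the event matches a baseline cluster, else a risk flag."""
    kw = set(keywords)
    best = 0.0
    for cluster_keywords in baseline:
        union = kw | cluster_keywords
        if union:
            # Jaccard similarity between the event and this baseline cluster.
            best = max(best, len(kw & cluster_keywords) / len(union))
    if best >= min_overlap:
        return None
    return {"flag": "abnormal_risk", "similarity": best}


def triage(events, baseline):
    """Return flagged events, least baseline-like first."""
    flagged = []
    for event_id, keywords in events:
        flag = flag_event(keywords, baseline)
        if flag is not None:
            flagged.append((event_id, flag["similarity"]))
    return sorted(flagged, key=lambda item: item[1])


baseline = [{"select", "union", "from"}, {"script", "alert"}]
events = [
    ("e1", ["select", "union", "from"]),
    ("e2", ["cmd", "exec"]),
    ("e3", ["script", "eval", "cookie"]),
]
queue = triage(events, baseline)
```

Here `e1` matches the baseline and is not flagged, while `e2` and `e3` are queued with the least familiar event first.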
In a possible implementation manner, step S500 further includes step S510: a vulnerability scanning detection model is established based on a machine learning algorithm, where the vulnerability scanning detection model includes an event-type count feature recognition scanning layer and a trigger frequency feature recognition scanning layer. A vulnerability scanning process typically targets many destination IPs and many destination ports, has a high request failure rate, and produces a wide variety of alarm types. Feature engineering is designed around these characteristics to identify vulnerability scanning behavior, and the module detects one-to-one and one-to-many vulnerability scanning based on this feature engineering and a machine learning algorithm (a random forest model). The overall process groups events by sip (source IP) and dip (destination IP) and identifies scanners from features such as the number of event types and the event trigger frequency, so that potential vulnerabilities are detected and analyzed more comprehensively. The event-type count feature recognition scanning layer and the trigger frequency feature recognition scanning layer are the two key components of the model. Specifically, the event-type count feature recognition scanning layer collects and analyzes the security events occurring in the system and classifies and counts them; for example, it can count the number of occurrences of different types of attack attempts (such as SQL injection or cross-site scripting attacks), which is critical for identifying potential vulnerability patterns. The trigger frequency feature recognition scanning layer analyzes how often security events occur in the system, including the average rate at which events occur and the time interval between events; for example, a particular security event being triggered frequently within a short period of time may indicate a potential vulnerability. Step S520 is further included: based on the vulnerability scanning detection model, security triage is performed through the abnormal risk mark, and high-confidence attack behavior is determined. The vulnerability scanning detection model can perform comprehensive vulnerability scanning on a system or network according to the algorithms and rules built into the model. When the model detects a behavior or event that matches a known vulnerability pattern, it outputs an abnormal risk mark. The abnormal risk marks are then analyzed, prioritized and classified to determine which of them most likely indicate real attacks, namely high-confidence attack behavior.
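The feature engineering described for step S510 can be illustrated with the sketch below, which groups events by source IP and computes the event-type count, target counts, failure rate and trigger rate. The event field names (`sip`, `dip`, `dport`, `type`, `ok`, `ts`) and all thresholds are assumptions, and the simple threshold rule in `looks_like_scan` merely stands in for the trained random forest model mentioned in the text.

```python
from collections import defaultdict


def scan_features(events):
    """Group events by source IP and compute per-source scanning features."""
    groups = defaultdict(list)
    for e in events:
        groups[e["sip"]].append(e)
    features = {}
    for sip, evs in groups.items():
        times = [e["ts"] for e in evs]
        span = max(times) - min(times)
        features[sip] = {
            "event_types": len({e["type"] for e in evs}),
            "targets": len({e["dip"] for e in evs}),
            "ports": len({e["dport"] for e in evs}),
            "fail_rate": sum(not e["ok"] for e in evs) / len(evs),
            # Events per second; a zero-length window falls back to the count.
            "rate": len(evs) / span if span > 0 else float(len(evs)),
        }
    return features


def looks_like_scan(f, min_types=3, min_targets=3, min_fail_rate=0.8):
    # Threshold rule standing in for the trained random forest classifier.
    return (f["event_types"] >= min_types
            and f["targets"] >= min_targets
            and f["fail_rate"] >= min_fail_rate)


events = [
    {"sip": "10.0.0.5", "dip": "10.0.1.1", "dport": 22, "type": "ssh_bruteforce", "ok": False, "ts": 0},
    {"sip": "10.0.0.5", "dip": "10.0.1.2", "dport": 80, "type": "sql_injection", "ok": False, "ts": 1},
    {"sip": "10.0.0.5", "dip": "10.0.1.3", "dport": 443, "type": "xss", "ok": False, "ts": 2},
    {"sip": "10.0.0.5", "dip": "10.0.1.4", "dport": 8080, "type": "path_traversal", "ok": False, "ts": 4},
    {"sip": "10.0.0.9", "dip": "10.0.1.1", "dport": 443, "type": "xss", "ok": True, "ts": 0},
    {"sip": "10.0.0.9", "dip": "10.0.1.1", "dport": 443, "type": "xss", "ok": True, "ts": 10},
]
features = scan_features(events)
scanners = [sip for sip, f in features.items() if looks_like_scan(f)]
```

In a fuller implementation the feature dictionaries would be turned into vectors and passed to a trained classifier rather than to fixed thresholds.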
Step S600: based on the normal event behavior recognition module, validity analysis is performed on the security triage result, and a data tag is generated through validity analysis scoring, where the data tag includes a normal tag and an abnormal tag. The normal event behavior recognition module is a component based on a machine learning or deep learning algorithm that has been trained to recognize normal or expected behavior patterns in the system. The module further analyzes the security triage results, comparing each triage result with predefined normal behavior patterns to determine whether the behavior conforms to the expected normal behavior characteristics. Based on the comparison, it generates a validity analysis score, which may be a value or a level, that quantifies how closely the triage result matches the normal behavior pattern. The higher the score, the more closely the behavior matches normal behavior and the more likely it is to be normal; conversely, the lower the score, the more likely it is to be abnormal. Based on this validity analysis score, the module generates a data tag, either a normal tag or an abnormal tag, for each triage result.
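The scoring and tagging of step S600 can be sketched as follows. The text describes a trained recognition module; here a simple attribute comparison against a hypothetical normal profile, with an assumed threshold of 0.7, stands in for it.

```python
def validity_score(observed, normal_profile):
    """Fraction of normal-profile features that the observed behavior agrees with."""
    if not normal_profile:
        return 0.0
    hits = sum(1 for k, v in normal_profile.items() if observed.get(k) == v)
    return hits / len(normal_profile)


def label(score, threshold=0.7):
    # A higher score means a closer match to normal behavior.
    return "normal" if score >= threshold else "abnormal"


profile = {"method": "GET", "path_prefix": "/api", "hour_band": "business"}
typical = {"method": "GET", "path_prefix": "/api", "hour_band": "business"}
unusual = {"method": "GET", "path_prefix": "/admin", "hour_band": "night"}
tags = [label(validity_score(o, profile)) for o in (typical, unusual)]
```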
Step S700: a parameter adjustment instruction is generated based on the data tag. Parameter adjustment instructions are generated from the information reflected by the data tags so that the relevant model parameters can be adjusted and optimized in a targeted manner. Specifically, when the data tags show that certain parameters are improperly set or need to be optimized, a parameter adjustment instruction is generated based on the data tags; the instruction states clearly which parameters need to be adjusted and the specific values or ranges of the adjustment.
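One way to picture step S700 is the sketch below, which turns the share of flagged events later tagged normal into an instruction naming the parameter and its new value. The parameter name `similarity_threshold`, the step size and the target rate are illustrative assumptions, not values from the source.

```python
def adjustment_instruction(labels, current_threshold, step=0.05, target_fp=0.2):
    """Build an instruction from the share of flagged events later tagged normal."""
    if not labels:
        return None
    fp_rate = labels.count("normal") / len(labels)
    if fp_rate <= target_fp:
        return None  # the current setting is acceptable, so no instruction
    # Too many flagged events were actually normal: relax the match threshold.
    new_value = round(max(0.0, current_threshold - step), 2)
    return {"parameter": "similarity_threshold",
            "old": current_threshold,
            "new": new_value}


instruction = adjustment_instruction(
    ["normal", "normal", "abnormal", "normal"], current_threshold=0.5)
```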
Step S800: the parameters of the key feature word list and the parameters of the clustering algorithm are adjusted through the parameter adjustment instruction, and the event baseline is iteratively corrected through a URL baseline filtering model. The parameters of the key feature word list are adjusted according to the generated parameter adjustment instruction so that the relevant security events can be identified more accurately. Parameters of the clustering algorithm, such as the distance metric, the number of clusters, and the similarity threshold, are changed to optimize the effect and efficiency of clustering. The URL baseline filtering model is generally used to identify URLs that are inconsistent with normal network traffic or behavior patterns: when a new security event or new data is detected, the model compares it with the baseline data to determine whether it is abnormal. The parameters of the URL baseline filtering model can also be optimized through the parameter adjustment instruction so that abnormal URLs are identified more accurately. Meanwhile, as new data is continuously input, the system iteratively corrects the event baseline to adapt to the constantly changing network environment and attack methods.
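The iterative correction in step S800 can be sketched for the URL baseline alone. The sketch assumes the baseline is a set of URL paths and that a path is added once it has been tagged normal at least `min_count` times; both the representation and the rule are hypothetical simplifications.

```python
from urllib.parse import urlsplit


def update_url_baseline(baseline, labelled_events, min_count=2):
    """Add URL paths that were repeatedly tagged normal to the baseline set."""
    counts = {}
    for url, tag in labelled_events:
        if tag == "normal":
            path = urlsplit(url).path  # ignore host and query string
            counts[path] = counts.get(path, 0) + 1
    return baseline | {p for p, c in counts.items() if c >= min_count}


def is_abnormal_url(url, baseline):
    return urlsplit(url).path not in baseline


baseline = {"/login"}
labelled = [
    ("http://a.example/api/v1?x=1", "normal"),
    ("http://a.example/api/v1?x=2", "normal"),
    ("http://a.example/admin.php", "abnormal"),
    ("http://a.example/once", "normal"),
]
corrected = update_url_baseline(baseline, labelled)
```

Calling `update_url_baseline` again on each new batch of tagged events gives the iterative behavior the step describes.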
The event noise reduction convergence method based on attack load clustering according to the embodiment of the present invention has been described in detail above with reference to FIG. 1. Next, an event noise reduction convergence system based on attack load clustering according to an embodiment of the present invention will be described with reference to FIG. 2.
The event noise reduction convergence system based on attack load clustering according to the embodiment of the present invention solves the technical problem that, in existing event noise reduction convergence, inaccurate acquisition of load data features makes the noise reduction model inaccurate, so that the accuracy and reliability of the event noise reduction convergence result are low, and it achieves the technical effect of improving the accuracy and reliability of the event noise reduction convergence result. The system comprises: a historical key feature word extraction module 10, an event baseline generation module 20, a feature matching module 30, a security triage module 40 and an event baseline iteration correction module 50.
The historical key feature word extraction module 10 is configured to connect to a security management center, collect historical event load data, perform noise reduction processing on the historical event load data by using a noise reduction convergence model, and extract a key feature word list;
The event baseline generation module 20 is configured to obtain a clustering output result by applying a clustering algorithm to the historical event load data after noise reduction processing, and to generate an event baseline from the clustering output result and the key feature word list;
The feature matching module 30 is configured to collect real-time event load data, perform noise reduction processing on the real-time event load data by using the noise reduction convergence model, extract key feature words, and perform matching against the event baseline by using a feature matching model;
The security triage module 40 is configured to output an abnormal risk mark if the matching fails, perform security triage through the abnormal risk mark, perform validity analysis on the security triage result based on a normal event behavior recognition module, and generate a data tag through validity analysis scoring, where the data tag includes a normal tag and an abnormal tag;
The event baseline iteration correction module 50 is configured to adjust the parameters of the key feature word list and the parameters of the clustering algorithm according to the parameter adjustment instruction, and to iteratively correct the event baseline through a URL baseline filtering model.
Next, the specific configuration of the feature matching module 30 will be described in detail. The feature matching module 30 may further be configured for: connecting a task management module, which is used for task registration, task allocation and task execution; and providing, based on the task management module, request response management for the tasks registered for the add-delete-modify-query port, the start-stop port and the data source interface.
Next, the specific configuration of the feature matching module 30 will be described in further detail. The feature matching module 30 may further be configured such that: the task management module is communicatively connected to a configuration management module, and the configuration management module supports calling the add-delete-modify-query port, the start-stop port and the data source interface; and the configuration management module supports acquiring a new model from the data source and synchronizing the new model to any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
Next, the specific configuration of the feature matching module 30 will be described in further detail. The feature matching module 30 may further be configured for: acquiring basic management data of the vulnerability scanning detection model, where the basic management data includes a model ID, a model category and a model name; obtaining a model creation time, a model update time and a model state; and generating a model management flow from the model creation time, the model update time, the model state and the basic management data.
Next, the specific configuration of the feature matching module 30 will be described in further detail. The feature matching module 30 may further be configured for: calling the start-stop port on a model operation management platform to perform start-stop operation management; calling the add-delete-modify-query port on the model operation management platform to perform add, delete, modify and query operation management; and acquiring, on the model operation management platform, the data source in combination with the model management flow, where the data source interface is used for importing the data source into any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
Next, the specific configuration of the security triage module 40 will be described in detail. The security triage module 40 may further be configured for: establishing a vulnerability scanning detection model based on a machine learning algorithm, where the vulnerability scanning detection model includes an event-type count feature recognition scanning layer and a trigger frequency feature recognition scanning layer; and performing, based on the vulnerability scanning detection model, security triage through the abnormal risk mark and determining high-confidence attack behavior.
The event noise reduction convergence system based on the attack load clustering provided by the embodiment of the invention can execute the event noise reduction convergence method based on the attack load clustering provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or a server. The units and modules are divided only according to functional logic, and the division is not limited to that described above, so long as the corresponding functions can be realized. In addition, the specific names of the functional units are used only to distinguish them from one another and do not limit the protection scope of the present application.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (8)

1. An event noise reduction convergence method based on attack load clustering, characterized by comprising the following steps:
connecting to a security management center, collecting historical event load data, performing noise reduction processing on the historical event load data by using a noise reduction convergence model, and extracting a key feature word list;
obtaining a clustering output result by applying a clustering algorithm to the historical event load data after noise reduction processing;
generating an event baseline from the clustering output result and the key feature word list;
collecting real-time event load data, performing noise reduction processing on the real-time event load data by using the noise reduction convergence model, extracting key feature words, and performing matching against the event baseline by using a feature matching model;
if the matching fails, outputting an abnormal risk mark, and performing security triage through the abnormal risk mark;
performing, based on a normal event behavior recognition module, validity analysis on the security triage result, and generating a data tag through validity analysis scoring, wherein the data tag comprises a normal tag and an abnormal tag;
generating a parameter adjustment instruction based on the data tag;
and adjusting the parameters of the key feature word list and the parameters of the clustering algorithm through the parameter adjustment instruction, and iteratively correcting the event baseline through a URL baseline filtering model.
2. The event noise reduction convergence method based on attack load clustering according to claim 1, wherein performing security triage through the abnormal risk mark comprises:
establishing a vulnerability scanning detection model based on a machine learning algorithm, wherein the vulnerability scanning detection model comprises an event-type count feature recognition scanning layer and a trigger frequency feature recognition scanning layer;
and performing, based on the vulnerability scanning detection model, security triage through the abnormal risk mark, and determining high-confidence attack behavior.
3. The event noise reduction convergence method based on attack load clustering according to claim 1, wherein the method further comprises:
configuring, based on an algorithm model management module, an add-delete-modify-query port, wherein the add-delete-modify-query port is used for managing add, delete, modify and query requests;
configuring, based on the algorithm model management module, a start-stop port, wherein the start-stop port is used for managing start-stop requests;
and configuring, based on the algorithm model management module, a data source interface, wherein the data source interface is used for managing data acquisition requests.
4. The event noise reduction convergence method based on attack load clustering according to claim 3, wherein the method further comprises:
connecting a task management module, wherein the task management module is used for task registration, task allocation and task execution;
and providing, based on the task management module, request response management for the tasks registered for the add-delete-modify-query port, the start-stop port and the data source interface.
5. The event noise reduction convergence method based on attack load clustering according to claim 4, wherein the method further comprises:
communicatively connecting the task management module to a configuration management module, wherein the configuration management module supports calling the add-delete-modify-query port, the start-stop port and the data source interface;
and supporting, in the configuration management module, acquiring a new model from the data source and synchronizing the new model to any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
6. The event noise reduction convergence method based on attack load clustering according to claim 5, wherein the method further comprises:
acquiring basic management data of the vulnerability scanning detection model, wherein the basic management data comprises a model ID, a model category and a model name;
obtaining a model creation time, a model update time and a model state;
and generating a model management flow from the model creation time, the model update time, the model state and the basic management data.
7. The event noise reduction convergence method based on attack load clustering according to claim 6, wherein the method further comprises:
calling the start-stop port on a model operation management platform to perform start-stop operation management;
calling the add-delete-modify-query port on the model operation management platform to perform add, delete, modify and query operation management;
and acquiring, on the model operation management platform, the data source in combination with the model management flow, wherein the data source interface is used for importing the data source into any one or more corresponding algorithm model modules among the noise reduction convergence model, the normal event behavior recognition module, the feature matching model, the URL baseline filtering model and the vulnerability scanning detection model.
8. An event noise reduction convergence system based on attack load clustering, characterized in that the system is configured to implement the event noise reduction convergence method based on attack load clustering according to any one of claims 1 to 7, and the system comprises:
a historical key feature word extraction module, configured to connect to a security management center, collect historical event load data, perform noise reduction processing on the historical event load data by using a noise reduction convergence model, and extract a key feature word list;
an event baseline generation module, configured to obtain a clustering output result by applying a clustering algorithm to the historical event load data after noise reduction processing, and to generate an event baseline from the clustering output result and the key feature word list;
a feature matching module, configured to collect real-time event load data, perform noise reduction processing on the real-time event load data by using the noise reduction convergence model, extract key feature words, and perform matching against the event baseline by using a feature matching model;
a security triage module, configured to output an abnormal risk mark if the matching fails, perform security triage through the abnormal risk mark, perform validity analysis on the security triage result based on a normal event behavior recognition module, and generate a data tag through validity analysis scoring, wherein the data tag comprises a normal tag and an abnormal tag;
and an event baseline iteration correction module, configured to adjust the parameters of the key feature word list and the parameters of the clustering algorithm through the parameter adjustment instruction, and to iteratively correct the event baseline through a URL baseline filtering model.
CN202410694711.7A 2024-05-31 Event noise reduction convergence method and system based on attack load clustering Pending CN118568426A (en)

Publications (1)

Publication Number Publication Date
CN118568426A (en) 2024-08-30

Legal Events

Date Code Title Description
PB01 Publication