CN115904369B - Method and system for efficiently aggregating and associated analysis of network security source data - Google Patents

Info

Publication number
CN115904369B
Authority
CN
China
Prior art keywords
data
aggregation
time
strategy
window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211362682.1A
Other languages
Chinese (zh)
Other versions
CN115904369A (en
Inventor
闫印强
孙俊虎
赵威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changyang Technology Beijing Co ltd
Original Assignee
Changyang Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changyang Technology Beijing Co ltd filed Critical Changyang Technology Beijing Co ltd
Priority to CN202211362682.1A priority Critical patent/CN115904369B/en
Publication of CN115904369A publication Critical patent/CN115904369A/en
Application granted granted Critical
Publication of CN115904369B publication Critical patent/CN115904369B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for efficient aggregation and association analysis of network security source data. The method uses a real-time queue to ingest the data source and store results, uses a storage medium to configure and store the aggregation policy, and aggregates and stores data based on the source data and the aggregation policy. The main program of the method starts an asynchronous thread that periodically reads the attributes of an XML file; if the file's last modification time has changed, the thread reads the file content and notifies an observer of the latest aggregation policy, and the observer updates the set of aggregation policies held in its memory. Unlike the traditional approach, which uses server time as the time reference, the method uses the time carried by the data as the time reference: data with the same aggregation feature is aggregated and sent downstream without delay, with no need to wait for a fixed time-window step, which ensures both real-time processing and data fidelity.

Description

Method and system for efficiently aggregating and associated analysis of network security source data
Technical Field
The invention relates to the technical field of industrial network security, and in particular to a method and a system for efficient aggregation and association analysis of network security source data.
Background
The invention relates to the field of industrial network security, and in particular to an association analysis early-warning method and an efficient data aggregation method based on network traffic source data and log source data.
With the rapid development of the industrial Internet in China and the frequent occurrence of network security incidents internationally, building a network security monitoring system within the enterprise has become increasingly important. In such a system, the source-data rule early-warning module occupies an extremely important position. However, because enterprise networks are large and contain many network devices, implementing this module faces a huge volume of source data, complex traffic characteristics, and numerous early-warning rules, so the module falls short in timeliness, efficiency, and accuracy. To address these problems, an efficient aggregation method preprocesses massive data by sampling records with the same features, which resolves the delay caused by data backlog, and a source-data association analysis method makes bringing early-warning rules online simple and fast while keeping early-warning events accurate.
Industrial networks contain security devices such as firewalls, IDS, forward isolation devices, and longitudinal encryption devices; network devices such as switches and routers; and host devices such as operator stations, engineer stations, and interface machines. In addition to monitoring the logs of these devices, the port-mirrored traffic of the aggregation switch must be analyzed, so the data volume is huge. Given hardware cost constraints, a single device product has very poor processing capacity when performing rule-based early warning, so the source data needs real-time preprocessing and aggregation. The main aggregation approach at present is time-window aggregation: the system waits for a fixed time interval and, once an interval measured against server time has elapsed, traverses the data set in that interval and takes one result record as the aggregate.
The wide variety of device types leads to many log formats, and complex traffic characteristics call for more flexible and accurate early-warning rules. The main early-warning approach at present hard-codes the structure of the data source, and makes early-warning decisions on the source data set mainly through simple blacklists and whitelists, confidence levels, threshold configuration, and the like. Because the data structure cannot be configured flexibly, the application must be recoded and restarted whenever a new data source or a new rule is added, which is very inefficient. The existing approach also cannot express complex rule configurations: it outputs an alarm as soon as a rule matches, lacks the ability to corroborate the alarm through association analysis of the source data, and is therefore prone to false alarms.
The current data aggregation method is mainly implemented by waiting for fixed time intervals based on server time. The main flow is as follows:
Step one: start a time window task according to the aggregation policy;
Step two: store the received data in the corresponding window cache;
Step three: determine whether the time window's trigger condition is met; if not, wait and repeat step two; if so, proceed to step four;
Step four: execute the aggregation logic.
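The four steps above can be sketched as follows. This is a minimal illustration of the prior-art flow only, not code from the patent: the class name `ServerTimeWindow`, the explicit `now` argument standing in for the server clock, and the choice to keep the first record of each key as the aggregate are all assumptions made for the example.

```python
from collections import defaultdict


class ServerTimeWindow:
    """Hypothetical prior-art style window driven by server time."""

    def __init__(self, interval_s, now):
        # Step one: start a window task with a fixed interval.
        self.interval_s = interval_s
        self.window_end = now + interval_s
        self.cache = defaultdict(list)

    def receive(self, key, record, now):
        """Buffer a record; return aggregates only when the window fires."""
        flushed = []
        # Step three: the trigger depends on the server clock, not the data.
        if now >= self.window_end:
            flushed = self.flush()
            self.window_end = now + self.interval_s
        # Step two: store the record in the window cache.
        self.cache[key].append(record)
        return flushed

    def flush(self):
        # Step four: aggregate; here one record per key plus a count
        # (an assumed aggregation rule, chosen only for illustration).
        result = [(k, v[0], len(v)) for k, v in self.cache.items()]
        self.cache.clear()
        return result


window = ServerTimeWindow(interval_s=10, now=0)
window.receive("a", "r1", now=1)          # buffered, nothing emitted
window.receive("a", "r2", now=5)          # buffered, nothing emitted
print(window.receive("b", "r3", now=12))  # [('a', 'r1', 2)]
```

In this sketch the records received at times 1 and 5 are not visible downstream until time 12, which is the kind of delay the background section attributes to server-time windows.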
At present, early-warning rules are mainly implemented either by recoding or by writing SQL statements.
In the SQL approach, SQL statements for the required early warning are written through the SQL-CLI, following the query language defined by the early-warning module.
The recoding approach mainly comprises the following steps:
Step one: receive the early-warning rule requirement;
Step two: code the source data structure and the early-warning rule;
Step three: compile the code, upload it to the server, and restart the application;
Step four: receive source data and apply the rules, sending an alarm directly downstream once a rule is satisfied;
Step five: if a new requirement goes online or an existing requirement is modified, repeat steps one to four.
The aggregation method above suffers from large data delay, data distortion, and similar problems; the early-warning rule methods require a certain knowledge of SQL and suffer from low rule-authoring efficiency and low accuracy. Neither can meet operating scenarios with large data volumes, strict real-time requirements, diverse rules, rapid rule authoring, and high alarm accuracy.
To solve these problems, the invention provides an efficient aggregation method and an association analysis early-warning method. Aggregation rules are loaded dynamically from a configuration held in any data store (XML configuration, MySQL, Redis, and the like); time-window triggering is computed from the time carried by the data itself, and data with the same aggregation feature is sent downstream without delay, which addresses delayed processing and data distortion. For rule-based early warning on source data, association analysis policies (comprising the source data structure and the early-warning rules) can be defined through a UI; every definition is loaded dynamically without coding or restarting, and complex association, time-sequence analysis, and corroboration across multiple data sources are supported. This addresses the low efficiency of bringing new rules online and modifying existing ones, the limited expressiveness of early-warning rules, and the high false-alarm rate.
Disclosure of Invention
The invention provides a method and a system for efficient aggregation and association analysis of network security source data, in order to overcome the shortcomings of the prior art.
In one aspect, the present invention provides a method for efficient aggregation and association analysis of network security source data, the method comprising the steps of:
S1: selecting a storage medium in which to configure an aggregation policy for the network security source data, and selecting a message queue for ingesting the network security source data and storing results; starting an asynchronous file-monitoring thread that periodically reads the file attributes of the storage medium, and starting an asynchronous observer thread that waits for notifications containing the aggregation policy;
S2: if the last modification time read from the storage medium has changed, reading the content of the storage medium, notifying the observer of the latest policy in the storage medium, and then continuing to monitor the modification time of the storage medium; the observer parses the latest policy and updates the configured aggregation policy accordingly;
S3: reading the values of the configured fields in the received source data and merging them into a key-field key; comparing the key with the keys of the existing aggregation policies to match the corresponding aggregation policy; obtaining a window object from a time window manager based on the timestamp carried by the source data and the start time and end time of the aggregation policy, and, if the window does not exist, creating it and registering it with the window manager; and obtaining an execution handle of the window from the obtained window object;
S4: calculating an aggregation feature from the key fields of the received source data, triggering the time window via the execution handle, and executing the following aggregation flow on the time window: determining, for each time window in turn, whether the aggregation feature already exists in the window store, and if so, executing the corresponding aggregation policy and sending the data in real time to the downstream auxiliary-data processor.
The method uses a real-time queue to ingest the data source and store results, uses a storage medium to configure and store the aggregation policy, and aggregates and stores data based on the source data and the aggregation policy. It adopts the observer pattern: an asynchronous thread periodically reads the attributes of the XML file; if the file's last modification time has changed, the thread reads the file content and notifies the observer of the latest aggregation policy, and the observer updates the set of aggregation policies held in its memory. Unlike the traditional approach, which uses server time as the time reference, the method uses the time carried by the data as the time reference: data with the same aggregation feature is aggregated and sent downstream without delay, with no need to wait for a fixed time-window step, which ensures both real-time processing and data fidelity.
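The hot-loading behavior of steps S1 and S2 can be illustrated with a minimal sketch, under stated assumptions. The XML layout (a `<policies>` root with `<policy name=... fields=... window_s=.../>` children), the class names `PolicyObserver` and `PolicyFileWatcher`, and the one-second polling interval are hypothetical and are not specified in this document; only the idea of polling the last modification time and notifying an observer comes from the text.

```python
import os
import threading
import xml.etree.ElementTree as ET


class PolicyObserver:
    """Holds the in-memory policy set and replaces it when notified."""

    def __init__(self):
        self.policies = {}
        self._lock = threading.Lock()

    def update(self, xml_text):
        # Parse the (assumed) XML layout into a dict keyed by policy name.
        root = ET.fromstring(xml_text)
        parsed = {
            p.get("name"): {
                "fields": p.get("fields").split(","),
                "window_s": int(p.get("window_s")),
            }
            for p in root.findall("policy")
        }
        with self._lock:
            self.policies = parsed


class PolicyFileWatcher:
    """Polls the file's last modification time and notifies observers."""

    def __init__(self, path, observers, interval_s=1.0):
        self.path = path
        self.observers = observers
        self.interval_s = interval_s
        self._last_mtime = None
        self._stop = threading.Event()

    def poll_once(self):
        mtime = os.stat(self.path).st_mtime
        if mtime == self._last_mtime:
            return False  # unchanged: keep monitoring
        self._last_mtime = mtime
        with open(self.path, encoding="utf-8") as fh:
            text = fh.read()
        for observer in self.observers:
            observer.update(text)
        return True

    def start(self):
        # Asynchronous monitoring thread: poll until stop() is called.
        def loop():
            while not self._stop.wait(self.interval_s):
                self.poll_once()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()
```

Calling `PolicyFileWatcher(path, [observer]).start()` would run the polling loop in the background, while `poll_once()` can be called directly when a deterministic check is preferred. Comparing modification times is cheap, so the file content is only read and parsed when the policy has actually been edited.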
In a specific embodiment, the storage medium includes storage components such as an XML file and MySQL.
In a specific embodiment, the message queue includes Kafka and RabbitMQ.
In a specific embodiment, the S3 specifically includes:
reading the received source data and matching it with the corresponding aggregation policy; extracting the timestamp of the received source data; calculating, from the timestamp and the aggregation time characteristic, the start time and end time of the time window to which the received source data belongs; and invoking the manager handle of the time window to query whether the time window exists;
if so, directly obtaining all execution handles of the time window;
if not, creating a new time window based on the start time and the end time, storing it in the time window manager, and obtaining all execution handles of the new time window.
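The window lookup described above can be sketched as follows. This is an illustration under assumptions, not the patented implementation: aligning window boundaries to multiples of the window length, the names `Window` and `WindowManager`, and using the window object itself in place of an "execution handle" are all hypothetical choices. The point shown is that the window is selected from the timestamp carried by the data, not from the server clock.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    start: int   # inclusive, seconds
    end: int     # exclusive, seconds
    store: dict = field(default_factory=dict)  # window store for features


class WindowManager:
    """Keeps one window per (policy, start, end) and creates it on demand."""

    def __init__(self):
        self._windows = {}

    def get_or_create(self, policy_name, timestamp_s, window_s):
        # Derive the window bounds from the data's own timestamp.
        # Assumption: windows are aligned to multiples of window_s.
        start = timestamp_s - (timestamp_s % window_s)
        end = start + window_s
        key = (policy_name, start, end)
        window = self._windows.get(key)
        if window is None:
            # The window does not exist yet: create and register it.
            window = Window(start, end)
            self._windows[key] = window
        return window


manager = WindowManager()
w = manager.get_or_create("p1", timestamp_s=125, window_s=60)
print(w.start, w.end)  # 120 180
```

Because the lookup depends only on the record's timestamp, a late or out-of-order record still lands in the window it belongs to.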
In a specific embodiment, the S4 specifically includes:
according to the corresponding aggregation policy, calculating the aggregation feature of the received source data from its key fields, and running all the execution handles obtained in step S3 to execute the corresponding aggregation policy and determine whether data with the same aggregation feature already exists in the window store;
if so, the received source data is aggregated data; it is stored in the window store and sent to the downstream auxiliary-data processor;
if not, the received source data is sent to the downstream main-data processor.
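The branch in S4 can be sketched as a small routing function. The names `aggregation_feature` and `route`, the `"|"` separator, and the plain dict used as the window store are hypothetical. The sketch also assumes that the first record with a given feature is remembered in the window store so that later records can be recognized as duplicates; the text above states storage explicitly only for the aggregated case.

```python
def aggregation_feature(record, key_fields):
    """Join the key-field values into a single feature string."""
    return "|".join(str(record[f]) for f in key_fields)


def route(record, key_fields, window_store):
    """Return 'auxiliary' for a repeated feature, 'main' for a new one."""
    feature = aggregation_feature(record, key_fields)
    if feature in window_store:
        # Same feature already seen in this window: aggregated data.
        window_store[feature].append(record)
        return "auxiliary"
    # Assumption: remember the first occurrence so later records match.
    window_store[feature] = [record]
    return "main"


store = {}
fields = ["src_ip", "dst_ip"]
first = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "ts": 100}
again = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "ts": 101}
print(route(first, fields, store))  # main
print(route(again, fields, store))  # auxiliary
```

Each record is routed as soon as it arrives, so nothing waits for a window to close before being sent downstream.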
In a specific embodiment, the method further comprises performing association analysis on the network security source data, specifically comprising the following steps:
A1: dynamically injecting source data structures through a user interface, selecting a number of data sources from the entered source data, and selecting several key fields from the selected data sources according to the association requirement to generate an association policy;
A2: once started, the association policy reads data asynchronously so that the association policy is matched against the aggregation policy; if the match succeeds, a pre-alarm event is generated;
A3: according to the labels of the key fields in the pre-alarm event, automatically querying the associated original logs in the message queue to corroborate the pre-alarm event; if the corroboration result conforms to the alarm characteristics, generating a quasi-alarm event and sending it downstream.
The association analysis provides a user interface through which data source structures are hot-loaded and early-warning policies are hot-deployed by drawing a rule graph. A policy can be defined by selecting one or more data sources and applying parallel association analysis or time-sequence association analysis as conditions require. When source data matches a rule, the early-warning information is automatically corroborated through association, which ensures efficient rule rollout and accurate alarms, and makes rule authoring applicable to multiple scenarios.
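The two-stage idea in A2 and A3, a rule match that yields a pre-alarm followed by corroboration against original logs, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for the example: the rule layout (`conditions`, `key_fields`), the function names, a plain list standing in for the logs held in the message queue, and the use of a minimum hit count (`min_hits`) as the "alarm characteristic".

```python
def pre_alarm(event, rule):
    """Return a pre-alarm carrying key-field labels if the rule matches."""
    if all(event.get(k) == v for k, v in rule["conditions"].items()):
        labels = {f: event[f] for f in rule["key_fields"]}
        return {"rule": rule["name"], "labels": labels}
    return None


def corroborate(pre, raw_logs, min_hits):
    """Look up original logs sharing the labels; confirm if enough match."""
    evidence = [
        log for log in raw_logs
        if all(log.get(k) == v for k, v in pre["labels"].items())
    ]
    if len(evidence) >= min_hits:
        # Hypothetical alarm characteristic: enough supporting logs exist.
        return {"rule": pre["rule"], "labels": pre["labels"],
                "evidence": evidence}
    return None


rule = {"name": "failed_login", "conditions": {"action": "login_failed"},
        "key_fields": ["src_ip"]}
event = {"action": "login_failed", "src_ip": "10.0.0.5"}
logs = [{"src_ip": "10.0.0.5", "msg": "a"},
        {"src_ip": "10.0.0.5", "msg": "b"},
        {"src_ip": "10.0.0.9", "msg": "c"}]
pre = pre_alarm(event, rule)
alarm = corroborate(pre, logs, min_hits=2)
print(len(alarm["evidence"]))  # 2
```

Separating the match from the corroboration means a single matching record does not raise an alarm on its own, which is how the text proposes to reduce false alarms.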
In a specific embodiment, the user interface is a UI for user interaction designed and built in the Java language.
In a specific embodiment, the A1 specifically includes:
maintaining information about the data sources, including their origins and data structures, through the user interface and storing it in a data source table; selecting several key fields from the selected data sources according to the association requirement; building an association graph from the selected key fields; and storing the association graph in an association analysis model table.
In a specific embodiment, the A2 specifically includes:
a monitoring thread periodically scans the data source table and the association analysis model table; if entries have been added or modified, a new asynchronous thread is automatically started to execute the association analysis task, or the existing task thread is restarted, so that changes to the data sources and association policies are applied to the running tasks in real time.
In a specific embodiment, the performing of the association analysis includes:
reading the association analysis model from the association analysis model table, converting it into an executable logic graph with a rule compiler, converting the executable logic graph into an execution task graph with a rule executor, and acquiring physical execution resources to start the relevant threads and thereby launch the association analysis task.
According to a second aspect of the present invention, a computer-readable storage medium is presented, on which a computer program is stored, which computer program, when being executed by a computer processor, carries out the above-mentioned method.
According to a third aspect of the present invention, a system for efficient aggregation and association analysis of network security source data is provided, the system comprising:
an aggregation policy configuration module, configured to select a storage medium in which to configure an aggregation policy for the network security source data, select a message queue for ingesting the network security source data and storing results, start an asynchronous file-monitoring thread that periodically reads the file attributes of the storage medium, and start an asynchronous observer thread that waits for notifications containing the aggregation policy;
an aggregation policy updating module, configured to read the content of the storage medium if the last modification time read from the storage medium has changed, and to notify the observer of the latest policy in the storage medium, whereupon the observer parses the latest policy and updates the configured aggregation policy accordingly;
an aggregation policy matching module, configured to read the received source data, match it with the corresponding aggregation policy, and obtain an execution handle of a time window based on the timestamp of the received source data;
an aggregation policy execution module, configured to calculate an aggregation feature from the key fields of the received source data, trigger the time window via the execution handle, and execute the following aggregation flow on the time window: determining, for each time window in turn, whether the aggregation feature already exists in the window store, and if so, executing the corresponding aggregation policy and sending the data in real time to the downstream auxiliary-data processor.
The system uses a real-time queue to ingest the data source and store results, uses a storage medium to configure and store the aggregation policy, and aggregates and stores data based on the source data and the aggregation policy. It adopts the observer pattern: an asynchronous thread periodically reads the attributes of the XML file; if the file's last modification time has changed, the thread reads the file content and notifies the observer of the latest aggregation policy, and the observer updates the set of aggregation policies held in its memory. Unlike the traditional approach, which uses server time as the time reference, the invention uses the time carried by the data as the time reference: data with the same aggregation feature is aggregated and sent downstream without delay, with no need to wait for a fixed time-window step, which ensures both real-time processing and data fidelity. In addition, the association analysis method provides a user interface through which data source structures are hot-loaded and early-warning policies are hot-deployed by drawing a rule graph. A policy can be defined by selecting one or more data sources and applying parallel association analysis or time-sequence association analysis as conditions require. When source data matches a rule, the early-warning information is automatically corroborated through association, which ensures efficient rule rollout and accurate alarms, and makes rule authoring applicable to multiple scenarios.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain the principles of the application. Other embodiments and many of the intended advantages of the embodiments will be readily appreciated as they become better understood by reference to the following detailed description. Other features, objects and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments, made with reference to the accompanying drawings, in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of a method of efficient aggregation and association analysis of network security source data in accordance with one embodiment of the present application;
FIG. 3 is a flow chart of hot-loading the aggregation policy in the aggregation method according to a specific embodiment of the present application;
FIG. 4 is a flow chart of aggregation policy execution according to a specific embodiment of the present application;
FIG. 5 is a flowchart of a correlation analysis method in accordance with a specific embodiment of the present application;
FIG. 6 is a diagram of the overall implementation of the aggregation method of one particular embodiment of the present application;
FIG. 7 is a diagram of an overall implementation of correlation analysis in accordance with a specific embodiment of the present application;
FIG. 8 is a block diagram of a system for efficient aggregation and association analysis of network security source data in accordance with one embodiment of the present application;
FIG. 9 is a schematic diagram of a computer system suitable for implementing an embodiment of the application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which a method of efficient aggregation and association analysis of network security source data of embodiments of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various applications, such as a data processing class application, a data visualization class application, a web browser application, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smartphones, tablets, laptop and desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services) or as a single piece of software or software module. The present application is not particularly limited in this respect.
The server 105 may be a server providing various services, such as a background information processing server providing support for source data presented on the terminal devices 101, 102, 103. The background information processing server may process the acquired aggregation policy and generate a processing result (e.g., an aggregation feature).
It should be noted that, the method provided in the embodiment of the present application may be executed by the server 105, or may be executed by the terminal devices 101, 102, 103, and the corresponding apparatus is generally disposed in the server 105, or may be disposed in the terminal devices 101, 102, 103.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services), or as a single piece of software or software module. The present application is not particularly limited in this respect.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 shows a flowchart of a method for efficient aggregation and association analysis of network security source data according to an embodiment of the present application. As shown in fig. 2, the method comprises the steps of:
S1: selecting a storage medium in which to configure an aggregation policy for the network security source data, and selecting a message queue for ingesting the network security source data and storing results; starting an asynchronous file-monitoring thread that periodically reads the file attributes of the storage medium, and starting an asynchronous observer thread that waits for notifications containing the aggregation policy.
In a specific embodiment, the storage medium includes storage components such as an XML file and MySQL.
In a specific embodiment, the message queue includes Kafka and RabbitMQ.
S2: if the last modification time read from the storage medium has changed, reading the content of the storage medium, notifying the observer of the latest policy in the storage medium, and then continuing to monitor the modification time of the storage medium; the observer parses the latest policy and updates the configured aggregation policy accordingly.
S3: reading the values of the configured fields in the received source data and merging them into a key-field key; comparing the key with the keys of the existing aggregation policies to match the corresponding aggregation policy; obtaining a window object from a time window manager based on the timestamp carried by the source data and the start time and end time of the aggregation policy, and, if the window does not exist, creating it and registering it with the window manager; and obtaining an execution handle of the window from the obtained window object. For example, according to the field combination selected in the UI, such as f1, f2, and so on, the specific values of the fields f1, f2, and so on in the source data are read and combined into a key-field key.
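The f1, f2 example above can be sketched as a small matching function. This is one possible reading of the step, offered under assumptions: a policy is taken to apply when the record contains every field the policy selects, the key is the selected values joined with `"|"`, and the names `match_policies`, `by_pair`, and `by_user` are hypothetical.

```python
def match_policies(record, policies):
    """Return (policy_name, key) for each policy whose fields the record has."""
    matches = []
    for name, policy in policies.items():
        fields = policy["fields"]
        if all(f in record for f in fields):
            # Merge the field values, in configured order, into one key.
            key = "|".join(str(record[f]) for f in fields)
            matches.append((name, key))
    return matches


policies = {
    "by_pair": {"fields": ["f1", "f2"]},
    "by_user": {"fields": ["user"]},
}
record = {"f1": "10.0.0.1", "f2": "10.0.0.2", "ts": 100}
print(match_policies(record, policies))  # [('by_pair', '10.0.0.1|10.0.0.2')]
```

Keeping the field order fixed by the policy means two records with the same values always produce the same key, which is what allows them to be recognized as sharing an aggregation feature.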
In a specific embodiment, the S3 specifically includes:
reading the received source data and matching it with the corresponding aggregation policy; extracting the timestamp of the received source data; calculating, from the timestamp and the aggregation time characteristic, the start time and end time of the time window to which the received source data belongs; and invoking the manager handle of the time window to query whether the time window exists;
if so, directly obtaining all execution handles of the time window;
if not, creating a new time window based on the start time and the end time, storing it in the time window manager, and obtaining all execution handles of the new time window.
S4: calculating an aggregation feature from the key fields of the received source data, triggering the time window via the execution handle, and executing the following aggregation flow on the time window: determining, for each time window in turn, whether the aggregation feature already exists in the window store, and if so, executing the corresponding aggregation policy and sending the data in real time to the downstream auxiliary-data processor.
In a specific embodiment, the S4 specifically includes:
according to the corresponding aggregation policy, calculating the aggregation feature of the received source data from its key fields, and running all the execution handles obtained in step S3 to execute the corresponding aggregation policy and determine whether data with the same aggregation feature already exists in the window store;
if so, the received source data is aggregated data; it is stored in the window store and sent to the downstream auxiliary-data processor;
if not, the received source data is sent to the downstream main-data processor.
In a specific embodiment, the method further comprises the step of carrying out association analysis on the network security source data, and specifically comprises the following steps:
A1: dynamically injecting the source data structure through a user interface, selecting a certain number of data sources from the input source data, and selecting a plurality of key fields from the selected data sources according to the association requirements to generate an association policy;
A2: when started, the association policy reads data asynchronously and is matched against the aggregation policy, and if the match succeeds, a pre-alarm event is generated;
A3: automatically querying the associated original logs in the message queue according to the labels of the key fields in the pre-alarm event to corroborate the pre-alarm event, and if the corroboration result conforms to the alarm characteristics, generating a quasi-alarm event and sending it downstream.
In a specific embodiment, the user interface is a UI for user interaction designed and built in the Java language.
In a specific embodiment, the A1 specifically includes:
maintaining, through the user interface, information on the data sources, including their origins and data structures, storing the information in a data source table, selecting a plurality of key fields from the selected data sources according to the association requirements, constructing an association graph from the selected key fields, and storing the association graph in an association analysis model table.
In a specific embodiment, the A2 specifically includes:
a monitoring thread periodically scans the data source table and the association analysis model table; if either table has additions or modifications, a new asynchronous thread is automatically started to execute the association analysis task, or the existing task thread is restarted, so that changes to the data sources and the association policies are applied to the running tasks in real time.
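The decision made on each scan can be sketched as a comparison of two snapshots. This is a minimal illustration, assuming each table row exposes a version or modification stamp; the function name `plan_task_changes` and the snapshot layout are hypothetical.

```python
def plan_task_changes(previous: dict, current: dict) -> dict:
    """Compare two scans (model id -> version stamp) and decide what to do.

    New ids get a fresh task thread; ids whose stamp changed get a restart.
    """
    started = sorted(k for k in current if k not in previous)
    restarted = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return {"start": started, "restart": restarted}


previous_scan = {"model_a": 1, "model_b": 3}
current_scan = {"model_a": 2, "model_b": 3, "model_c": 1}
plan = plan_task_changes(previous_scan, current_scan)
```

In a running system this function would be called from the periodic monitoring thread, with the returned plan driving thread start and restart.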
In a specific embodiment, the performing of the association analysis includes:
reading the association analysis model from the association analysis model table, converting the association analysis model into an executable logic diagram by using a rule compiler, converting the executable logic diagram into an execution task diagram by using a rule executor, and acquiring physical execution resources to start the related threads, thereby launching the association analysis task.
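The text does not specify how the rule compiler or rule executor work internally. As one possible reading, the sketch below treats the stored model as a dependency graph and derives an execution order in which every node runs after the nodes it depends on. The node names and the function `compile_to_task_order` are hypothetical; `graphlib.TopologicalSorter` is part of the Python standard library (3.9 and later).

```python
from graphlib import TopologicalSorter

# Hypothetical association graph: node -> set of nodes it depends on.
association_graph = {
    "auth_source": set(),
    "firewall_source": set(),
    "failed_login_filter": {"auth_source"},
    "port_scan_filter": {"firewall_source"},
    "same_src_ip_join": {"failed_login_filter", "port_scan_filter"},
}


def compile_to_task_order(graph: dict) -> list:
    """Return the nodes in an order where dependencies always come first."""
    return list(TopologicalSorter(graph).static_order())


task_order = compile_to_task_order(association_graph)
```

The two source branches are independent of each other, so an executor could run them on separate threads and join them at the final node.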
FIG. 3 is a flowchart of hot-loading the aggregation policy in the aggregation method according to a specific embodiment of the present invention. As shown in the figure, the aggregation policy is configured and updated as follows: the aggregation policy may be configured in any storage medium, most simply in an XML file. The method adopts the observer pattern: an asynchronous thread is started to read the attributes of the XML file periodically; if the file's last modification time has changed, the file contents are read, the observer is notified of the latest policy, and the observer updates the policy set held in its memory.
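The hot-loading loop can be sketched as follows. This is a minimal illustration under stated assumptions: the XML layout (`<policies>` containing `<policy id window keys>` elements) is invented for the example, and `parse_policies`, `PolicyFileWatcher`, and `poll_once` are hypothetical names. A real deployment would call `poll_once` from a periodic background thread; the demo calls it directly.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_policies(xml_text: str) -> dict:
    """Parse the hypothetical policy file into {policy id: settings}."""
    root = ET.fromstring(xml_text)
    return {
        p.get("id"): {"window": int(p.get("window")), "keys": p.get("keys").split(",")}
        for p in root.iter("policy")
    }


class PolicyFileWatcher:
    """Polls the file's last modification time and notifies observers on change."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._last_mtime = None
        self._observers = []

    def subscribe(self, callback) -> None:
        self._observers.append(callback)

    def poll_once(self) -> bool:
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._last_mtime:
            return False  # unchanged: keep monitoring
        self._last_mtime = mtime
        with open(self.path, encoding="utf-8") as fh:
            policies = parse_policies(fh.read())
        for notify in self._observers:
            notify(policies)  # each observer replaces its in-memory policy set
        return True


def write_policy(path: str, window: int) -> None:
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(f'<policies><policy id="p1" window="{window}" keys="src_ip,dst_port"/></policies>')


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "policies.xml")
    write_policy(path, 60)
    received = []
    watcher = PolicyFileWatcher(path)
    watcher.subscribe(received.append)
    first = watcher.poll_once()   # first poll loads the policy
    second = watcher.poll_once()  # nothing changed since the last poll
    write_policy(path, 120)
    stat = os.stat(path)
    # Force a visibly newer modification time so the demo does not depend on clock resolution.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    third = watcher.poll_once()   # the change is detected and observers are notified
```

Checking only the modification time keeps each poll cheap; the file is parsed only when that attribute changes.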
FIG. 4 is a flowchart of executing the aggregation policy according to a specific embodiment of the present invention. As shown in the figure, this covers both the matching and the execution of the aggregation policy, with the following steps:
reading the source data and querying the corresponding aggregation policy according to the source data; extracting the timestamp (st) of the received data, calculating the start time (bt) and end time (et) of the time window to which the data belongs according to the aggregation time characteristics, invoking the time window manager handle, and querying whether the window exists; if it exists, directly acquiring all execution handles of the window; if it does not exist, creating a time window with start time bt and end time et, storing it in the time window manager, and obtaining all execution handles of the time window; combining the values of the aggregation fields into the aggregation feature and executing the query handle of the time window: if a value with the same aggregation feature exists, the data is sent to the processing body for downstream auxiliary data; if not, the data is sent to the processing body for downstream main data.
FIG. 5 shows a flowchart of the association analysis method according to a specific embodiment of the present invention. The flow of the association analysis method is as follows:
providing a UI for user interaction, and maintaining the data sources through the UI, including their origins and data structures, which are stored in a data source table;
selecting, through the interface, one or more data sources needed for the association requirement, selecting a certain number of fields according to the association policy, constructing an association graph, and storing it in the association analysis model table;
a monitoring thread periodically scans the data source table and the association analysis model table; if either table has additions or modifications, a new asynchronous thread is automatically started to execute the association analysis task, or the existing task thread is restarted, so that changes to the data sources or the association policies are applied to the running tasks in real time;
after the association analysis model is read from the model table, a rule compiler converts the model into an executable logic diagram, a rule executor converts the executable logic diagram into an execution task diagram, and physical execution resources are acquired to start the related threads and launch the association analysis task. During execution, the task reads the configured data sources, converts them into a POJO-typed data stream, and evaluates the rules of the drawn association policy; if a pre-alarm event is generated, the original logs within a period of time are automatically searched according to the key field labels to judge whether they conform to the alarm characteristics, and if so, an alarm event is generated and sent downstream.
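The two-stage check in the last step (rule judgment, then corroboration against the original logs) can be sketched as follows. This is a minimal illustration: the rule, the threshold, the look-back period, and the names `Event`, `pre_alarm`, and `corroborate` are all hypothetical, and the `Event` dataclass merely stands in for the POJO records mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class Event:
    ts: int        # data timestamp in seconds
    src_ip: str    # key field used as the label for the log search
    kind: str


def pre_alarm(event: Event) -> bool:
    """Hypothetical association rule: every failed login raises a pre-alarm."""
    return event.kind == "login_failed"


def corroborate(event: Event, raw_logs: list, lookback: int, threshold: int) -> bool:
    """Search the original logs with the same key field label within the look-back period."""
    related = [
        log for log in raw_logs
        if log.src_ip == event.src_ip
        and log.kind == event.kind
        and event.ts - lookback <= log.ts <= event.ts
    ]
    # Assumed alarm characteristic: at least `threshold` matching logs.
    return len(related) >= threshold


raw_logs = [
    Event(100, "10.0.0.5", "login_failed"),
    Event(110, "10.0.0.5", "login_failed"),
    Event(115, "10.0.0.9", "login_failed"),
    Event(120, "10.0.0.5", "login_failed"),
]
alarms = [
    e for e in raw_logs
    if pre_alarm(e) and corroborate(e, raw_logs, lookback=60, threshold=3)
]
```

Only the third failed login from 10.0.0.5 is confirmed, since the earlier pre-alarms lack enough supporting logs within the look-back period.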
FIG. 6 shows the overall implementation architecture of the aggregation method according to a specific embodiment of the present invention. The aggregation method can dynamically configure the aggregation policy through an XML file or other storage components. It improves on the conventional approach, which uses server time as the time reference, by using data time as the time reference instead: data with the same features is aggregated and sent downstream without delay, with no need to wait for a fixed window step, which ensures real-time processing and data fidelity.
FIG. 7 shows the overall implementation architecture of the association analysis according to a specific embodiment of the present invention. The association analysis provides a UI through which data source data structures are hot-loaded and early-warning policies are hot-deployed by drawing rule diagrams. A policy can be formulated by selecting one or more data sources and performing parallel association analysis or time-sequence association analysis as conditions require. When source data successfully matches a rule, the early-warning information is automatically corroborated through association, which ensures that rules go online efficiently and that alarms are accurate, and makes rule writing applicable to multiple scenarios.
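The text does not define time-sequence association analysis in detail. One plausible reading is a rule of the form "event A is followed by event B from the same key within a time limit", sketched below. The event layout, the rule, and the function name `sequence_matches` are hypothetical.

```python
def sequence_matches(events: list, first_kind: str, second_kind: str,
                     key: str, max_gap: int) -> list:
    """Return (a, b) pairs where b of second_kind follows a of first_kind
    for the same key value within max_gap seconds of data time."""
    last_seen = {}
    matches = []
    for event in sorted(events, key=lambda e: e["ts"]):  # order by data time
        if event["kind"] == first_kind:
            last_seen[event[key]] = event
        elif event["kind"] == second_kind:
            earlier = last_seen.get(event[key])
            if earlier is not None and event["ts"] - earlier["ts"] <= max_gap:
                matches.append((earlier, event))
    return matches


events = [
    {"ts": 40, "src_ip": "1.1.1.1", "kind": "login_failed"},
    {"ts": 10, "src_ip": "1.1.1.1", "kind": "port_scan"},
    {"ts": 50, "src_ip": "2.2.2.2", "kind": "login_failed"},
    {"ts": 100, "src_ip": "3.3.3.3", "kind": "port_scan"},
    {"ts": 500, "src_ip": "3.3.3.3", "kind": "login_failed"},
]
pairs = sequence_matches(events, "port_scan", "login_failed", "src_ip", max_gap=60)
```

A parallel rule, by contrast, would test the two conditions without regard to order; only the sorting and the `last_seen` bookkeeping make this variant order-sensitive.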
The invention has the following advantages:
1) The method provided by the invention can be deployed in different complex industrial networks and is highly adaptable;
2) The method uses a central manager to execute tasks and can be deployed jointly across multiple hosts, giving it strong scalability;
3) The aggregation method ensures the real-time performance and authenticity of the source data, and the association method provides user interaction, is simple to operate, and supports flexible and rich rule definitions.
FIG. 8 shows a framework diagram of a system for efficient aggregation and association analysis of network security source data according to one embodiment of the present invention. The system includes an aggregation policy configuration module 801, an aggregation policy updating module 802, an aggregation policy matching module 803, and an aggregation policy execution module 804.
In a specific embodiment, the aggregation policy configuration module 801 is configured to select a storage medium in which to configure the aggregation policy for network security source data, select a message queue through which to access the network security source data and store results, start a file-monitoring asynchronous thread that periodically reads the file attributes of the storage medium, and start an observer asynchronous thread that waits for notifications containing the aggregation policy;
the aggregation policy updating module 802 is configured to read the content of the storage medium if the last modification time read from the storage medium has changed, notify the observer of the latest policy in the storage medium, and then continue monitoring the modification time of the storage medium, whereupon the observer parses the latest policy and updates the configured aggregation policy accordingly;
the aggregation policy matching module 803 is configured to read the values of the corresponding fields in the received source data, merge them into a key field key, compare the key field key with the keys of the existing aggregation policies to match the corresponding aggregation policy, acquire a window object from the time window manager based on the timestamp carried by the source data and the start time and end time of the aggregation policy, create the window object and register it with the window manager if the window does not exist, and acquire the execution handle of the window from the acquired window object;
the aggregation policy execution module 804 is configured to calculate an aggregation feature from the key fields of the received source data, trigger the time window through the execution handle, and execute the following aggregation flow on the time window: judging, for each time window in turn, whether the aggregation feature already exists in the window storage body, and if so, executing the corresponding aggregation policy and sending the received source data to the processing body for downstream auxiliary data in real time.
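The key-based policy matching performed by the matching module 803 can be sketched as follows. This is a minimal illustration: the matching fields, the `|` separator, the policy contents, and the names `policy_key` and `match_policy` are hypothetical.

```python
# Hypothetical fields whose values are merged into the key field key.
MATCH_FIELDS = ("device_type", "event_type")

# Hypothetical in-memory policy set, indexed by the merged key.
POLICIES = {
    "firewall|deny": {"window_seconds": 60, "key_fields": ["src_ip", "dst_port"]},
    "ids|alert": {"window_seconds": 300, "key_fields": ["src_ip", "signature"]},
}


def policy_key(record: dict) -> str:
    """Merge the values of the matching fields into one lookup key."""
    return "|".join(str(record.get(name, "")) for name in MATCH_FIELDS)


def match_policy(record: dict):
    """Return the aggregation policy for the record, or None if none matches."""
    return POLICIES.get(policy_key(record))


matched = match_policy(
    {"device_type": "firewall", "event_type": "deny", "src_ip": "10.0.0.5", "dst_port": 22}
)
unmatched = match_policy({"device_type": "router", "event_type": "info"})
```

Indexing the policies by the merged key turns matching into a single dictionary lookup per record, and a hot-loaded policy update only needs to replace this dictionary.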
The system uses a real-time queue to access the data source and store results, uses a storage medium to configure and store the aggregation policy, and aggregates and stores data based on the source data and the aggregation policy. The main program adopts the observer pattern: an asynchronous thread is started to read the attributes of the XML file periodically; if the file's last modification time has changed, the file contents are read and the observer is notified of the latest aggregation policy so that it updates the set of aggregation policies held in its memory. Unlike the conventional approach, which uses server time as the time reference, the method uses data time as the time reference: data with the same aggregation feature is aggregated and sent downstream without delay, with no need to wait for a fixed time window step, which ensures real-time processing and data authenticity.
Referring now to FIG. 9, there is illustrated a schematic diagram of a computer system 900 suitable for use in implementing an electronic device of an embodiment of the present application. The electronic device shown in fig. 9 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU) 901, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 909 and/or installed from the removable medium 911. The above-described functions defined in the method of the present application are performed when the computer program is executed by the Central Processing Unit (CPU) 901. The computer readable medium according to the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present application, however, the computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor, and the names of these modules do not, in some cases, constitute a limitation on the modules themselves.
Embodiments of the present application also relate to a computer readable storage medium having stored thereon a computer program which, when executed by a computer processor, implements the method as described above. The computer program contains program code for performing the method shown in the flowcharts. The computer readable medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two.
The method uses a real-time queue to access the data source and store results, uses a storage medium to configure and store the aggregation policy, and aggregates and stores data based on the source data and the aggregation policy. It adopts the observer pattern: an asynchronous thread is started to read the attributes of the XML file periodically; if the file's last modification time has changed, the file contents are read and the observer is notified of the latest aggregation policy so that it updates the set of aggregation policies held in its memory. Unlike the conventional approach, which uses server time as the time reference, the method uses data time as the time reference: data with the same aggregation feature is aggregated and sent downstream without delay, with no need to wait for a fixed time window step, which ensures real-time processing and data authenticity. In addition, the association analysis method provides a user interface through which data source data structures are hot-loaded and early-warning policies are hot-deployed by drawing rule diagrams. A policy can be formulated by selecting one or more data sources and performing parallel association analysis or time-sequence association analysis as conditions require. When source data successfully matches a rule, the early-warning information is automatically corroborated through association, which ensures that rules go online efficiently and that alarms are accurate, and makes rule writing applicable to multiple scenarios.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the technical features described above or their equivalents without departing from the inventive concept described above, for example, technical solutions formed by replacing the above features with technical features having similar functions that are disclosed in (but not limited to) the present application.

Claims (11)

1. A method for efficiently aggregating network security source data, comprising the steps of:
S1: selecting a storage medium to configure an aggregation policy for network security source data, simultaneously selecting a message queue to access the network security source data and store results, starting a file-monitoring asynchronous thread to periodically read the file attributes of the storage medium, and starting an observer asynchronous thread to wait for notifications containing the aggregation policy;
S2: if the last modification time read from the storage medium has changed, reading the content of the storage medium, notifying an observer of the latest policy in the storage medium, and then continuing to monitor the modification time of the storage medium, wherein the observer parses the latest policy and updates the configured aggregation policy according to the latest policy;
S3: reading the values of the corresponding fields in the received source data, merging them into a key field key, comparing the key field key with the keys of the existing aggregation policies to match the corresponding aggregation policy, acquiring a window object from a time window manager based on the timestamp carried by the source data and the start time and end time of the aggregation policy, creating the window object and registering it with the window manager if the window does not exist, and acquiring an execution handle of the window based on the acquired window object;
S4: calculating an aggregation feature from the key fields of the received source data, triggering the time window based on the execution handle, and executing the following aggregation flow on the time window: judging, for each time window in turn, whether the aggregation feature exists in a window storage body, and if so, sending the received source data to a processing body for downstream auxiliary data in real time by executing the corresponding aggregation policy;
performing association analysis on the network security source data, which specifically comprises the following steps:
A1: dynamically injecting a source data structure through a user interface, selecting a preset number of data sources from the input source data, and selecting a plurality of key fields from the selected data sources according to association requirements to generate an association policy;
A2: when started, the association policy reads data asynchronously and is matched against the aggregation policy, and if the match succeeds, a pre-alarm event is generated;
A3: automatically querying the associated original logs in the message queue according to the labels of the key fields in the pre-alarm event to corroborate the pre-alarm event, and if the corroboration result conforms to the alarm characteristics, generating a quasi-alarm event and sending it downstream.
2. The method of claim 1, wherein the storage medium is a storage component comprising an XML file and MySQL.
3. The method of claim 1, wherein the message queue comprises Kafka and RabbitMQ.
4. The method according to claim 1, wherein S3 specifically comprises:
reading the received source data and matching it with the corresponding aggregation policy, extracting the timestamp of the received source data, calculating the start time and end time of the time window to which the received source data belongs according to the timestamp and the aggregation time characteristics, invoking the handle of the time window manager, and querying whether the time window exists;
if so, directly acquiring all execution handles of the time window;
if not, creating a new time window based on the start time and the end time, storing the new time window in the time window manager, and acquiring all execution handles of the new time window.
5. The method according to claim 1, wherein S4 specifically comprises:
according to the corresponding aggregation policy, calculating the aggregation feature of the received source data from the key fields, and invoking all the execution handles acquired in step S3 so as to execute the corresponding aggregation policy and judge whether data with the same aggregation feature exists in the window storage body;
if so, the received source data is aggregated data: it is stored in the window storage body and sent to the processing body for downstream auxiliary data;
if not, the received source data is sent to the processing body for downstream main data.
6. The method of claim 1, wherein the user interface is a UI for user interaction designed and built in the Java language.
7. The method according to claim 1, wherein A1 comprises in particular:
maintaining, through the user interface, information on the data sources, including their origins and data structures, storing the information in a data source table, selecting a plurality of key fields from the selected data sources according to the association requirements, constructing an association graph from the selected key fields, and storing the association graph in an association analysis model table.
8. The method according to claim 7, wherein the A2 specifically comprises:
a monitoring thread periodically scans the data source table and the association analysis model table; if either table has additions or modifications, a new asynchronous thread is automatically started to execute the association analysis task, or the existing task thread is restarted, so that changes to the data sources and the association policies are applied to the running tasks in real time.
9. The method of claim 1, wherein the performing of the association analysis comprises:
reading the association analysis model from the association analysis model table, converting the association analysis model into an executable logic diagram by using a rule compiler, converting the executable logic diagram into an execution task diagram by using a rule executor, and acquiring physical execution resources to start the related threads, thereby launching the association analysis task.
10. A system for efficient aggregation of network security source data, comprising:
an aggregation policy configuration module, configured to select a storage medium to configure an aggregation policy for network security source data, simultaneously select a message queue to access the network security source data and store results, start a file-monitoring asynchronous thread to periodically read the file attributes of the storage medium, and start an observer asynchronous thread to wait for notifications containing the aggregation policy;
an aggregation policy updating module, configured to read the content of the storage medium if the last modification time read from the storage medium has changed, notify an observer of the latest policy in the storage medium, and then continue to monitor the modification time of the storage medium, wherein the observer parses the latest policy and updates the configured aggregation policy according to the latest policy;
an aggregation policy matching module, configured to read the values of the corresponding fields in the received source data, merge them into a key field key, compare the key field key with the keys of the existing aggregation policies to match the corresponding aggregation policy, acquire a window object from a time window manager based on the timestamp carried by the source data and the start time and end time of the aggregation policy, create the window object and register it with the window manager if the window does not exist, and acquire an execution handle of the window based on the acquired window object;
an aggregation policy execution module, configured to calculate an aggregation feature from the key fields of the received source data, trigger the time window based on the execution handle, and execute the following aggregation flow on the time window: judging, for each time window in turn, whether the aggregation feature exists in a window storage body, and if so, sending the received source data to a processing body for downstream auxiliary data in real time by executing the corresponding aggregation policy;
an association analysis module, configured to dynamically inject a source data structure through a user interface, select a preset number of data sources from the input source data, and select a plurality of key fields from the selected data sources according to association requirements to generate an association policy; wherein, when started, the association policy reads data asynchronously and is matched against the aggregation policy, and if the match succeeds, a pre-alarm event is generated; and the associated original logs in the message queue are automatically queried according to the labels of the key fields in the pre-alarm event to corroborate the pre-alarm event, and if the corroboration result conforms to the alarm characteristics, a quasi-alarm event is generated and sent downstream.
11. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a computer processor, implements the method of any one of claims 1 to 9.
CN202211362682.1A 2022-11-02 2022-11-02 Method and system for efficiently aggregating and associated analysis of network security source data Active CN115904369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211362682.1A CN115904369B (en) 2022-11-02 2022-11-02 Method and system for efficiently aggregating and associated analysis of network security source data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211362682.1A CN115904369B (en) 2022-11-02 2022-11-02 Method and system for efficiently aggregating and associated analysis of network security source data

Publications (2)

Publication Number Publication Date
CN115904369A (en) 2023-04-04
CN115904369B (en) 2023-10-13

Family

ID=86471785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211362682.1A Active CN115904369B (en) 2022-11-02 2022-11-02 Method and system for efficiently aggregating and associated analysis of network security source data

Country Status (1)

Country Link
CN (1) CN115904369B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756197B (en) * 2023-08-23 2023-11-07 中国电信股份有限公司 Method, system and communication equipment for realizing dynamic window and aggregation parameters

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729559A (en) * 2017-11-08 2018-02-23 携程旅游网络技术(上海)有限公司 Method, system, equipment and the storage medium of data base read-write asynchronous access
CN110597891A (en) * 2018-06-12 2019-12-20 武汉斗鱼网络科技有限公司 Device, system, method and storage medium for aggregating MySQL into PostgreSQL database
CN110612716A (en) * 2017-01-20 2019-12-24 十位数通信有限责任公司 Intermediate device for network routing of data messages
CN110806958A (en) * 2019-10-24 2020-02-18 长城计算机软件与系统有限公司 Monitoring method, monitoring device, storage medium and electronic equipment
CN113179267A (en) * 2021-04-27 2021-07-27 长扬科技(北京)有限公司 Network security event correlation analysis method and system
CN115221116A (en) * 2022-07-25 2022-10-21 深圳市网心科技有限公司 Data writing method, device and equipment and readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8365198B2 (en) * 2008-12-09 2013-01-29 Microsoft Corporation Handling exceptions in a data parallel system
US9509765B2 (en) * 2014-07-31 2016-11-29 Splunk Inc. Asynchronous processing of messages from multiple search peers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RAMSYS: Resource-Aware Asynchronous Data Transfer with Multicore SYStems;T. Li 等;《IEEE Transactions on Parallel and Distributed Systems》;第28卷(第05期);1430-1444 *
一种异步多线程的Web数据流高效处理模型;张天庆 等;《四川大学学报(自然科学版)》;第42卷(第02期);264-269 *

Also Published As

Publication number Publication date
CN115904369A (en) 2023-04-04

Similar Documents

Publication Publication Date Title
US20230004434A1 (en) Automated reconfiguration of real time data stream processing
CN107506451B (en) Abnormal information monitoring method and device for data interaction
CN111339186B (en) Workflow engine data synchronization method, device, medium and electronic equipment
CN110807067B (en) Data synchronization method, device and equipment for relational database and data warehouse
US11502930B2 (en) Method and system for generating alerts using parameter based network monitoring for alert conditions
CN111309550A (en) Data acquisition method, system, equipment and storage medium of application program
CN110532322B (en) Operation and maintenance interaction method, system, computer readable storage medium and equipment
CN110704290A (en) Log analysis method and device
US11934287B2 (en) Method, electronic device and computer program product for processing data
CN111240940B (en) Real-time service monitoring method and device, electronic equipment and storage medium
CN115904369B (en) Method and system for efficiently aggregating and associated analysis of network security source data
CN110895534A (en) Data splicing method, device, medium and electronic equipment
CN114091704A (en) Alarm suppression method and device
CN111104214B (en) Workflow application method and device
CN111241189A (en) Method and device for synchronizing data
CN117271584A (en) Data processing method and device, computer readable storage medium and electronic equipment
CN115514618A (en) Alarm event processing method and device, electronic equipment and medium
CN114661807A (en) Method, device, equipment and medium for processing abnormity of flight management system
CN115495740A (en) Virus detection method and device
CN114756301A (en) Log processing method, device and system
CN114090514A (en) Log retrieval method and device for distributed system
CN112749204B (en) Method and device for reading data
CN114546780A (en) Data monitoring method, device, equipment, system and storage medium
CN113779017A (en) Method and apparatus for data asset management
CN110262756B (en) Method and device for caching data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant