CN115801353A

CN115801353A - Linkage script processing method after real-time aggregation of safety event logs based on big data level

Info

Publication number: CN115801353A
Application number: CN202211367585.1A
Authority: CN
Inventors: 朱琪; 周淼淼; 方波; 梁忠辉
Original assignee: Smart Net Anyun Wuhan Information Technology Co ltd
Current assignee: Smart Net Anyun Wuhan Information Technology Co ltd
Priority date: 2022-11-03
Filing date: 2022-11-03
Publication date: 2023-03-14

Abstract

The invention discloses a linkage script processing method after real-time aggregation of big-data-scale security event logs, comprising the following steps: receiving logs from Kafka; judging whether a log conforms to the security event rules and, if so, proceeding to step S3, otherwise not processing it; performing an attribute enhancement operation on the log to obtain an enhanced log; checking the occurrence time of the log and generating a unique ID for the security event; inserting the security event ID into the security event table in Kudu while simultaneously forwarding the security event to the corresponding service topic in Kafka; and handling the security event with a script handling program on the service topic. The invention greatly reduces the load that security event logs and the script handling program place on the storage layer and improves the real-time performance of the whole system.

Description

Linkage script processing method after real-time aggregation of safety event logs based on big data level

Technical Field

The invention relates to the field of computer data processing, and in particular to a linkage script processing method after real-time aggregation of big-data-scale security event logs.

Background

At present, in the prior art, when security events must be generated by aggregating logs according to occurrence time, the logs are usually written to storage first, and a scheduled task is then invoked to aggregate the stored logs into security events (for example, query 3.

Disclosure of Invention

In order to solve the problems of data omission and poor real-time performance when processing security logs in the prior art, the invention provides a linkage script processing method after real-time aggregation of big-data-scale security event logs, which comprises the following steps:

S1, receiving logs from Kafka;

S2, judging whether the log conforms to the security event rules, and if so, proceeding to step S3; otherwise, not processing it;

S3, performing an attribute enhancement operation on the log to obtain an enhanced log;

S4, checking the occurrence time of the log and generating a unique ID for the security event;

S5, inserting the security event ID into the security event table in Kudu, and simultaneously forwarding the security event to the corresponding service topic in Kafka;

S6, handling the security event using a script handling program on the service topic.

The beneficial effects of the invention are as follows: the usual logic of generating security events by periodically querying the database and aggregating logs is replaced with logic that generates security events in real time, and the script program no longer relies on table lookups to process security events. This greatly reduces the load that security event logs and the script handling program place on the storage layer and improves the real-time performance of the whole system.

Drawings

FIG. 1 is a schematic flow diagram of the process of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.

Before elaborating on the present application, a unified description of related concepts is first provided, as follows:

1. Kudu: a columnar storage system open-sourced by Cloudera and a member (incubating) of the Apache Hadoop ecosystem. It is designed for fast analysis of rapidly changing data and fills a gap in the existing Hadoop storage layer.

2. Kafka: an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all of the user activity stream data of a website.

3. Topic: a purely logical concept that represents a class of messages and can also be regarded as the place to which messages are sent. Topics are usually used to separate actual services, for example service A using one topic and service B using another.

4. Postgres database: a free, full-featured object-relational database management system.

Referring to FIG. 1, FIG. 1 is a flow chart of the method of the present invention. The invention provides a linkage script processing method after real-time aggregation of big-data-scale security event logs, which comprises the following steps:

S1, receiving logs from Kafka;

It should be noted that the logs from Kafka are gathered by a log collector or by the database collection service logstack, managed through the corresponding configuration, normalized, and then forwarded over HTTP to yield the log file. This is not central to the present application and is explained only briefly.

It should be noted that, for example, the original log is:

<14> -Mon Oct 31 2022 14: 2022-10-31: brute force cracking alarm name: brute force alarm rating: high risk alarm category: other trigger conditions: content [ A ] "id": content [ A ] "jsonrpc": content [ A ] "error" content [ A ] "result" content [ A ] "status": alarm type: data flow characteristic value alert source IP address: 37.59.65.41 source port: 10 destination IP address: 192.168.181.236 destination port: 1212 cumulative number of triggers: 1;

Security event rules:

1. The log risk level is high, medium, or low.

2. The alarm name and the link name contain "brute force cracking", in which case the log is classified as a brute-force-cracking security event.

If both rules are met, the log is classified as a security event log and processed further. The log above conforms to the security event rules and is therefore processed further. Because this application is mainly concerned with security events, non-security events are not processed.
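The two rules above can be sketched as a simple predicate. This is a minimal sketch, assuming the log has already been parsed into a dictionary; the field names `risk_level` and `alarm_name`, the English level labels, and the keyword string are hypothetical and do not come from the original. The link-name check is left out for brevity.

```python
from typing import Optional

RISK_LEVELS = {"high", "medium", "low"}  # rule 1 (labels are assumed)
KEYWORD = "brute force"                  # rule 2 keyword (assumed English form)


def classify_log(log: dict) -> Optional[str]:
    """Return the security event type, or None if the log is not a security event."""
    # Rule 1: the risk level must be one of the recognised levels.
    if log.get("risk_level", "").lower() not in RISK_LEVELS:
        return None
    # Rule 2: the alarm name must contain the brute-force keyword.
    if KEYWORD not in log.get("alarm_name", "").lower():
        return None
    return "brute force cracking"
```

A log that fails either rule returns `None` and would simply be dropped, which mirrors the "otherwise, not processing" branch of step S2.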

It should be noted that attribute enhancement specifically means supplementing the acquired log with additional information. For example, the original log is:

<14> Mon Oct 31 2022 14 family 40.192.168.184.111 family network full traffic safety analysis System-Server trigger time: 2022-10-31: brute force cracking alarm name: brute force alarm rating: high-risk alarm types: other trigger conditions: content: [ A ] "id": content: [ A ] "jsonrpc": content: [ A ] "error" content: [ A ] "result" content: [ A ] "status": alarm type: data flow characteristic value alert source IP address: 37.59.65.41 source port: 10 destination IP address: 192.168.181.236 destination port: 1212 cumulative number of triggers: 1;

Log structure after attribute enhancement:

"id": 37.59.65.41, 10, 192.168.182.173, 1212| password attack | violence crack |2022-10-29, 05: "password attack", "dst _ ip": 4325 ": result _ port", 1212"," depth ": 192.168.184.111", "alarm _ name", violence crack "," category ":": password attack "," event _ chip _ type ", etc", "attack _ stage", attack "," type strand ",", "hierarchy": height "," trigger _ start _ e ", 2022-10-29", 09 "02", "pointer _ end _ time", 2022-10-19 ", and" pointer _ end _ pointer ": result": "event _ name": brute force crack "," rule _ id ": null", "src _ ip _ long":624640297 "," dst _ ip _ long ": 3232282285", "date _ time": 2022-10-29"}

After processing, the log is therefore enriched with attributes such as the IP address, the asset associated with the IP, and the event category to which the log belongs.
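One part of the enhancement can be illustrated concretely: the `src_ip_long` and `dst_ip_long` values in the enhanced log are the integer forms of the dotted IPv4 addresses. The sketch below is only an illustration, assuming parsed input fields named `src_ip`, `dst_ip`, and `alarm_name`; the `CATEGORY_BY_ALARM` lookup is a hypothetical stand-in for whatever category and asset mapping the real system uses.

```python
import ipaddress

# Hypothetical mapping from alarm name to event category (an assumption).
CATEGORY_BY_ALARM = {"brute force cracking": "password attack"}


def enhance(log: dict) -> dict:
    """Return a copy of the log with derived attributes added."""
    enhanced = dict(log)
    # Integer forms of the addresses, as in src_ip_long / dst_ip_long.
    enhanced["src_ip_long"] = int(ipaddress.IPv4Address(log["src_ip"]))
    enhanced["dst_ip_long"] = int(ipaddress.IPv4Address(log["dst_ip"]))
    # Event category lookup; unknown alarm names fall back to "other".
    enhanced["category"] = CATEGORY_BY_ALARM.get(log.get("alarm_name"), "other")
    return enhanced
```

For the addresses in the enhanced log, `37.59.65.41` converts to 624640297 and `192.168.182.173` converts to 3232282285, which agrees with the values shown there.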

As an embodiment, the unique ID of the security event in this application is composed of the source IP, destination IP, source port, destination port, event type, event subtype, and event time period of the log. For example, the ID is: 37.59.65.41|192.168.184.1|1212|password attack|brute force cracking|2020-10-20. This example is illustrative only. Of course, in other embodiments the ID may include other components, which are not limited here.
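A minimal sketch of how such an ID could be assembled is given here. The `|` separator, the field order, the dictionary keys, and the use of the calendar date as the time period are all assumptions made for illustration; the original does not fix them.

```python
from datetime import datetime


def build_event_id(log: dict, occurred_at: datetime) -> str:
    """Join the identifying fields of a log into one aggregation key."""
    parts = [
        log["src_ip"],
        log["dst_ip"],
        str(log["src_port"]),
        str(log["dst_port"]),
        log["event_type"],
        log["event_subtype"],
        occurred_at.strftime("%Y-%m-%d"),  # assumed: one event per calendar day
    ]
    return "|".join(parts)
```

Because the key ignores the time of day, two logs with the same endpoints and event type on the same day map to the same ID, which is what lets later logs be aggregated into an existing security event instead of creating a new one.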

It should be noted that storing the security event and forwarding the security event are performed simultaneously. This ensures real-time log processing, and the script handling program does not need to query the security event table (network_attack) stored in Kudu in order to process the security event, which is a key point of the present application.

The simultaneous operation is explained as follows. A Spark computing task itself supports two kinds of parallelism:

1. Data parallelism: each program submitted to Spark is an application, and each application can generate multiple jobs. Each action triggers one job, a job is divided into multiple stages, and a new stage begins wherever a shuffle occurs, so the work is completed in a data-parallel manner.

2. Physical parallelism: a Spark job can itself be split into multiple tasks that run on different machines.

Because the security event is already fully determined when it is generated, there is no dependency between the two actions of storing the security event and forwarding it, and they can be executed simultaneously using these Spark characteristics.

Concretely, when the code inserts the processed event data into the database, it also makes a copy of the security event data and forwards it to the Kafka topic.

The whole process can be reduced to a security event generation part and a script handling part. After a log arrives and has been parsed and enhanced, a security event is generated directly from the occurrence time of the log and stored in the Kudu security event table; at the same time, the security event is forwarded to the corresponding Kafka topic. Storing and forwarding run simultaneously, which guarantees real-time log processing.
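The independence of the two actions can be illustrated without Spark. The following is a sketch only: it uses a thread pool in place of Spark, a dictionary as a stand-in for the Kudu security event table, and a list as a stand-in for the Kafka topic. All names are hypothetical, and no real Kudu or Kafka client is involved.

```python
from concurrent.futures import ThreadPoolExecutor

event_table: dict = {}     # stand-in for the Kudu security event table
topic_messages: list = []  # stand-in for the Kafka service topic


def store_event(event: dict) -> None:
    # Insert or overwrite the row keyed by the unique event ID.
    event_table[event["id"]] = event


def forward_event(event: dict) -> None:
    # Forward a copy so the consumer never depends on the stored row.
    topic_messages.append(dict(event))


def store_and_forward(event: dict) -> None:
    """Run both actions concurrently; neither needs the other's result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(store_event, event),
            pool.submit(forward_event, event),
        ]
        for future in futures:
            future.result()  # re-raise any error from either action
```

Because the forwarded message carries the full event, a downstream consumer can act on it directly without reading `event_table`, which is the property the method relies on to keep load off the storage layer.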

The script handling program in step S6 uses the script tables of a Postgres database: the security events are filtered according to the information provided by the script tables, and the corresponding handling operations are then executed.

Specifically, the forwarded security event is first received from Kafka. The filtering rules are read from the Postgres script tables, and the security event is checked against them.

Whether the security event has already been handled is then determined from its unique ID. If not, the security event information is added to the script table of the Postgres database; if the processing is finished, the corresponding handling operations are executed, such as closing the security event, blocking the attacking IP through a firewall, or notifying the relevant personnel of the security event.

Finally, the handling state of the security event is updated and the event is saved to the Postgres database.
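One possible reading of this handling flow is sketched below. It is an interpretation under stated assumptions, not the patented implementation: the `handled` dictionary stands in for the Postgres record of handling states, the rule and action are passed in as plain callables, and the state names are invented for illustration.

```python
from typing import Callable

# Stand-in for the Postgres record of handling states, keyed by event ID.
handled: dict = {}


def handle_event(
    event: dict,
    rule: Callable[[dict], bool],
    action: Callable[[dict], None],
) -> str:
    """Process one forwarded security event and return its handling state."""
    if not rule(event):
        return "ignored"              # filtered out by the script rule
    if handled.get(event["id"]) == "done":
        return "already handled"      # unique ID seen before; skip repeat work
    handled[event["id"]] = "pending"  # record the event before acting
    action(event)                     # e.g. close event, block IP, notify staff
    handled[event["id"]] = "done"     # update the handling state
    return "done"
```

Keying the state on the unique event ID means a repeated forward of the same aggregated event does not trigger the handling action a second time.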

The script table comprises a script configuration table, a script node table, and a script child node table.

As an example, refer to Table 1, which describes the fields of the script configuration table.

Script configuration table: mainly records the script id, script name, script description, data source type, script node set, creation time, modification time, and latest response time; data is distinguished by data source type.

Table 1 script configuration table

Refer to Table 2, which describes the fields of the script node table.

Script node table: mainly records the node id, node type, node name, child node set, and node order; rules and disposals are distinguished by node type, and the child node set records the attributes and data of the child nodes.

Table 2 script node table

Refer to Table 3, which describes the fields of the script child node table.

Script child node table: mainly records the child node id, child node name, attribute, operator, and value; the attribute and value are recorded according to the node name.

Table 3 script child node table
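The three tables described above can be pictured as nested records. The sketch below models them as Python dataclasses and evaluates the rule nodes of a script against an event. It is illustrative only: the timestamp fields are omitted, the operator names in `OPERATORS`, the node type labels `"rule"` and `"disposal"`, and the all-conditions-must-match semantics are assumptions rather than details given in the original.

```python
import operator as op
from dataclasses import dataclass, field
from typing import List

# Assumed operator vocabulary for child node conditions.
OPERATORS = {"eq": op.eq, "ne": op.ne, "contains": lambda a, b: b in a}


@dataclass
class ChildNode:
    id: int
    name: str
    attribute: str
    operator: str
    value: str


@dataclass
class Node:
    id: int
    type: str  # assumed labels: "rule" or "disposal"
    name: str
    order: int
    children: List[ChildNode] = field(default_factory=list)


@dataclass
class Script:
    id: int
    name: str
    description: str
    data_source_type: str
    nodes: List[Node] = field(default_factory=list)


def matches(script: Script, event: dict) -> bool:
    """True if the event satisfies every condition of every rule node."""
    for node in sorted(script.nodes, key=lambda n: n.order):
        if node.type != "rule":
            continue  # disposal nodes describe actions, not filters
        for child in node.children:
            test = OPERATORS[child.operator]
            if not test(event.get(child.attribute, ""), child.value):
                return False
    return True
```

A script with one rule node holding the condition `level eq high` would match an event whose `level` is `high` and reject one whose `level` is `low`.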

Finally, to implement the invention, the scheduled task may be packaged into the jar run by Spark, and Spark executes the security event generation program (i.e., steps S1 to S5), while a separate script handling program (step S6) is started; the latter may be a distributed program or a Java program.

In conclusion, the beneficial effects of the invention are as follows: the usual logic of generating security events by periodically querying the database and aggregating logs is replaced with logic that generates security events in real time, and the script program no longer relies on table lookups to process security events. This greatly reduces the load that security event logs and the script handling program place on the storage layer and improves the real-time performance of the whole system.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims

1. A linkage script processing method after real-time aggregation of big-data-scale security event logs, characterized by comprising the following steps:

S1, receiving logs from Kafka;

2. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 1, wherein: in step S5, if the corresponding security event ID already exists in the Kudu security event table, the corresponding security event is updated, and the updated security event is forwarded to the corresponding service topic in Kafka.

3. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 1, wherein: the script handling program in step S6 uses the script tables of a Postgres database; the security events are filtered according to the information provided by the script tables, and the corresponding handling operations are then executed.

4. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 3, wherein: the script table comprises a script configuration table, a script node table, and a script child node table.

5. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 4, wherein: the script configuration table includes the filtering information of the script, and data is distinguished by data source type.

6. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 4, wherein: the script node table includes the node id, node type, node name, child node set, and node order; rules and disposals are distinguished by node type, and the child node set records the attributes and data of the child nodes.

7. The linkage script processing method after real-time aggregation of big-data-scale security event logs as claimed in claim 6, wherein: the script child node table records the attributes and values according to the node names.