CN111858251A - Big data computing technology-based data security audit method and system - Google Patents

Info

Publication number
CN111858251A
Authority
CN
China
Prior art keywords
data
log
log data
real
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010713842.7A
Other languages
Chinese (zh)
Other versions
CN111858251B (en)
Inventor
刘迎风
冯桂安
梁满
冯骏
何怡
傅行晓
周亚美
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Big Data Center
Original Assignee
Shanghai Big Data Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Big Data Center
Priority to CN202010713842.7A
Publication of CN111858251A
Application granted
Publication of CN111858251B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a data security audit method and system based on big data computing technology, belonging to the field of big data security. The method comprises: collecting log data from a server and sending the log data to a stream processing platform; receiving one or more items of log data, parsing them, and sending the parsed log data to at least one data destination; classifying the parsed log data as real-time or non-real-time data, sending the real-time data to the stream processing platform for storage, and sending the non-real-time data to a data center for storage; analyzing the real-time and non-real-time log data separately to obtain an analysis result; and generating and outputting corresponding alarm information according to the analysis result. The beneficial effects of the invention are: log collection and storage are implemented on Flume, task scheduling and task monitoring are introduced, and the Flume log collection sources and output destinations are enriched; data security audit, alarm monitoring management and handling, and security risk identification are implemented on Flink.

Description

Big data computing technology-based data security audit method and system
Technical Field
The invention relates to the field of big data security, in particular to a data security auditing method and system based on big data computing technology.
Background
In recent years, data security audit systems have become increasingly important. A data audit system monitors and records the operations performed on a data server, analyzes network data, analyzes server operations intelligently and in real time, and records them in an audit database for later query, analysis and filtering, thereby monitoring and auditing user operations on the target system. In particular, when public data resources from various industries are integrated and used, a data security audit system is urgently needed to safeguard the secure application, shared exchange and opening of data.
In existing data circulation and use, audit safeguards are lacking. Security events are queried by having many staff manually filter by conditions and search a massive log library, so audit efficiency is low and the results are strongly affected by human factors. Audits are therefore untimely and insufficiently rigorous, the requirements of data security audit cannot be met, and security risks exist throughout data circulation and use. In addition, traditional big data computing methods are constrained by disk read/write performance and network performance, and are inefficient at querying, computing and storing real-time data. A data security audit method and system based on big data computing technology is therefore urgently needed to meet practical requirements.
Disclosure of Invention
To solve the above technical problems, the invention provides a data security audit method and system based on big data computing technology.
The technical problem is solved by the following technical solution:
The invention provides a data security audit method based on big data computing technology, comprising the following steps:
step S1, collecting log data from a server, and sending the collected log data to a stream processing platform;
step S2, receiving one or more items of log data from the stream processing platform, parsing the log data, and outputting the parsed log data to at least one data destination;
step S3, classifying the parsed log data and determining whether it is real-time data or non-real-time data:
if the data is real-time data, sending it to the stream processing platform for storage;
if the data is non-real-time data, sending it to a data center for storage;
step S4, according to the classification in step S3, analyzing the real-time and non-real-time log data separately to obtain an analysis result, and outputting the analysis result;
and step S5, generating and outputting corresponding alarm information according to the analysis result.
Preferably, in step S1, the stream processing platform and the collection status and collection volume of the log data are continuously managed and monitored during log data collection.
Preferably, the real-time data is analyzed online, and the non-real-time data is analyzed offline;
the online analysis comprises:
step A1, classifying the real-time data and storing it in a cluster of the stream processing platform, wherein the cluster comprises a global event and at least one internal event;
step A2, performing real-time correlation analysis on the global event and the at least one internal event;
step A3, determining whether the event is an internal event:
if yes, going to step A4;
if not, generating an internal event and storing it among the internal events of the cluster;
step A4, when debugging and monitoring are judged to be needed, outputting a first analysis result;
the offline analysis comprises:
step B1, storing offline rules in advance, and issuing the offline rules to the stream processing platform;
step B2, receiving the offline rules, and retrieving log data from the data center according to the offline rules;
step B3, performing batch analysis on the non-real-time log data, outputting a second analysis result, and publishing the second analysis result to the stream processing platform;
and step B4, receiving the second analysis result and sending it to the document database.
Preferably, in step S2, at least one parsing node parses the log data, and the parsing steps are as follows:
step 21: initializing the log data;
step 22: extracting valid log information from the log data;
step 23: processing the log information to obtain log data of at least one data type, and sending the log data to the at least one data destination.
Preferably, in step S1, the log collection system is controlled to collect the log data through a functional configuration of the log collection system, where the functional configuration includes the collection frequency, the collection time period, and the starting and stopping of tasks.
The invention also provides a data security audit system based on big data computing technology, which applies the above data security audit method and comprises:
a task scheduling module, connected with the log collection system and configured to collect log data from the server and send the collected log data to the stream processing platform;
a parsing module, connected with the stream processing platform and configured to receive one or more items of log data from the stream processing platform, parse the log data, and output the parsed log data to at least one data destination;
an audit analysis module, connected with the parsing module and configured to classify the parsed log data and determine whether it is real-time data or non-real-time data:
if the data is real-time data, sending it to the stream processing platform for storage;
if the data is non-real-time data, sending it to a data center for storage;
the audit analysis module analyzes the log data to obtain an analysis result and outputs the analysis result;
and an alarm module, connected with the audit analysis module and configured to generate and output corresponding alarm information according to the analysis result.
Preferably, the data security audit system further includes a monitoring module, connected to the log collection system and the stream processing platform respectively and configured to continuously manage and monitor the stream processing platform and the collection status and collection volume of the log collection system during log data collection.
Preferably, the audit analysis module comprises:
an online analysis engine, connected with the stream processing platform and configured to perform real-time correlation analysis on the global event and the plurality of internal events of the stream processing platform and to output a first analysis result;
and an offline analysis engine, connected with the data center and configured to retrieve log data from the data center according to the issued offline rules, perform batch analysis on the non-real-time log data, and output a second analysis result.
Preferably, the alarm module comprises:
a first alarm unit, connected with the online analysis engine and configured to generate and output corresponding first alarm information according to the first analysis result;
and a second alarm unit, connected with the offline analysis engine and configured to generate and output corresponding second alarm information according to the second analysis result.
Preferably, the parsing module includes a plurality of parsing nodes, each provided with a parser configured to initialize the log data, extract valid log information from the log data, obtain log data of at least one data type from the log information, and send the log data to the at least one data destination.
The invention has the beneficial effects that:
Log collection and storage are implemented on the open-source log collection framework Flume; the log collection system is functionally extended, the concepts of task scheduling and task monitoring are introduced, and the log collection sources and output destinations of the log collection system are enriched. Functional modules for data security audit, alarm monitoring management and handling, and security risk identification access are developed by modeling on the open-source stream processing engine Flink, thereby building a data security audit system. Security audit is carried out over the whole data life cycle of collection, transmission, storage, processing, exchange and destruction, providing relatively comprehensive security management services for a big data resource platform and ensuring the normal circulation and use of data. Meanwhile, the system can continuously inspect, detect and give early warning of abnormal and non-compliant behaviors in the service support system, promptly detect operations involving confidential data, accurately and quickly identify the operator responsible, and preserve relevant evidence that can be used for accountability.
Drawings
FIG. 1 is a flow chart of a data security auditing method based on big data computing technology in the present invention;
FIG. 2 is a flow chart of log data parsing in the present invention;
FIG. 3 is a flow chart of an on-line analysis in the present invention;
FIG. 4 is a flow chart of an off-line analysis in the present invention;
FIG. 5 is a block diagram of a task scheduling and monitoring architecture according to the present invention;
FIG. 6 is a schematic diagram of the operation of the stream processing engine (Flink) according to the present invention;
FIG. 7 is a block diagram of the flow of an online policy in the present invention;
FIG. 8 is a block diagram of the flow of an offline policy in the present invention;
FIG. 9 is a block diagram of a data security audit system according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
The invention provides a data security audit method based on big data computing technology, belonging to the field of big data security. As shown in FIGS. 1 and 5, the data security audit method comprises the following steps:
step S1, collecting log data from a server, and sending the collected log data to a stream processing platform;
step S2, receiving one or more items of log data from the stream processing platform, parsing the log data, and outputting the parsed log data to at least one data destination;
step S3, classifying the parsed log data and determining whether it is real-time data or non-real-time data:
if the data is real-time data, sending it to the stream processing platform for storage;
if the data is non-real-time data, sending it to a data center for storage;
step S4, according to the classification in step S3, analyzing the real-time and non-real-time log data separately to obtain an analysis result, and outputting the analysis result;
and step S5, generating and outputting corresponding alarm information according to the analysis result.
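Steps S1 to S5 can be sketched as a small in-memory pipeline. This is a minimal illustration only: plain Python lists stand in for the stream processing platform and the data center, and the `realtime` flag, the `failed_login` rule and all function names are hypothetical and do not come from the patent.

```python
import json

def collect(raw_lines):
    """S1: hand collected raw log lines to the (stand-in) stream platform."""
    return list(raw_lines)

def parse(raw_lines):
    """S2: parse each raw line (assumed here to be JSON) into a dict."""
    return [json.loads(line) for line in raw_lines]

def classify(records):
    """S3: route records by an assumed boolean 'realtime' field."""
    stream_store, data_center = [], []
    for rec in records:
        (stream_store if rec.get("realtime") else data_center).append(rec)
    return stream_store, data_center

def analyze(records, source):
    """S4: flag records matching a hypothetical rule (failed logins)."""
    return [{"source": source, "user": r["user"]}
            for r in records if r.get("action") == "failed_login"]

def alarms(findings):
    """S5: turn analysis results into alarm messages."""
    return [f"ALARM[{f['source']}]: failed login by {f['user']}" for f in findings]

raw = [
    '{"user": "alice", "action": "failed_login", "realtime": true}',
    '{"user": "bob", "action": "query", "realtime": false}',
    '{"user": "carol", "action": "failed_login", "realtime": false}',
]
online, offline = classify(parse(collect(raw)))
messages = alarms(analyze(online, "online") + analyze(offline, "offline"))
for message in messages:
    print(message)
```

In a deployment along the lines described here, `collect` would correspond to Flume, the two stores to Kafka and HDFS, and `analyze` to Flink jobs; the sketch only shows how the five steps hand data to one another.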
Specifically, functional configuration of the log collection system 1 is performed through a web front end, which controls the log collection system 1 to collect log data in batches; the collected log data is retrieved and published to the stream processing platform.
Further, in this embodiment, the log collection system 1 provided by the invention is a distributed, highly reliable and highly available collection system. It is based on the open-source framework Flume and can collect, aggregate and move large volumes of log data from different data sources to the data center (Hadoop Distributed File System, HDFS) for storage.
The stream processing platform may be the open-source Apache Kafka, which is written in Scala and Java. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data generated by users of a website. Kafka unifies online and offline message processing through the parallel loading mechanism of Hadoop, and can also provide real-time messages across a cluster.
Specifically, one item of log data, or multiple items subscribed in batch, is read from the stream processing platform by subscription, and each is parsed. The parsing methods include: flattening of multi-level JSON, regular-expression parsing of irregular text, and mapping to database table fields. The parsed log data is converted and output to at least one data destination; the data destinations include the open-source stream processing platform (Kafka), the distributed file system (HDFS), the Lucene-based search server (Elasticsearch), HttpFS, the open-source database (HBase), files, relational databases, and the like.
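Two of the parsing methods named above, flattening multi-level JSON and mapping fields to database table columns, can be illustrated as follows. This is a sketch under assumptions: the dotted-key convention, the sample event and the `FIELD_MAP` column names are all hypothetical.

```python
import json

def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts/lists into a single-level dict with dotted keys."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = ((str(i), v) for i, v in enumerate(obj))
    else:
        return {prefix: obj}
    flat = {}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat

# Hypothetical mapping from flattened keys to database table columns.
FIELD_MAP = {"user.name": "user_name", "request.src.ip": "src_ip"}

def map_fields(flat, mapping):
    """Keep only mapped keys and rename them to their column names."""
    return {column: flat[key] for key, column in mapping.items() if key in flat}

event = json.loads(
    '{"user": {"name": "alice"},'
    ' "request": {"src": {"ip": "10.0.0.5"}, "tags": ["db", "read"]}}'
)
flat = flatten(event)
row = map_fields(flat, FIELD_MAP)
print(flat)
print(row)
```

Flattening first keeps the field mapping a simple key lookup, so the same mapping table can serve any destination that expects flat rows.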
Specifically, during log auditing, the log data is classified as real-time or non-real-time data and stored accordingly: real-time log data is sent to the stream processing platform for storage, and non-real-time log data is sent to the data center for storage.
The log data is then audited and analyzed by retrieving real-time data from the stream processing platform or non-real-time data from the data center; the analysis result is output, and corresponding alarm information and/or monitoring and debugging information is generated and output according to the analysis result.
Log auditing and security identification alarms use a computing engine based on the open-source component Apache Flink. Flink is a stream computing engine implemented in Java. It can process both stream data and batch data, and offers functionality comparable to that of the general-purpose computing engine Spark and of Spark Streaming. Unlike Spark, however, Flink is built around the single concept of a stream, and a batch is treated as a special stream.
In a further preferred embodiment, as shown in FIG. 6, Flink mainly comprises three components: JobClient, JobManager and TaskManager.
The user submits a Flink program to the JobClient; the JobClient sends the program to the JobManager, which acknowledges receipt to the JobClient. The JobManager then schedules the received job: it first allocates the resources the job needs, mainly execution slots on the TaskManagers; after resource allocation, the JobManager submits the individual tasks to the corresponding TaskManagers. On receiving a task, a TaskManager starts a thread to execute it. State changes, such as a computation starting or completing, are sent back to the JobManager, so that task status is reported regularly. Once the job has finished, the JobManager returns the result to the JobClient.
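The submission flow described above can be mimicked with a toy model. The classes below are a plain-Python simulation written for illustration; they are not the Flink API, they run tasks sequentially rather than in threads, and every name in them is an assumption.

```python
class TaskManager:
    """Toy worker with a fixed number of execution slots."""

    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots

    def run(self, task, report):
        """Execute one task and report its state changes."""
        report(task.__name__, "RUNNING", self.name)
        result = task()
        report(task.__name__, "FINISHED", self.name)
        self.free_slots += 1  # release the slot once the task is done
        return result


class JobManager:
    """Toy coordinator: allocates slots, dispatches tasks, records status."""

    def __init__(self, task_managers):
        self.task_managers = task_managers
        self.status_log = []

    def _report(self, task_name, state, worker):
        self.status_log.append((task_name, state, worker))

    def submit(self, tasks):
        # Phase 1: reserve one slot per task before anything runs.
        assignments = []
        for task in tasks:
            worker = next(
                (tm for tm in self.task_managers if tm.free_slots > 0), None
            )
            if worker is None:
                raise RuntimeError("not enough free slots for this job")
            worker.free_slots -= 1
            assignments.append((task, worker))
        # Phase 2: dispatch each task to the TaskManager holding its slot.
        return [worker.run(task, self._report) for task, worker in assignments]


def count_logins():
    return 3

def count_queries():
    return 5

job_manager = JobManager([TaskManager("tm-1", 1), TaskManager("tm-2", 1)])
results = job_manager.submit([count_logins, count_queries])
print(results)
print(job_manager.status_log)
```

Reserving all slots before dispatching mirrors the order in the description (resource allocation first, then task submission) and makes a job fail early when the cluster is too small.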
The invention implements log collection and storage on Flume, and implements data security audit, alarm monitoring management and handling, and security risk identification access by modeling on the Flink engine. Security audit is carried out over the whole data life cycle of collection, transmission, storage, processing, exchange and destruction, ensuring the normal circulation and use of data, building a data security audit system, and providing relatively comprehensive security management services for a big data resource platform. Meanwhile, the system can promptly detect operations involving confidential data and accurately identify the operator responsible; it inspects, detects and gives early warning of abnormal and non-compliant behaviors in the service support system, and provides relevant evidence that can be used for accountability.
As a preferred embodiment, in the data security audit method, the log collection system 1 and the stream processing platform (Kafka) are continuously managed and monitored during log data collection. The collection status and collection volume of the log data are monitored, and log collection and storage are monitored in real time, so that a user can see the collection status and collection volume of the log data in real time.
As a preferred embodiment, the data security audit method analyzes real-time data online and non-real-time data offline.
As shown in FIG. 7, the online analysis comprises:
step A1, classifying the real-time data and storing it in a cluster of the stream processing platform, wherein the cluster comprises a global event and at least one internal event;
step A2, performing real-time correlation analysis on the global event and the at least one internal event;
step A3, determining whether the event is an internal event:
if yes, going to step A4;
if not, generating an internal event and storing it among the internal events of the cluster;
step A4, when debugging and monitoring are judged to be needed, outputting a first analysis result;
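Steps A1 to A4 leave the correlation logic open. The sketch below shows one possible reading, in which an incoming event is correlated with the global event by resource and then checked against the internal events already known. The event fields, the `(user, action)` key and the counting are assumptions made for illustration, not details given in the patent.

```python
def online_analyze(events, global_event, internal_events, monitor=True):
    """Correlate incoming events with a global event and known internal events.

    internal_events is a dict keyed by (user, action) and is updated in place.
    """
    results = []
    for event in events:
        # A2: correlate with the global event (assumed: same resource).
        if event["resource"] != global_event["resource"]:
            continue
        key = (event["user"], event["action"])
        if key in internal_events:      # A3: already an internal event
            internal_events[key] += 1
            if monitor:                 # A4: output a first analysis result
                results.append({"key": key, "count": internal_events[key]})
        else:                           # A3: unknown, so generate and store it
            internal_events[key] = 1
    return results

global_event = {"resource": "citizen_db"}
stream = [
    {"user": "alice", "action": "export", "resource": "citizen_db"},
    {"user": "alice", "action": "export", "resource": "citizen_db"},
    {"user": "bob", "action": "read", "resource": "other_db"},
    {"user": "alice", "action": "export", "resource": "citizen_db"},
]
internal = {}
first_results = online_analyze(stream, global_event, internal)
print(first_results)
```

Here the first matching event only registers a new internal event, and repeated occurrences produce results, which is one way to read the branch in step A3.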
As shown in FIG. 8, the offline analysis comprises:
A plurality of offline rules are stored in advance and issued through the stream processing platform. On receiving the offline rules, the offline analysis retrieves log data from the data center according to the rules and performs batch analysis on the non-real-time raw log data, using parameters from a configuration list database (DB) and a baseline database (DB). A second analysis result is output and published to Kafka, from which it is consumed by subscription and sent to a document database (ES). Offline analysis can thus analyze past logs in batches and generate different alarm information according to different parameter configurations.
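The offline path can be sketched as a batch job that applies stored rules to historical logs and compares per-user counts with a baseline. The rule format, the baseline dictionary and the field names below are hypothetical stand-ins for the configuration list DB and baseline DB mentioned above.

```python
from collections import Counter

def offline_analyze(logs, rules, baseline):
    """Apply each offline rule to stored logs; report counts above the limit."""
    results = []
    for rule in rules:
        counts = Counter(
            rec["user"] for rec in logs if rec["action"] == rule["action"]
        )
        # The baseline overrides the rule's default limit when present.
        limit = baseline.get(rule["action"], rule["default_limit"])
        for user, count in sorted(counts.items()):
            if count > limit:
                results.append(
                    {"rule": rule["name"], "user": user,
                     "count": count, "limit": limit}
                )
    return results

stored_logs = [
    {"user": "alice", "action": "export"},
    {"user": "alice", "action": "export"},
    {"user": "alice", "action": "export"},
    {"user": "bob", "action": "export"},
    {"user": "bob", "action": "delete"},
]
rules = [
    {"name": "bulk_export", "action": "export", "default_limit": 5},
    {"name": "any_delete", "action": "delete", "default_limit": 0},
]
baseline = {"export": 2}
second_results = offline_analyze(stored_logs, rules, baseline)
print(second_results)
```

Changing only `baseline` or `rules` changes which records raise findings, which matches the statement that different parameter configurations yield different alarm information.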
As a preferred embodiment, in the data security audit method, as shown in FIG. 2, at least one parsing node parses the log data in step S2, and the parsing steps are as follows:
step 21: initializing the log data;
step 22: extracting valid log information from the log data;
step 23: processing the log information to obtain log data of at least one data type, and sending the log data to the at least one data destination.
Specifically, in this embodiment, the raw log data is formatted and valid log information is extracted from the text, which reduces the difficulty of parsing. The extracted log information is parsed by flattening multi-level JSON, by regular-expression parsing of irregular text, or by mapping to database table fields, and the parsed log is then dynamically completed; the completed content includes the region and country derived from the IP address.
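Steps 21 to 23 can be illustrated for the regular-expression case, including completion of region and country from the IP address. The log line layout, the `LINE_RE` pattern and the `GEO_TABLE` lookup are assumptions: a real deployment would use its own log format and a proper IP geolocation database. Dispatch to the data destinations is omitted.

```python
import re

# Assumed line layout: "<date> <time> <ip> <user> <action>", possibly with noise.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<user>\w+) (?P<action>\w+)"
)

# Hypothetical prefix-to-location table standing in for a real GeoIP lookup.
GEO_TABLE = {"203.0.113.": ("Shanghai", "CN"), "198.51.100.": ("Berlin", "DE")}

def parse_line(raw):
    line = raw.strip()                  # step 21: initialize / normalize
    match = LINE_RE.search(line)        # step 22: extract the valid fields
    if match is None:
        return None                     # no valid log information found
    record = match.groupdict()
    region, country = next(             # step 23: complete from the IP address
        (loc for prefix, loc in GEO_TABLE.items()
         if record["ip"].startswith(prefix)),
        ("unknown", "unknown"),
    )
    record.update(region=region, country=country)
    return record

parsed = parse_line("  [audit] 2020-07-22 10:15:00 203.0.113.7 alice export\n")
rejected = parse_line("not a log line")
print(parsed)
```

Using `search` rather than `match` lets the pattern skip leading noise such as the `[audit]` tag, which is the point of extracting only the valid information before further processing.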
As a preferred embodiment, in the data security audit method, the collection frequency, the collection time period and the starting and stopping of tasks are configured for the log collection system 1. These parameters control how the log collection system 1 collects log data from the web server: the start and stop times, the time period and the start frequency of a task control when a log collection task is started and stopped, while the collection frequency, collection time period and collection volume control how collection proceeds.
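A configuration of this kind can be modelled as a small task object that decides whether a collection run is due. The class, its field names and the sample values are hypothetical; the patent does not specify a configuration format.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional

@dataclass
class CollectionTask:
    name: str
    interval: timedelta              # collection frequency
    period_start: time               # start of the allowed collection period
    period_end: time                 # end of the allowed collection period
    enabled: bool = True             # task switched on or off
    last_run: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        """Return True if the task should collect logs at time `now`."""
        if not self.enabled:
            return False
        if not (self.period_start <= now.time() <= self.period_end):
            return False
        return self.last_run is None or now - self.last_run >= self.interval

    def mark_run(self, now: datetime) -> None:
        self.last_run = now

task = CollectionTask("web-logs", timedelta(minutes=5), time(8, 0), time(20, 0))
start = datetime(2020, 7, 22, 9, 0)
checks = [task.is_due(start)]                           # first run in period
task.mark_run(start)
checks.append(task.is_due(start + timedelta(minutes=2)))  # too soon
checks.append(task.is_due(start + timedelta(minutes=5)))  # interval elapsed
checks.append(task.is_due(datetime(2020, 7, 22, 23, 0)))  # outside the period
task.enabled = False
checks.append(task.is_due(start + timedelta(minutes=10)))  # switched off
print(checks)
```

A scheduler would poll `is_due` for each configured task and trigger the log collection system when it returns True.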
The invention also provides a data security audit system based on big data computing technology, which applies the above data security audit method and, as shown in FIG. 9, comprises:
a task scheduling module 2, connected with the log collection system 1 and configured to collect log data from the server and send the collected log data to the stream processing platform;
a parsing module 3, connected with the stream processing platform and configured to receive one or more items of log data from the stream processing platform, parse the log data, and output the parsed log data to at least one data destination;
an audit analysis module 5, connected with the parsing module 3 and configured to classify the parsed log data and determine whether it is real-time data or non-real-time data:
if the data is real-time data, sending it to the stream processing platform for storage;
if the data is non-real-time data, sending it to a data center for storage;
the audit analysis module 5 analyzes the log data to obtain an analysis result and outputs the analysis result;
and an alarm module 4, connected with the audit analysis module 5 and configured to generate and output corresponding alarm information according to the analysis result.
Specifically, in this embodiment, the data security audit system includes a task scheduling module 2, a parsing module 3, an audit analysis module 5, and an alarm module 4;
the task scheduling module 2 configures the collection frequency and collection time period of the Flume-based log collection system 1 and starts and stops tasks, thereby controlling the log collection system 1 to collect log data from the server and send it to the stream processing platform; in this way Flume is functionally extended and the concept of task scheduling is introduced;
the parsing module 3 subscribes to one or more items of log data in the stream processing platform, parses the log data, and outputs the parsed log data to multiple destinations, enriching the Flume log collection sources and output destinations.
The audit analysis module 5 stores the parsed log data and determines whether it is real-time data or non-real-time data:
if the data is real-time data, sending it to the stream processing platform for storage;
if the data is non-real-time data, sending it to a data center for storage;
the audit analysis module 5 retrieves the stored log data, performs audit analysis, and outputs an analysis result;
and the alarm module 4 generates and outputs corresponding alarm information according to the analysis result.
As a preferred embodiment, the data security audit system further includes a monitoring module 6, connected to the log collection system 1 and the stream processing platform respectively and configured to continuously manage and monitor the stream processing platform and the collection status and collection volume of the log collection system 1 during log data collection.
In a preferred embodiment, the audit analysis module 5 of the data security audit system comprises:
an online analysis engine 51, connected with the stream processing platform, based on the Flink framework, and configured to perform real-time correlation analysis on the global event and the plurality of internal events of the stream processing platform and to output a first analysis result;
and an offline analysis engine 52, connected with the data center, based on the Flink framework, and configured to retrieve raw log data from the data center according to the issued offline rules, perform batch analysis on the non-real-time raw log data, and output a second analysis result.
Specifically, the online analysis engine 51 and the offline analysis engine 52 are both based on the Flink framework, which includes predefined window assigners such as tumbling windows, sliding windows, session windows and global windows. The real-time online analysis engine can create windows and perform windowed analysis on the real-time log stream, generating alarm signals and monitoring and debugging information in real time and sending them to an operator, so that the operator can respond promptly and losses are reduced.
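The idea behind tumbling, sliding and session windows can be shown with plain functions over integer timestamps. These are simplified stand-ins written for illustration, not Flink's window assigner classes; they assume non-negative timestamps, and the alarm threshold is hypothetical.

```python
from collections import Counter

def tumbling_window(ts, size):
    """The single fixed, non-overlapping window [start, end) containing ts."""
    start = ts - ts % size
    return (start, start + size)

def sliding_windows(ts, size, slide):
    """All overlapping windows [start, start + size) that contain ts."""
    windows = []
    start = ts - ts % slide
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)

def session_windows(timestamps, gap):
    """Group timestamps so that events at most `gap` apart share a session."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions

def windows_over_threshold(timestamps, size, threshold):
    """Tumbling windows whose event count exceeds the alarm threshold."""
    counts = Counter(tumbling_window(ts, size) for ts in timestamps)
    return sorted(window for window, n in counts.items() if n > threshold)

event_times = [1, 2, 3, 10, 11, 30]
print(tumbling_window(7, 5))
print(sliding_windows(7, 10, 5))
print(session_windows(event_times, 5))
print(windows_over_threshold(event_times, 5, 2))
```

A global window would simply place every event in one window; it is omitted because it needs no assignment logic.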
In a preferred embodiment, the data security audit system, wherein the alarm module 4 comprises:
the first alarm unit 41 is connected to the online analysis engine 51, and configured to generate and output corresponding first alarm information according to the first analysis result;
and the second alarm unit 42 is connected to the offline analysis engine 52, and is configured to generate and output corresponding second alarm information according to the second analysis result.
As a preferred implementation manner, the data security audit system further includes an audit report module, which is respectively connected to the audit analysis module 5 and the alarm module, and is configured to generate a corresponding audit report according to the analysis result and the alarm information.
As a preferred embodiment, in the data security audit system, the parsing module 3 includes a plurality of parsing nodes, each parsing node is correspondingly provided with a Parser host as a Parser, and the Parser host is configured to initialize log data, extract effective log information from the log data, obtain log data of at least one data type according to the log information, and respectively send the log data to at least one data destination.
Specifically, the log information is processed to obtain data of different data types, and the data is sent to data destinations, which include various output sinks such as Elasticsearch, HBase/HDFS, Druid, and CSV.
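The parser behaviour described above (initialize the log data, extract the valid log information, derive typed data, and send it to data destinations) might look roughly like the following Python sketch. The log line format, the `LOG_PATTERN` expression, the two data types, and the routing table `ROUTES` are assumptions chosen for illustration; the destination names are plain labels and no real client library is called.

```python
import re
from collections import defaultdict

# Hypothetical line format: "<time> <LEVEL> user=<user> action=<action>"
LOG_PATTERN = re.compile(
    r"^(?P<time>\S+) (?P<level>[A-Z]+) user=(?P<user>\S+) action=(?P<action>\S+)$"
)

# Hypothetical mapping from data type to destination label.
ROUTES = {"search": "elasticsearch", "archive": "hdfs"}


def parse_line(raw):
    """Initialize a raw line and extract its fields; return None if it is not valid."""
    line = raw.strip()  # initialization: drop surrounding whitespace and newline
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def route(records):
    """Assign each parsed record a data type and group records by destination."""
    destinations = defaultdict(list)
    for record in records:
        # Assumed rule: warnings and errors are indexed for search, the rest archived.
        data_type = "search" if record["level"] in ("WARN", "ERROR") else "archive"
        destinations[ROUTES[data_type]].append(record)
    return dict(destinations)
```

A caller would typically filter out the `None` results of `parse_line` before passing the records to `route`.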
The invention has the beneficial effects that:
the log collection and storage capability is built on the open-source log collection framework (Flume); the log collection system is functionally iterated, the concepts of task scheduling and task monitoring are introduced, and the collection sources and output destinations of the log collection system are enriched; functional modules such as data security audit capability, alarm monitoring management and processing capability, and security risk identification capability access are developed through modeling on the open-source stream processing engine (Flink), thereby constructing the data security audit system; security audit is carried out over the whole data life cycle of acquisition, transmission, storage, processing, exchange and destruction, which provides relatively comprehensive security management services for a big data resource platform and ensures the normal circulation and use of data; meanwhile, the system can continuously check, discover and give early warning of various abnormal and illegal behaviors in the service support system, promptly discover confidential operation events, accurately and quickly locate the operator responsible for such an event, and preserve the relevant evidence for use in pursuing accountability.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (10)

1. A data security audit method based on big data computing technology is characterized by comprising the following steps:
step S1, collecting the log data of the server, and sending the collected log data to a stream processing platform;
step S2, receiving one or more log data in the stream processing platform, analyzing the log data, outputting the analyzed log data and sending the log data to at least one data target place;
step S3, classifying the analyzed log data, and determining whether the log data is real-time data or non-real-time data:
if the data is real-time data, sending the data to the stream processing platform for storage;
if the data is non-real-time data, sending the data to a data center for storage;
step S4, according to the classification in the step S3, analyzing the log data respectively to obtain an analysis result, and outputting the analysis result;
and step S5, generating and outputting corresponding alarm information according to the analysis result.
2. The big data computing technology-based data security audit method according to claim 1, wherein in the step S1, during the log data collection process, the collection status and the collection amount of the stream processing platform and the log data are continuously managed and monitored.
3. The big data computing technology-based data security audit method according to claim 1, wherein the real-time data is analyzed and processed online, and the non-real-time data is analyzed and processed offline;
the online analysis step comprises:
step A1, classifying the real-time data and storing the real-time data in a cluster of the stream processing platform, wherein the cluster comprises a global event and at least one internal event;
step A2, performing real-time correlation analysis on the global event and at least one internal event;
step A3, determining whether the event is an internal event:
if yes, go to step A4;
if not, generating an internal event and storing it among the internal events of the cluster;
step A4, when it is judged that debugging and monitoring are needed, outputting a first analysis result;
the offline analyzing step comprises:
step B1, storing offline rules in advance, and issuing the offline rules to the stream processing platform;
step B2, receiving the offline rule, and calling the log data of the data center according to the offline rule;
step B3, performing batch analysis on the non-real-time log data, outputting a second analysis result and issuing the second analysis result to the stream processing platform;
and step B4, receiving the second analysis result and sending the second analysis result to the document database.
4. The big data computing technology-based data security audit method according to claim 1, wherein in step S2, at least one parsing node parses the log data respectively, and the parsing steps are as follows:
step 21: initializing the log data;
step 22: extracting effective log information from the log data;
step 23: and processing the log information to obtain the log data of at least one data type, and respectively sending the log data to at least one data target place.
5. The big data computing technology-based data security audit method according to claim 1, wherein in step S1, the log collection system is controlled to collect the log data by applying a functional configuration to the log collection system, where the functional configuration includes the collection frequency, the collection time period, and the enabling and disabling of tasks.
6. A big data computing technology-based data security auditing system, which applies the big data computing technology-based data security auditing method of any one of claims 1 to 5, and comprises:
the task scheduling module is connected with the log acquisition system and used for acquiring log data of the server and sending the acquired log data to a stream processing platform;
the analysis module is connected with the stream processing platform and used for receiving one or more log data in the stream processing platform, analyzing the log data, outputting the analyzed log data and sending the analyzed log data to at least one data target place;
the audit analysis module is connected with the analysis module and used for classifying the analyzed log data and judging whether the log data is real-time data or non-real-time data:
if the data is real-time data, sending the data to the stream processing platform for storage;
if the data is non-real-time data, sending the data to a data center for storage;
the audit analysis module analyzes and processes the log data to obtain an analysis result and outputs the analysis result;
and the alarm module is connected with the audit analysis module and used for generating and outputting corresponding alarm information according to the analysis result.
7. The big data computing technology-based data security audit system according to claim 6, further comprising a monitoring module, respectively connected to the log collection system and the stream processing platform, for continuously managing and monitoring the collection status and collection amount of the stream processing platform and the log collection system during the log data collection process.
8. The big data computing technology-based data security audit system according to claim 6, wherein the audit analysis module includes:
the online analysis engine is connected with the stream processing platform and is used for performing real-time correlation analysis on the global events and the plurality of internal events of the stream processing platform and outputting a first analysis result;
and the offline analysis engine is connected with the data center and used for calling the log data in the data center according to the issued offline rule, carrying out batch analysis on the non-real-time log data and outputting a second analysis result.
9. The big data computing technology-based data security audit system according to claim 8, wherein the alarm module comprises:
the first alarm unit is connected with the online analysis engine and used for generating and outputting corresponding first alarm information according to the first analysis result;
and the second alarm unit is connected with the offline analysis engine and used for generating and outputting corresponding second alarm information according to the second analysis result.
10. The big data computing technology-based data security audit system according to claim 6, wherein the parsing module includes a plurality of parsing nodes, and each parsing node is correspondingly provided with a parser for initializing the log data, extracting effective log information from the log data, obtaining the log data of at least one data type according to the log information, and sending the log data to at least one data destination respectively.
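The classification and offline analysis recited in the claims above can be sketched as follows in Python. The claims do not state how real-time data is distinguished from non-real-time data, so the boolean `real_time` field is purely an assumption, as are the function names and the representation of an offline rule as a predicate function.

```python
def store_by_class(records):
    """Send real-time records to the stream platform store, the rest to the data center."""
    stores = {"stream_platform": [], "data_center": []}
    for record in records:
        # Assumption: each record carries a boolean "real_time" flag.
        target = "stream_platform" if record.get("real_time") else "data_center"
        stores[target].append(record)
    return stores


def run_offline_rule(data_center_records, rule):
    """Batch-apply an offline rule (a predicate) to the stored non-real-time records."""
    return [record for record in data_center_records if rule(record)]


if __name__ == "__main__":
    logs = [
        {"user": "alice", "action": "export", "real_time": True},
        {"user": "bob", "action": "delete", "real_time": False},
        {"user": "carol", "action": "login", "real_time": False},
    ]
    stores = store_by_class(logs)
    # Hypothetical offline rule: flag every delete operation.
    print(run_offline_rule(stores["data_center"], lambda r: r["action"] == "delete"))
```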
CN202010713842.7A 2020-07-22 2020-07-22 Data security audit method and system based on big data computing technology Active CN111858251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010713842.7A CN111858251B (en) 2020-07-22 2020-07-22 Data security audit method and system based on big data computing technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010713842.7A CN111858251B (en) 2020-07-22 2020-07-22 Data security audit method and system based on big data computing technology

Publications (2)

Publication Number Publication Date
CN111858251A true CN111858251A (en) 2020-10-30
CN111858251B CN111858251B (en) 2024-04-19

Family

ID=72949658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010713842.7A Active CN111858251B (en) 2020-07-22 2020-07-22 Data security audit method and system based on big data computing technology

Country Status (1)

Country Link
CN (1) CN111858251B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738189A (en) * 2020-12-24 2021-04-30 航天信息股份有限公司 Cluster resource management method and device, storage medium and electronic equipment
CN114048213A (en) * 2021-11-16 2022-02-15 盐城金堤科技有限公司 Data auditing method and device, computer storage medium and electronic equipment
CN114205215A (en) * 2021-12-06 2022-03-18 湖北天融信网络安全技术有限公司 Data pre-analysis method and device
CN115460072A (en) * 2022-08-25 2022-12-09 浪潮云信息技术股份公司 Log processing system integrating log collection, analysis, storage and service
CN116541202A (en) * 2023-06-14 2023-08-04 深圳壹师城科技有限公司 Scientific and technological risk management system and risk early warning device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070283194A1 (en) * 2005-11-12 2007-12-06 Phillip Villella Log collection, structuring and processing
CN107908690A (en) * 2017-11-01 2018-04-13 南京欣网互联网络科技有限公司 A kind of data processing method based on big data OA operation analysis
CN109271412A (en) * 2018-09-28 2019-01-25 中国-东盟信息港股份有限公司 The real-time streaming data processing method and system of smart city
CN113496032A (en) * 2020-04-03 2021-10-12 中国信息安全测评中心 Big data operation abnormity monitoring system based on distributed computation and rule engine

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738189A (en) * 2020-12-24 2021-04-30 航天信息股份有限公司 Cluster resource management method and device, storage medium and electronic equipment
CN114048213A (en) * 2021-11-16 2022-02-15 盐城金堤科技有限公司 Data auditing method and device, computer storage medium and electronic equipment
CN114205215A (en) * 2021-12-06 2022-03-18 湖北天融信网络安全技术有限公司 Data pre-analysis method and device
CN115460072A (en) * 2022-08-25 2022-12-09 浪潮云信息技术股份公司 Log processing system integrating log collection, analysis, storage and service
CN116541202A (en) * 2023-06-14 2023-08-04 深圳壹师城科技有限公司 Scientific and technological risk management system and risk early warning device
CN116541202B (en) * 2023-06-14 2023-10-03 深圳壹师城科技有限公司 Scientific and technological risk management system and risk early warning device

Also Published As

Publication number Publication date
CN111858251B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
CN111858251B (en) Data security audit method and system based on big data computing technology
US20180129579A1 (en) Systems and Methods with a Realtime Log Analysis Framework
CN108039959B (en) Data situation perception method, system and related device
CN108521339B (en) Feedback type node fault processing method and system based on cluster log
CN111817891A (en) Network fault processing method and device, storage medium and electronic equipment
CN115809183A (en) Method for discovering and disposing information-creating terminal fault based on knowledge graph
CN105323111A (en) Operation and maintenance automation system and method
CN114267178B (en) Intelligent operation maintenance method and device for station
CN110659307A (en) Event stream correlation analysis method and system
CN111259073A (en) Intelligent business system running state studying and judging system based on logs, flow and business access
CN105184886A (en) Cloud data center intelligence inspection system and cloud data center intelligence inspection method
CN113656245B (en) Data inspection method and device, storage medium and processor
CN109005162B (en) Industrial control system security audit method and device
US20090164407A1 (en) Monitoring a Service Oriented Architecture
CN112291266B (en) Data processing method, device, server and storage medium
CN112148561A (en) Service system running state prediction method and device and server
CN107463490B (en) Cluster log centralized collection method applied to platform development
CN117422434A (en) Wisdom fortune dimension dispatch platform
CN110609761B (en) Method and device for determining fault source, storage medium and electronic equipment
CN117792864A (en) Alarm processing method and device, storage medium and electronic device
CN115185768A (en) Fault recognition method and system of system, electronic equipment and storage medium
CN115168297A (en) Bypassing log auditing method and device
CN112579391A (en) Distributed database automatic operation and maintenance method and system based on artificial intelligence
CN116582462B (en) Converged service monitoring method and device
KR102426889B1 (en) Apparatus, method and program for analyzing and processing data by log type for large-capacity event log

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant