CN107818150B - Log auditing method and device - Google Patents


Info

Publication number
CN107818150B
Authority
CN
China
Prior art keywords
log
data platform
big data
auditing
standardized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710994900.6A
Other languages
Chinese (zh)
Other versions
CN107818150A (en)
Inventor
何庆
李冠道
严敏
周乐坤
高峰
张建军
苏砫
罗波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ultrapower Information Safety Technology Co ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
Beijing Ultrapower Information Safety Technology Co ltd
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ultrapower Information Safety Technology Co ltd and China Mobile Group Guangdong Co Ltd
Priority to CN201710994900.6A
Publication of CN107818150A
Application granted
Publication of CN107818150B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/034 Test or assess a computer or a system

Abstract

The invention discloses a log auditing method and device. Original logs collected from the components of a big data platform are parsed, their fields are mapped to standardized fields, and the operation type and operation detail of each log are classified, so that logs of different sources and formats from the various components of the big data platform are standardized. Then, according to the audit requirements of big data security management and control, corresponding audit rules and analysis strategies are applied to automatically audit and analyze the standardized logs of each component, so as to determine whether the management and data access operations of the big data platform and its components meet the security technical specifications and management requirements. Compared with manual auditing, the log auditing method provided by the invention standardizes and centrally audits the logs of the big data platform components, so that the operation of the components can be audited comprehensively and in a timely manner, potential security hazards can be discovered quickly, and security problems can be located quickly.

Description

Log auditing method and device
Technical Field
The invention relates to the technical field of information security, in particular to a log auditing method and device.
Background
With the continued advance of information technology and the rapid spread of the internet, ever more data needs to be processed. Big data platforms have clear advantages in large-scale data storage and high-performance computing, and can provide efficient big data storage, computing, operation and maintenance, and monitoring services. However, the security protection measures of current big data platforms lack standards and requirements, cannot keep pace with the platforms' growing service demands, and do not match the high value of services built on data concentration and data sharing. Therefore, investigating the security risks of the big data platform itself and of the data in it, together with countermeasures, and extending the scope and application of big data security control, is a focus of current research.
The core components of a big data platform, such as HDFS (Hadoop Distributed File System), Hive, HBase, and YARN and MapReduce (MR), store a large amount of information in their operation log files. Operation logs fall into two types: component maintenance logs and data access logs. The former record platform management operations such as adding or removing nodes, starting or stopping nodes, and starting or stopping component services; the latter record user activity information and user operation instructions. The operation logs of a big data platform can therefore be used to locate the cause of a problem and to assign responsibility in a security incident. Correspondingly, centralized log auditing of a big data platform covers the recording, storage, collection, standardized processing, auditing, and alerting of the operation logs of all platform components. Putting audit strategies into practice in a big data environment is an important part of big data platform security control.
Currently, for operation log audit of a big data platform, security management personnel of an enterprise usually check original log information from a node of a service component regularly, or check and audit partial logs through some big data management platforms in a manual mode to determine whether management and data access operations of the platforms and the components meet safety technical specifications and management requirements.
However, a big data platform is large in scale and has numerous components and nodes, so its operation logs are scattered and voluminous. Auditing them manually is time-consuming and labor-intensive, with low efficiency and accuracy. In addition, manual auditing places high demands on the expertise of the auditors, who must understand not only the security business but also the big data platform, including its infrastructure environment and the management and operation mechanisms of each of its components and services.
Disclosure of Invention
The invention provides a log auditing method and device for auditing the operation of big data platform components comprehensively and in a timely manner, so as to quickly discover potential security hazards and locate security problems.
According to a first aspect of the embodiments of the present invention, there is provided a log auditing method, including:
parsing an original log collected from a big data platform component to obtain an effective log field in the original log and an attribute value corresponding to the effective log field;
classifying, according to a preset field mapping rule, the attribute value corresponding to the effective log field in the original log into the corresponding standardized field to obtain an initial log;
classifying the operation type and operation detail of the initial log according to keywords in an effective log field in the initial log to obtain a standardized log;
and auditing the standardized log of the big data platform component according to a preset audit rule and an analysis strategy.
Optionally, auditing the standardized log of the big data platform component according to a preset auditing rule and an analysis policy, including:
selecting a corresponding log content audit point and an audit rule according to the operation type and the operation detail in the standardized log of the big data platform component;
and auditing the log content of the standardized log according to the log content audit point and the audit rule.
Optionally, before parsing the raw log collected from the big data platform component, the method further includes:
enabling the log saving option of the big data platform component, and setting the saving operation options of the big data platform component log, wherein the saving operation options comprise the name of the log file, the storage path of the log file, the size of the log file, and the number of log files.
Optionally, the method for selecting the storage path of the log file includes:
selecting, among the node types of the big data platform component, the node type with the fewest nodes as the storage location of the log file.
Optionally, the collection manner of the original log of the big data platform component includes:
acquiring an original log sent by the big data platform component through a syslog protocol;
or,
and collecting original logs saved by the big data platform component from the big data platform component in an FTP/SFTP mode.
Optionally, after auditing the standardized log of the big data platform component, the method further includes:
and when the log content in the standardized log does not accord with the preset auditing rule, generating corresponding safety early warning information.
According to a second aspect of the embodiments of the present invention, there is also provided a log auditing apparatus, including:
an original log analysis module: configured to parse an original log collected from a big data platform component to obtain an effective log field in the original log and an attribute value corresponding to the effective log field;
a standardized field classification module: the attribute value corresponding to the effective log field in the original log is classified into the corresponding standardized field according to a preset field mapping rule to obtain an initial log;
an operation type division module: the method comprises the steps of dividing the operation type and operation detail of the initial log according to keywords in an effective log field in the initial log to obtain a standardized log;
a standardized log audit module: and the system is used for auditing the standardized logs of the big data platform assembly according to preset auditing rules and analysis strategies.
Optionally, the standardized log audit module comprises:
an audit strategy selection submodule: the system is used for selecting a corresponding log content audit point and an audit rule according to the operation type and the operation detail in the standardized log of the big data platform component;
a log content auditing submodule: and auditing the log content of the standardized log according to the log content audit point and the audit rule.
Optionally, the apparatus further comprises:
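a log option setting module: configured to enable the log saving option of the big data platform component and to set the saving operation options of the big data platform component log, wherein the saving operation options comprise the name of the log file, the storage path of the log file, the size of the log file, and the number of log files.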
Optionally, the apparatus further comprises:
a log early warning module: configured to generate corresponding security early warning information when the log content in the standardized log does not conform to the preset audit rule.
According to the above technical solution, the log auditing method and device provided by the embodiments of the invention parse the original logs collected from each big data platform component, map their fields to standardized fields, and classify the operation type and operation detail of each log, so that logs of different sources and formats from the components of the big data platform are standardized. Then, according to the audit requirements of big data security management and control, corresponding audit rules and analysis strategies are applied to automatically audit and analyze the standardized logs of each component, so as to determine whether the management and data access operations of the big data platform and its components meet the security technical specifications and management requirements. Compared with manual auditing, the log auditing method provided by the embodiments of the invention standardizes and centrally audits the logs of the big data platform components, so that the operation of the components can be audited comprehensively and in a timely manner, and potential security hazards and security problems can be discovered and located quickly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without any inventive exercise.
Fig. 1 is a schematic diagram of a deployment architecture of a log auditing system for big data platform components according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a log auditing method according to an embodiment of the present invention;
Fig. 3 is a schematic view of a scenario in which log parsing and standardized field mapping are performed on an original log according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of another log auditing method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a log auditing apparatus according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
Because a big data platform is large in scale and has numerous components and nodes, auditing its operation logs manually cannot meet the requirements of automated and comprehensive auditing. In view of this, the present invention provides a log auditing method and apparatus for big data components. The basic implementation principle is as follows: first, the original logs collected from the big data components are standardized using regular expressions and standardized field mapping rules; then, the standardized logs of the big data platform components are audited and analyzed according to the audit requirements and audit strategies of big data security control, thereby achieving standardized processing and centralized auditing of the operation logs of the big data platform components.
Fig. 1 is a schematic diagram of a deployment architecture of a log auditing system for big data platform components according to an embodiment of the present invention. As shown in fig. 1, this embodiment takes a Hadoop-based big data platform as an example. In the Hadoop distributed file system HDFS, one HDFS cluster includes one NameNode (name node) and a plurality of DataNodes (data nodes). The NameNode is the master server that maintains the file system namespace, regulates client access to files, and provides operations on file directories. The DataNodes are responsible for managing the storage space on the storage nodes and for serving read and write requests from clients. Accordingly, in this embodiment the log collection server 10 is communicatively connected to the big data platform components (such as HDFS, HBase, Hive, Sqoop, and the like) and is configured to collect the log files stored by the big data platform components; in addition, the log analysis server 20 is communicatively connected to the log collection server 10 and is configured to analyze and audit the logs collected by the log collection server 10.
Based on the above basic implementation principle and system architecture, the following describes in detail the log auditing method provided by the embodiment of the present invention. Fig. 2 is a schematic flowchart of a log auditing method according to an embodiment of the present invention, which is applied to the log analysis server 20 and the log collection server 10 in fig. 1, and it should be noted that the log analysis server 20 and the log collection server 10 in this embodiment may also be integrated in the same device main body. As shown in fig. 2, the method specifically includes the following steps:
Step S110: parsing an original log collected from a big data platform component to obtain an effective log field in the original log and an attribute value corresponding to the effective log field.
Because the original logs from the different components of the big data platform have different formats, the operation logs are parsed into a standardized form before auditing, which improves the speed and effectiveness of the subsequent audit. Specifically, the original logs collected from the big data platform components can be parsed, and their attributes extracted, by regular expression matching. For example, a matching pattern for the log is constructed from the special characters of regular expression syntax, and the original log is then matched against that pattern. After a successful match, the values captured by the regular expression are extracted, and the attribute variables and their values are stored, thereby obtaining the effective log fields in the original log and the attribute values corresponding to them.
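The regular expression matching described above can be sketched as follows. This is a minimal illustration only: the sample line imitates the general style of an HDFS audit log entry, and the pattern, the field names, and the function `parse_raw_log` are assumptions made for the example rather than the patterns used in the patent's figures.

```python
import re

# Hypothetical sample in the style of an HDFS audit log line; the exact
# field layout is an assumption for illustration.
SAMPLE = (
    "2017-10-23 10:15:30,123 INFO FSNamesystem.audit: allowed=true\t"
    "ugi=alice (auth:SIMPLE)\tip=/10.0.0.8\tcmd=mkdirs\t"
    "src=/data/reports\tdst=null\tperm=alice:hadoop:rwxr-xr-x"
)

# Named groups play the role of the "attribute variables"; the captured
# text is the corresponding attribute value.
AUDIT_PATTERN = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<level>[A-Z]+)\s+\S+:\s+"
    r"allowed=(?P<allowed>\S+)\s+"
    r"ugi=(?P<ugi>\S+)(?:\s+\([^)]*\))?\s+"
    r"ip=/(?P<ip>\S+)\s+"
    r"cmd=(?P<cmd>\S+)\s+"
    r"src=(?P<src>\S+)\s+"
    r"dst=(?P<dst>\S+)"
)


def parse_raw_log(line):
    """Return a dict of effective log fields, or None if the line does not match."""
    match = AUDIT_PATTERN.match(line)
    return match.groupdict() if match else None


parsed = parse_raw_log(SAMPLE)
```

A line that does not match the pattern yields `None`, so unparseable entries can be set aside instead of being audited with missing fields. In practice each component would presumably need its own pattern.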
Fig. 3 is a schematic view of a scenario in which log parsing and standardized field mapping are performed on an original log according to an embodiment of the present invention. As shown in fig. 3, after the log samples in the graph are matched by the regular expression, the analysis result in the following table one can be obtained:
table one:
Figure GDA0003067077720000041
Figure GDA0003067077720000051
Step S120: classifying, according to a preset field mapping rule, the attribute value corresponding to the effective log field in the original log into the corresponding standardized field to obtain the initial log.
As shown in fig. 3, for example, if the attribute value extracted for an effective log field is create, it is classified under the standardized field action.
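A field mapping rule of this kind can be pictured as a lookup table from component-specific field names to standardized field names. The sketch below is only an assumption about how such a rule might look: the raw names (`ugi`, `ip`, `cmd`, `src`), the standardized names, and the function `to_initial_log` are all hypothetical.

```python
# Hypothetical mapping rule: raw field name -> standardized field name.
FIELD_MAPPING = {
    "time": "event_time",
    "ugi": "user",
    "ip": "source_ip",
    "cmd": "action",
    "src": "object",
}


def to_initial_log(parsed, mapping=FIELD_MAPPING):
    """Place each extracted attribute value under its standardized field."""
    return {std: parsed[raw] for raw, std in mapping.items() if raw in parsed}


initial = to_initial_log(
    {"time": "2017-10-23 10:15:30", "ugi": "alice", "cmd": "create", "src": "/data/a.txt"}
)
```

Raw fields that are missing from a given log are simply skipped, which lets one standardized schema cover components whose logs carry different subsets of fields.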
Step S130: and dividing the operation type and the operation detail of the initial log according to the keywords in the effective log field in the initial log to obtain a standardized log.
For example, for the operation command mkdir, the operation detail is classified as directory creation, the operation subclass as HDFS data access, and the operation type as data access; if the operation command is start name, the operation detail is component start, the operation subclass is component operation, and the operation type is platform maintenance.
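The keyword-based classification in this step can be sketched as an ordered rule list. The two rules below restate the mkdir and start examples from the text; the rule structure, the field names, and the function `classify_operation` are assumptions for illustration.

```python
# Each rule: (keyword, (operation type, operation subclass, operation detail)).
# The two entries mirror the examples in the text; a real rule set would be larger.
OPERATION_RULES = [
    ("mkdir", ("data access", "HDFS data access", "directory creation")),
    ("start", ("platform maintenance", "component operation", "component start")),
]


def classify_operation(initial_log):
    """Return a copy of the initial log extended with type, subclass and detail."""
    action = initial_log.get("action", "").lower()
    for keyword, (op_type, subclass, detail) in OPERATION_RULES:
        if keyword in action:
            return {
                **initial_log,
                "operation_type": op_type,
                "operation_subclass": subclass,
                "operation_detail": detail,
            }
    # No keyword matched: mark as unknown rather than guessing a class.
    return {
        **initial_log,
        "operation_type": "unknown",
        "operation_subclass": "unknown",
        "operation_detail": "unknown",
    }


standardized = classify_operation({"user": "alice", "action": "mkdirs"})
```

Substring matching is used here so that a variant such as `mkdirs` still falls under the mkdir rule; an exact-match table would be a stricter alternative.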
Through the above three steps, the operation logs of the big data platform components are standardized. The standardized results for the operation logs of the big data components are shown in table two below:
table two:
Figure GDA0003067077720000052
Figure GDA0003067077720000061
Figure GDA0003067077720000071
Step S140: auditing the standardized log of the big data platform component according to a preset audit rule and an analysis strategy.
The standardized operation logs of the big data platform components are audited according to corresponding audit rules and audit strategies. The audit strategies include dictionaries or baselines for access time, access location, important directories and files, key operation commands, operation frequency, total operation volume, and the like, and a corresponding audit rule is designed for each audit point in the audit strategies. Auditing the standardized operation logs against these audit points makes it possible to find and locate data access outside working hours, from unusual locations, or from outside dedicated secure zones, as well as violations such as abnormally frequent access to important directories and files or access to important data files beyond a threshold. In this way, security problems and hidden dangers in the management operations and data access operations of the big data platform are discovered and located.
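Two of the audit points named above, access time and important directories, can be sketched as simple checks against a baseline. The working hours, the directory list, the field names, and the function `audit_log` are assumed values chosen for the example, not baselines given in the patent.

```python
from datetime import datetime

# Assumed baselines for illustration only.
WORK_HOURS = range(9, 18)                            # 09:00 up to 17:59
IMPORTANT_DIRS = ("/data/finance", "/data/customer")


def audit_log(log):
    """Return the list of audit points that the standardized log violates."""
    findings = []
    hour = datetime.strptime(log["event_time"], "%Y-%m-%d %H:%M:%S").hour
    if hour not in WORK_HOURS:
        findings.append("access outside working hours")
    # str.startswith accepts a tuple of prefixes.
    if log["object"].startswith(IMPORTANT_DIRS):
        findings.append("access to important directory")
    return findings


night_access = audit_log(
    {"event_time": "2017-10-23 02:14:00", "object": "/data/finance/q3.csv"}
)
```

Frequency and total-volume audit points would additionally need state across many logs, for example a per-user counter over a time window, which is omitted here.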
Further, in order to implement fast audit of numerous operation logs of the large data platform component, the embodiment further provides another log audit method, which specifically includes:
Step S141: selecting a corresponding log content audit point and an audit rule according to the operation type and the operation detail in the standardized log of the big data platform component.
According to the operation type and operation detail in the standardized log, the audit points and audit rules applicable to the log are matched from a preset audit library. In this way, only the relevant parts of the log content need to be audited, which improves the speed of log auditing.
Step S142: and auditing the log content of the standardized log according to the log content audit point and the audit rule.
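Steps S141 and S142 can be pictured as a lookup into an audit library followed by running only the selected checks. In this sketch the library contents, the check functions, and the field names (`hour`, `object`, and so on) are hypothetical.

```python
def outside_work_hours(log):
    """Assumed rule: working hours are 09:00 to 17:59."""
    return not 9 <= log["hour"] < 18


def touches_important_dir(log):
    """Assumed rule: /data/finance is an important directory."""
    return log["object"].startswith("/data/finance")


# Hypothetical audit library keyed by (operation type, operation detail).
AUDIT_LIBRARY = {
    ("data access", "directory creation"): {
        "access time": outside_work_hours,
        "important directory": touches_important_dir,
    },
    ("platform maintenance", "component start"): {
        "access time": outside_work_hours,
    },
}


def audit_selected(log):
    """Run only the audit points registered for this log's type and detail."""
    key = (log["operation_type"], log["operation_detail"])
    points = AUDIT_LIBRARY.get(key, {})
    return [name for name, violated in points.items() if violated(log)]


result = audit_selected({
    "operation_type": "platform maintenance",
    "operation_detail": "component start",
    "hour": 3,
    "object": "/data/finance/x",
})
```

Because the component-start entry registers only the access time point, the important-directory check is never evaluated for that log, which is the selective auditing the text describes.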
As can be seen from the above, the log auditing method provided by this embodiment parses the original logs collected from each big data platform component, maps their fields to standardized fields, and classifies the operation type and operation detail of each log, so that logs of different sources and formats from the components of the big data platform are standardized. Then, according to the audit requirements of big data security management and control, corresponding audit rules and analysis strategies are applied to automatically audit and analyze the standardized logs of each component, so as to determine whether the management and data access operations of the big data platform and its components meet the security technical specifications and management requirements. Through standardized processing and centralized auditing of the big data platform component logs, the operation of the components can be audited comprehensively and in a timely manner, and potential security hazards and security problems can be discovered and located quickly.
To collect the logs of the big data platform components comprehensively and improve audit coverage, this embodiment configures the operation logs of the big data platform components before log collection. Fig. 4 is a schematic flowchart of another log auditing method according to an embodiment of the present invention. As shown in fig. 4, the method specifically includes the following steps:
Step S210: enabling the log saving option of the big data platform component, and setting the saving operation options of the big data platform component log, wherein the saving operation options comprise the name of the log file, the storage path of the log file, the size of the log file, and the number of log files.
Some logs of the big data platform components are disabled by default or are not recorded at all. For example, the HDFS audit log records all HDFS requests and is usually written to the NameNode log, but it is disabled by default. To achieve comprehensive collection, this embodiment configures the operation logs of the big data components and enables the log saving option of the big data platform components, so that the collection server can subsequently collect the logs.
In addition, so that the collection server can collect the logs saved by the big data platform components accurately and effectively, this embodiment also sets the saving operation options of the big data platform component log, where the saving operation options include the name of the log file, the storage path of the log file, the size of the log file, and the number of log files.
Once the saving operation options are set, the log of the big data component is written to the log file in append mode. When a log file reaches the configured file size, the next log file is generated, and when the number of generated log files reaches the configured number of most recent files to retain, the oldest log files are deleted in order of generation time. Considering the memory overhead of opening a file during log collection and the collection efficiency, a single log file should not be too large; for example, a single file size of 256 MB is recommended. Considering the overhead on the local disk that stores the logs, the number of retained log files should not be too large either; for example, 20 log files are recommended. Setting the saving operation options in this way balances resource overhead and collection efficiency.
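The size-based rollover and retention behavior described above can be illustrated with Python's standard `logging.handlers.RotatingFileHandler`. This is only an analogy: the big data components themselves are configured through log4j, and the function names and the tiny demo limits below are assumptions for the example. The defaults mirror the recommended 256 MB and 20 files.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_rotating_logger(path, max_bytes=256 * 1024 * 1024, backup_count=20):
    """Append to `path`; roll over at max_bytes and keep backup_count old files."""
    logger = logging.getLogger("audit-demo:" + path)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = RotatingFileHandler(
        path, mode="a", maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def demo_rotation():
    """Write enough entries to force several rollovers; return the files left."""
    with tempfile.TemporaryDirectory() as tmp:
        logger = make_rotating_logger(
            os.path.join(tmp, "audit.log"), max_bytes=200, backup_count=2
        )
        for i in range(100):
            logger.info("entry %d", i)
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        return sorted(os.listdir(tmp))


files = demo_rotation()
```

Note that this handler keeps `backup_count` rolled-over files in addition to the active file, so the total on disk is one more than the configured count; older files are discarded automatically.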
The following is exemplified by an HDFS oplog configuration:
The HDFS logs include the service output logs of NameNode, DataNode, ResourceManager, and the like, and are stored in ${HADOOP_HOME}/logs by default. The HDFS audit log records all HDFS requests and is usually written to the NameNode log, but it is disabled by default. Auditing is enabled, and the log file size, the number of retained files, and so on are configured, in the ${HADOOP_HOME}/etc/hadoop/log4j.properties file; the corresponding configuration options are shown in table three below:
Table three:
Figure GDA0003067077720000081
Figure GDA0003067077720000091
As another example, MapReduce auditing is absent from the log4j.properties configuration by default, and the entry log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=${…} needs to be configured.
Correspondingly, the log is output to the logs/mapred-audit.log file on the ResourceManager host, and the log format is shown in table four below:
table four:
Figure GDA0003067077720000092
Further, to facilitate log management and subsequent log collection, in this embodiment, when the storage path of the log file is set, the node type with the fewest nodes in the big data platform component is preferably selected as the storage location of the log file.
For example, the HDFS audit log could be stored on the DataNodes or on the NameNode. The DataNodes are numerous, so logs stored there would be widely scattered and hard to collect, whereas the NameNodes are few and are therefore better suited to storing the HDFS audit log for ease of management and collection. Therefore, when the storage path is set, the hdfs.audit.logger configuration in ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh is modified to export HADOOP_NAMENODE_OPTS="… -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,RFAAUDIT} $HADOOP_NAMENODE_OPTS", so that the HDFS audit log is saved on the NameNode host.
Correspondingly, the log is output to logs/hdfs-audit.log on the NameNode host, and the log format is shown in table five:
table five:
Figure GDA0003067077720000093
Step S220: parsing an original log collected from a big data platform component to obtain an effective log field in the original log and an attribute value corresponding to the effective log field.
When the logs of the big data platform components are collected, the components can be configured to actively send their logs to a log collection probe over the syslog protocol, or the operation log files of the big data components can be collected incrementally over FTP/SFTP.
Of the two collection modes, syslog is the more timely: a log is sent to the log collection server as soon as it is generated, which is essentially real time. However, syslog here is based on the UDP protocol and is affected by the network environment, so log packets may be lost. In contrast, FTP/SFTP runs over TCP and can transmit data reliably and efficiently; even if the network connection is interrupted, the connection can be re-established and collection can resume from the breakpoint. The other difference is that with syslog the collection component passively receives logs, whereas with FTP/SFTP the collection component actively initiates the transfer, which is easier to control and manage. This embodiment therefore preferably collects logs incrementally over FTP/SFTP.
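Incremental collection that resumes from a breakpoint can be sketched as reading a log file from a remembered byte offset. The sketch below works on any seekable binary file object, so an in-memory stream stands in for the remote log file; the function name `collect_increment` and the offset-based bookkeeping are assumptions for illustration, not the mechanism stated in the patent.

```python
import io


def collect_increment(stream, offset):
    """Read complete new lines from `stream` starting at byte `offset`.

    Returns (lines, new_offset). A trailing partial line is left for the
    next round so that a half-written entry is never parsed.
    """
    stream.seek(offset)
    data = stream.read()
    end = data.rfind(b"\n") + 1          # 0 if no complete line is available
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


# An in-memory stream standing in for a remote log file.
remote = io.BytesIO(b"line1\nline2\npart")
first_batch, breakpoint_offset = collect_increment(remote, 0)
```

Persisting the returned offset between runs is what allows collection to continue from the breakpoint after a reconnect. This simple scheme does not handle log rotation, where the file behind a given name is replaced.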
Step S230: according to a preset field mapping rule, classifying the attribute value corresponding to the effective log field in the original log into the standardized field to which it belongs, to obtain the initial log.
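A field mapping rule of the kind used in step S230 can be pictured as a lookup table from raw field names to standardized field names. The mapping below is purely hypothetical: both the raw keys (typical HDFS audit fields) and the standardized names are assumptions chosen for illustration.

```python
# Hypothetical mapping: raw field name -> standardized field name.
FIELD_MAPPING = {
    "time": "event_time",
    "ugi": "account",
    "ip": "source_ip",
    "cmd": "command",
    "src": "target_object",
    "allowed": "result",
}


def to_initial_log(raw_fields, mapping=FIELD_MAPPING):
    """Place each raw attribute value under its standardized field.

    Raw fields without a mapping entry are ignored.
    """
    return {
        standard: raw_fields[raw]
        for raw, standard in mapping.items()
        if raw in raw_fields
    }
```

Keeping one mapping table per component type would let logs from different components converge on the same standardized fields without changing the code.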
Step S240: dividing the operation type and the operation detail of the initial log according to the keywords in the effective log field of the initial log, to obtain a standardized log.
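Keyword-based division of operation type and operation detail, as in step S240, can be sketched as an ordered list of keyword rules applied to a standardized field. The keywords, type names and the `command` field below are assumptions for illustration and do not come from the patent.

```python
# Hypothetical keyword table: (keyword, operation type, operation detail).
KEYWORD_RULES = [
    ("delete", "file operation", "delete file"),
    ("rename", "file operation", "rename file"),
    ("open", "file access", "read file"),
    ("setPermission", "permission change", "modify permission"),
]


def classify(initial_log, rules=KEYWORD_RULES):
    """Return a copy of the log with operation_type and operation_detail added."""
    command = initial_log.get("command", "")
    for keyword, op_type, op_detail in rules:
        if keyword.lower() in command.lower():
            return {**initial_log, "operation_type": op_type,
                    "operation_detail": op_detail}
    # No keyword matched: keep the raw command as the detail.
    return {**initial_log, "operation_type": "other", "operation_detail": command}
```

The first matching rule wins, so more specific keywords should be listed before more general ones.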
Step S250: auditing the standardized logs of the big data platform component according to a preset auditing rule and an analysis strategy.
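As a rough illustration of step S250, the sketch below selects a rule set by operation type and checks two of the content audit points listed in the claims, access time and important directory files. The rule values, field names and violation messages are all hypothetical.

```python
from datetime import datetime

# Hypothetical rule set, keyed by operation type.
AUDIT_RULES = {
    "file operation": {
        "allowed_hours": range(8, 20),        # permitted access time, 08:00-19:59
        "protected_dirs": ("/data/secret/",),  # important directories
    },
}


def audit(std_log, rules=AUDIT_RULES):
    """Return the list of rule violations found in one standardized log."""
    rule = rules.get(std_log.get("operation_type"))
    if rule is None:
        return []  # no rule configured for this operation type
    violations = []
    hour = datetime.strptime(std_log["event_time"], "%Y-%m-%d %H:%M:%S").hour
    if hour not in rule["allowed_hours"]:
        violations.append("access time outside permitted hours")
    if std_log.get("target_object", "").startswith(rule["protected_dirs"]):
        violations.append("operation on important directory")
    return violations
```

Audit points such as operation frequency and operation total amount need state across many logs (for example a counter per account and time window) and are left out of this single-record sketch.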
Further, in order to raise alarms on audit results automatically, after the standardized log of the big data platform component is audited in step S250, this embodiment further includes the following step:
Step S260: when the log content in the standardized log does not conform to the preset auditing rule, generating corresponding safety early warning information.
Specifically, the corresponding safety early warning information can be generated according to the non-conforming item in the log and the attribute information of the big data platform component from which the log is sourced.
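One way to picture this is a small record that combines the non-conforming items with attributes of the source component. The field names below (`name`, `host`, `owner`, `account`, `event_time`) are hypothetical and only illustrate the idea.

```python
def build_warning(std_log, violations, component):
    """Combine non-conforming items with source-component attributes.

    Returns None when the log conforms to all rules.
    """
    if not violations:
        return None
    return {
        "component": component["name"],
        "host": component["host"],
        "owner": component["owner"],  # asset owner to be notified
        "account": std_log.get("account"),
        "event_time": std_log.get("event_time"),
        "violations": list(violations),
    }
```

A record shaped like this could then be grouped by component or owner for display, or handed to a notification channel.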
Furthermore, the generated safety early warning information can be sent to a corresponding safety early warning platform and displayed on the pages of that platform. Alternatively, the safety early warning information can be presented as charts, tables and the like along dimensions such as asset type, service system and asset owner, and can be sent to a preset terminal by short message, e-mail or similar means, so that the owner of the threatened equipment is notified in time and has the best opportunity to eliminate the threat accurately and quickly.
According to this scheme, the logs of the big data platform component are collected automatically and centrally, the operation logs are standardized in a flexible manner, and operations on the big data platform component are audited automatically according to the analysis rules and the alarm strategy, which comprehensively improves audit coverage, system performance, timeliness and the like.
Based on the log auditing method, the embodiment of the invention also provides a log auditing device. Fig. 5 is a schematic structural diagram of a log auditing apparatus according to an embodiment of the present invention. As shown in fig. 5, the apparatus has:
original log parsing module 510: used for parsing an original log collected from the big data platform component to obtain an effective log field in the original log and an attribute value corresponding to the effective log field;
standardized field classification module 520: used for classifying the attribute value corresponding to the effective log field in the original log into the standardized field to which it belongs according to a preset field mapping rule, to obtain an initial log;
operation type division module 530: used for dividing the operation type and operation detail of the initial log according to keywords in the effective log field of the initial log, to obtain a standardized log;
standardized log audit module 540: used for auditing the standardized logs of the big data platform component according to preset auditing rules and analysis strategies.
To audit the many operation logs of a big data platform component quickly, the standardized log audit module 540 may include:
an audit strategy selection sub-module 541: used for selecting a corresponding log content audit point and audit rule according to the operation type and the operation detail in the standardized log of the big data platform component;
a log content auditing sub-module 542: used for auditing the log content of the standardized log according to the log content audit point and the audit rule.
To collect the logs of the big data platform component comprehensively and improve audit coverage, the log auditing device provided by this embodiment further includes:
log option setting module 550: used for opening the saving option of the big data platform component log and setting the save operation options of the log, wherein the save operation options comprise the name of the log file, the storage path of the log file, the size of the log file and the number of log files.
To raise alarms on audit results automatically, the log auditing device provided by this embodiment further includes:
log early warning module 560: used for generating corresponding safety early warning information when the log content in the standardized log does not conform to the preset auditing rule.
The log auditing device provided by this embodiment collects the logs of the big data platform component automatically and centrally, standardizes the operation logs in a flexible manner, audits operations on the big data platform component automatically according to the analysis rules and the alarm strategy, and comprehensively improves audit coverage, system performance, timeliness and the like.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus or system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the description of the method embodiments. The above-described apparatus and system embodiments are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is merely a detailed description of the invention, and it should be noted that modifications and adaptations by those skilled in the art may be made without departing from the principles of the invention, and should be considered as within the scope of the invention.

Claims (8)

1. A log auditing method, the method comprising:
parsing an original log collected from a big data platform component by regular expression matching to obtain an effective log field in the original log and an attribute value corresponding to the effective log field;
according to a preset field mapping rule, classifying the attribute value corresponding to the effective log field in the original log to obtain an initial log;
dividing the operation type and operation detail of the initial log according to keywords in an effective log field in the initial log to obtain a standardized log;
auditing the standardized logs of the big data platform component according to a preset auditing rule and an analysis strategy;
wherein auditing the standardized logs of the big data platform component according to the preset auditing rule and analysis strategy comprises:
selecting corresponding log content audit points and audit rules according to the operation type and operation detail in the standardized log of the big data platform component, wherein the content audit points comprise access time, access place, important directory files, key operation commands, operation frequency and operation total amount;
and auditing the log content of the standardized log according to the log content audit point and the audit rule.
2. The method of claim 1, wherein prior to parsing the original log collected from the big data platform component, the method further comprises:
opening the saving option of the big data platform component log, and setting the save operation options of the big data platform component log, wherein the save operation options comprise the name of the log file, the storage path of the log file, the size of the log file and the number of log files.
3. The method of claim 2, wherein the selecting of the storage path of the log file comprises:
selecting, as the storage path of the log file, the type of node that has the fewest nodes in the big data platform component.
4. The method of claim 1, wherein the collection of raw logs of the big data platform component comprises:
acquiring an original log sent by the big data platform component through the syslog protocol;
or,
collecting, through FTP/SFTP, the original log saved by the big data platform component from the big data platform component.
5. The method of claim 1, wherein after auditing the standardized logs for the big data platform component, the method further comprises:
when the log content in the standardized log does not conform to the preset auditing rule, generating corresponding safety early warning information.
6. An apparatus for auditing a log, the apparatus comprising:
an original log parsing module: used for parsing an original log collected from a big data platform component by regular expression matching to obtain an effective log field in the original log and an attribute value corresponding to the effective log field;
a standardized field classification module: used for classifying the attribute value corresponding to the effective log field in the original log into the standardized field to which it belongs according to a preset field mapping rule, to obtain an initial log;
an operation type division module: used for dividing the operation type and operation detail of the initial log according to keywords in the effective log field of the initial log, to obtain a standardized log;
a standardized log audit module: used for auditing the standardized logs of the big data platform component according to a preset auditing rule and an analysis strategy;
wherein the standardized log audit module comprises:
an audit strategy selection submodule: used for selecting a corresponding log content audit point and audit rule according to the operation type and operation detail in the standardized log of the big data platform component, wherein the content audit point comprises access time, access place, important directory files, key operation commands, operation frequency and operation total amount;
a log content auditing submodule: used for auditing the log content of the standardized log according to the log content audit point and the audit rule.
7. The apparatus of claim 6, further comprising:
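a log option setting module: used for opening the saving option of the big data platform component log and setting the save operation options of the log, wherein the save operation options comprise the name of the log file, the storage path of the log file, the size of the log file and the number of log files.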
8. The apparatus of claim 6, further comprising:
a log early warning module: used for generating corresponding safety early warning information when the log content in the standardized log does not conform to the preset auditing rule.
CN201710994900.6A 2017-10-23 2017-10-23 Log auditing method and device Active CN107818150B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710994900.6A CN107818150B (en) 2017-10-23 2017-10-23 Log auditing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710994900.6A CN107818150B (en) 2017-10-23 2017-10-23 Log auditing method and device

Publications (2)

Publication Number Publication Date
CN107818150A CN107818150A (en) 2018-03-20
CN107818150B true CN107818150B (en) 2021-11-26

Family

ID=61607466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710994900.6A Active CN107818150B (en) 2017-10-23 2017-10-23 Log auditing method and device

Country Status (1)

Country Link
CN (1) CN107818150B (en)

Also Published As

Publication number Publication date
CN107818150A (en) 2018-03-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant