Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing an information security auditing method based on a data warehouse. The method addresses the shortcomings described in the background art and greatly improves the extensibility and openness of the auditing system and the efficiency of audit analysis.
The invention is achieved by the following technical solution. The method collects log information in real time using the standard Syslog protocol together with pattern matching based on regular expressions. A data warehouse separates the comprehensive analytical processing environment from the operational processing environment: the database concentrates on collecting the various audit logs, while the data warehouse integrates and extracts the logs from the various sources and organizes the data by audit analysis subject area. At the same time, an information security multidimensional modeling method is adopted in which the audit analysis subjects are associated through shared analysis dimensions, forming a multidimensional constellation that covers the whole information security domain. Using the multidimensional models in the data warehouse, online analytical processing (OLAP) is applied for multidimensional analysis. On top of the data warehouse, data mining and association analysis methods are applied to discover the many inherent relationships among audit logs from the various audit sources, and thereby to find potential security vulnerabilities and problems in the network. Finally, usable audit analysis reports are generated from the analysis results.
The invention adopts a "data-driven" approach. Rather than starting from application requirements, it audits the audit logs of existing security applications, operating system logs and the like. Starting from the existing security applications and their data, it re-examines the audit data and the relationships among the data from the perspective of the audit analysis domain, organizes the audit analysis subjects in the data warehouse, and creates the multidimensional models in the data warehouse according to the results of that analysis.
By adopting the standard Syslog protocol, regular-expression pattern matching for real-time log collection, and multidimensional models in the data warehouse, the invention greatly improves the extensibility and openness of the auditing system. Applying OLAP for multidimensional analysis greatly improves the efficiency of audit analysis. In addition, the data in the data warehouse are redundant and cannot be modified, so they provide effective, credible and traceable evidence for investigation and forensics.
The method of the invention is further described below. Its steps are as follows:
(1) Analyze the audit sources in the network and determine the data to be audited and the audit analysis subjects. Current information security audit analysis subjects mainly include: the firewall event analysis subject, the intrusion detection event analysis subject, the anti-virus event analysis subject, and the SSLServer event analysis subject.
(2) Create multidimensional models according to the audit analysis subjects and the angles of analysis, and organize the data multidimensionally. Because relational databases are mature and widely used, a star schema is used to express and store each multidimensional model. The star schema divides the multidimensional structure into two kinds of tables. One is the fact table, which stores the measures needed for audit analysis and the code values of each analysis dimension. The other is the dimension table; dimension tables are arranged around the fact table, and each represents one particular angle of audit analysis. The fact table is joined to the dimension tables through the value of each dimension. According to the information audit analysis subjects, the multidimensional models are as follows:
● Firewall multidimensional model: the measures in the fact table are the access duration, the traffic sent and the traffic received. The analysis dimensions are: the time dimension, the firewall action dimension (deny, allow, etc.), the access protocol dimension (http, ftp, telnet, smtp, pop3, etc.), the source address dimension and the destination address dimension.
● Intrusion detection multidimensional model: the measures in the fact table are the detailed data of the alert event and the recommended handling of the alert event. The audit analysis dimensions are: the time dimension, the alert event severity dimension (high, medium, low), the alert detector dimension, the source address dimension, the destination address dimension and the service port dimension.
● Anti-virus multidimensional model: the measure in the fact table is the location (path) where the virus was found. The audit analysis dimensions are: the infection time dimension, the infected machine dimension, the system user dimension, the virus name dimension, the virus type dimension (file virus, mail virus, macro virus, etc.), the scan type dimension (automatic scan, manual scan) and the virus handling result dimension (clean, quarantine, delete, etc.).
● SSLServer multidimensional model: the measure in the fact table is the number of resource accesses. The audit analysis dimensions are: the resource access time dimension, the source address dimension, the resource name dimension, the user dimension and the resource dimension.
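As a rough illustration of the star schema described in step (2), the firewall model could be laid out as below. This is a minimal sketch using Python's built-in sqlite3 module, not the invention's actual schema: every table and column name (dim_time, dim_action, fact_firewall and so on) and every sample row is hypothetical.

```python
import sqlite3

# Hypothetical star schema for the firewall model: one fact table
# surrounded by dimension tables. All names are illustrative only.
SCHEMA = """
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, day TEXT, hour INTEGER);
CREATE TABLE dim_action   (action_id INTEGER PRIMARY KEY, action TEXT);
CREATE TABLE dim_protocol (protocol_id INTEGER PRIMARY KEY, protocol TEXT);
CREATE TABLE dim_address  (address_id INTEGER PRIMARY KEY, ip TEXT);
CREATE TABLE fact_firewall (
    time_id     INTEGER REFERENCES dim_time(time_id),
    action_id   INTEGER REFERENCES dim_action(action_id),
    protocol_id INTEGER REFERENCES dim_protocol(protocol_id),
    src_id      INTEGER REFERENCES dim_address(address_id),
    dst_id      INTEGER REFERENCES dim_address(address_id),
    duration_s  INTEGER,   -- measure: access duration
    bytes_sent  INTEGER,   -- measure: traffic sent
    bytes_recv  INTEGER    -- measure: traffic received
);
"""

def build_demo():
    """Create the schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_action VALUES (?, ?)",
                     [(1, "allow"), (2, "deny")])
    conn.executemany("INSERT INTO dim_protocol VALUES (?, ?)",
                     [(1, "http"), (2, "ftp")])
    conn.executemany("INSERT INTO dim_address VALUES (?, ?)",
                     [(1, "10.0.0.5"), (2, "203.0.113.9")])
    conn.execute("INSERT INTO dim_time VALUES (1, '2024-01-01', 9)")
    conn.executemany(
        "INSERT INTO fact_firewall VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [(1, 1, 1, 1, 2, 30, 1200, 54000),
         (1, 2, 2, 1, 2, 0, 60, 0),
         (1, 1, 1, 1, 2, 12, 800, 9000)],
    )
    return conn

def bytes_received_by_action(conn):
    """Join the fact table to one dimension and aggregate a measure."""
    rows = conn.execute(
        "SELECT a.action, SUM(f.bytes_recv) FROM fact_firewall f "
        "JOIN dim_action a ON a.action_id = f.action_id "
        "GROUP BY a.action ORDER BY a.action"
    ).fetchall()
    return dict(rows)
```

The fact table holds only measures and dimension keys, so each analysis angle is a join to one dimension table followed by an aggregation.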
(3) Associate the multidimensional models. The multidimensional models above are still mutually independent audit entities, so the audit analysis subjects cannot yet be associated with one another. To perform associated audit analysis, the multidimensional models must share common audit dimensions. In the models above, every model has a time dimension, and every model has an address dimension or a dimension related to user information: the firewall, intrusion detection and SSLServer models all have address dimensions, and the anti-virus model has the infected machine dimension. These dimensions are abstracted, and a user dimension is created from the information of the actual users in the network; each user has basic information including IP address, machine name, etc. Through this user dimension and the time dimension, the multidimensional models are connected to form the information security constellation.
(4) After the data warehouse has been created, start the log server, which listens on UDP port 514 and receives the logs sent by the network security applications.
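The listening step can be sketched with Python's standard socket module. This is only an illustrative sketch under assumptions: the function names open_listener and receive_syslog are hypothetical, and the 8192-byte receive buffer is an arbitrary choice. On most Unix-like systems, binding to port 514 requires elevated privileges.

```python
import socket

def open_listener(host="0.0.0.0", port=514):
    """Bind a UDP socket; 514 is the conventional Syslog port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

def receive_syslog(sock, max_messages):
    """Read up to max_messages datagrams from an already-bound UDP socket.

    Returns a list of (sender_ip, message_text) tuples.
    """
    messages = []
    for _ in range(max_messages):
        data, (sender_ip, _sender_port) = sock.recvfrom(8192)
        messages.append((sender_ip, data.decode("utf-8", errors="replace")))
    return messages
```

A real log server would loop indefinitely and hand each message to the parsing stage; the bounded loop here just keeps the sketch easy to exercise.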
(5) Once the Syslog service has been configured in a network security application, the application sends each log message to the log server in real time over the standard Syslog protocol as the message is produced. Syslog is a built-in service on most Linux/Unix platforms, and similar products have recently become available on other platforms such as Windows; most security appliances also send their logs via Syslog. Using the Syslog protocol, all system logs and security device logs can therefore be sent to a protected, centrally controlled server, which gives a reasonably scalable solution. And because the logs are transmitted over the network in real time, the risk of locally stored logs being tampered with or deleted is avoided.
(6) After the log server receives a log message, it matches and parses the log format against the regular expressions pre-configured by the administrator, extracts the information needed for auditing, and performs unified integration and conversion to reconcile inconsistent data in the audit logs. This technique lets the system parse device logs of arbitrary format, which ensures the openness and extensibility of the system; it is currently a fairly practical way of collecting information uniformly from a wide variety of security devices.
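The parse-then-unify idea in step (6) might look roughly like the following. It is a sketch under assumptions, not the invention's implementation: the two patterns, the log layouts they match, and the ACTION_ALIASES normalization table are all hypothetical stand-ins for what an administrator would configure.

```python
import re

# Hypothetical administrator-configured patterns, one per log source.
PATTERNS = {
    "firewall": re.compile(
        r'src="(?P<src>[^"]*)"\s+dst="(?P<dst>[^"]*)"\s+action="(?P<action>[^"]*)"'
    ),
    "antivirus": re.compile(
        r'host=(?P<src>\S+)\s+virus=(?P<virus>\S+)\s+result=(?P<action>\S+)'
    ),
}

# Hypothetical conversion table that maps vendor-specific action words
# onto one vocabulary, so inconsistent values are unified before loading.
ACTION_ALIASES = {"accept": "allow", "pass": "allow",
                  "drop": "deny", "reject": "deny"}

def parse_log(line):
    """Return a unified record for the first pattern that matches, else None."""
    for source, pattern in PATTERNS.items():
        match = pattern.search(line)
        if match:
            record = {"source": source, **match.groupdict()}
            action = record["action"].lower()
            record["action"] = ACTION_ALIASES.get(action, action)
            return record
    return None
```

Supporting a new device then means adding one entry to PATTERNS rather than changing the server code, which is the extensibility argument made in the text.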
(7) Perform OLAP analysis and data mining analysis. After the data have been integrated into the data warehouse, OLAP analysis and data mining are carried out on top of it. OLAP analysis includes slicing, dicing, pivoting (rotation) and drilling. Data mining uses decision tree methods, association analysis and sequential pattern analysis to uncover the inherent relationships in the audit log data and to find potential security vulnerabilities and problems in the network.
(8) Generate usable audit analysis reports from the analysis results. The whole audit analysis process is a dynamic cycle with feedback. On the one hand, the model is continually refined and adjusted according to information returned by users so as to improve the efficiency and performance of audit analysis; on the other hand, the audit analysis requirements are continually better understood, so that more useful audit decision information can be provided to users.
The invention mainly uses a pattern matching method based on regular expressions, an information security multidimensional modeling method, online analytical processing (OLAP), and data mining and association analysis methods.
● Pattern matching method based on regular expressions
The variety of log types leads to a variety of log formats. Because most security audit logs are stored as text, such logs can be handled with text processing based on regular expressions (Regular Expression): text fields are extracted and then processed by means of patterns, which makes log parsing flexible and open.
● Modeling method for the information security multidimensional model
The information security multidimensional model is designed with a "data-driven" approach. First, "data-driven" means building the system from the existing log data, identifying clearly what data the network, the various operating systems and the security applications will produce, and how those data affect the current system design. Second, "data-driven" means no longer being application-oriented and starting from application requirements; instead, the data of the various security applications and the relationships among them are re-examined from the perspective of the analysis domain, and the subjects in the data are organized accordingly. Third, "data-driven" means using the data model to identify effectively what the logs and the analysis subject data have in common. In this method the data are the core of the whole architecture. It is therefore critical, on the basis of a thorough study of information security domain knowledge, to summarize and abstract the analysis subject areas of the information security domain, to determine the granularity levels and the data partitioning strategy, and to create a highly compatible and extensible multidimensional data model for security auditing.
● Online analytical processing (OLAP) method
The OLAP method comprises storage techniques for multidimensional data and the slicing, dicing, drilling and pivoting (rotation) of multidimensional data. With OLAP, logs from different sources can be analyzed in correlation, revealing inherent and valuable information about a group of devices. For example, looking up the records that one machine has left on the various security devices gives a clearer picture of its activity.
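The slicing, dicing and aggregation operations can be sketched in plain Python on a tiny in-memory cube. The fact rows, dimension names and helper functions below are hypothetical and serve only to show the operations; a real system would run them against the data warehouse.

```python
from collections import defaultdict

# Hypothetical fact rows: (day, source_ip, device, event_count).
FACTS = [
    ("2024-01-01", "10.0.0.5", "firewall", 12),
    ("2024-01-01", "10.0.0.5", "antivirus", 1),
    ("2024-01-01", "10.0.0.8", "firewall", 4),
    ("2024-01-02", "10.0.0.5", "ids", 2),
    ("2024-01-02", "10.0.0.8", "firewall", 7),
]
DIMS = ("day", "source_ip", "device")  # the last column is the measure

def slice_(facts, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]

def dice(facts, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [row for row in facts
            if all(row[DIMS.index(d)] in values for d, values in allowed.items())]

def roll_up(facts, *dims):
    """Roll up: sum the measure while keeping only the listed dimensions."""
    indexes = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in indexes)] += row[-1]
    return dict(totals)
```

For instance, `roll_up(slice_(FACTS, "source_ip", "10.0.0.5"), "device")` summarizes the records one machine has left on each security device, which corresponds to the lookup described in the paragraph above.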
● Data mining and association analysis methods
Data mining (Data Mining) extracts from the data warehouse implicit, previously unknown knowledge and rules that are potentially valuable for security decisions. Data mining mainly has a prediction/verification function and a description function. The prediction/verification function uses known information to predict or verify other, unknown information; prediction methods include statistical analysis, association rules, decision tree prediction, regression tree prediction and so on. The description function finds understandable patterns that describe the data; description methods include data classification, regression analysis, clustering, summarization, dependency pattern construction, change and deviation analysis, pattern discovery, path discovery and so on. With these data mining methods, traces of network crime can be found in large volumes of log information, and potential vulnerabilities in the network and problems in device management and configuration can be discovered. In addition, data mining can analyze data from different sources to derive the correlations among them. Comparing the results with the standard security policy gives a picture of the security risk of the whole network. These analyses are eventually fed back into the formulation of protection policies, ensuring the consistency and standardization of the security policy; the policies are ultimately applied in concrete protection measures and can enrich the network-wide security policy library.
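Association analysis is commonly expressed through the support and confidence of a rule. The sketch below computes both on a few made-up transactions; the event names and the choice of "one machine on one day" as the transaction unit are assumptions for illustration, not details from the invention.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0

# Hypothetical transactions: the set of event types observed for one
# machine on one day, gathered from different audit sources.
TRANSACTIONS = [
    {"fw_allow_http", "virus_found"},
    {"fw_allow_http", "virus_found", "ids_alert"},
    {"fw_allow_http"},
    {"fw_deny_ftp"},
]
```

A rule such as "virus_found implies fw_allow_http" with high confidence would suggest that infections in this sample coincide with permitted web access, the kind of cross-source relationship the text describes.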
The invention has substantive features and represents notable progress. It has the following notable effects:
(1) High extensibility and compatibility: with the standard Syslog protocol and the regular expression method, support for new log types can be added quickly. The system is compatible with current mainstream network security applications without requiring agents, which greatly reduces the complexity of network management and the cost of system maintenance.
(2) A multidimensional audit analysis view for the user. From the network administrator's point of view, the view of the whole network is multidimensional, so the conceptual model of audit analysis should also be multidimensional, and audit analysis should be carried out from different angles.
(3) High analysis efficiency and performance. The data in the data warehouse have been integrated, consolidated and preprocessed. The data warehouse isolates the operational processing environment from the audit analysis environment, which resolves the conflicts between the two environments and greatly improves the efficiency and performance of statistical analysis.
(4) By associating the information security multidimensional models around the user, association analysis across different types of audit logs becomes possible, and the inherent relationships among the audit logs in the network can be uncovered.
(5) Because the data warehouse is read-only and its data sources cannot be altered, the credibility of the audit logs is guaranteed, providing effective, credible and traceable evidence for investigation and forensics.
Embodiment
Using the method of the invention, an easily extensible security auditing system based on a data warehouse has been developed. Through a data warehouse server, this system separates the operational processing environment of information security auditing from the comprehensive analytical processing environment. In the operational processing environment, the system requires each network security application to send its logs in real time, via Syslog, to a remote log server. The log server listens on the Syslog UDP port 514 to receive the logs. After receiving a log message, it matches the message against the regular expressions pre-configured by the administrator; log information that matches is integrated, cleaned and imported into the data warehouse. In the data warehouse, faced with a large volume of complex, discrete, low-level raw information, the system creates the information security multidimensional models and performs higher-level in-depth analysis such as OLAP and data mining on them. It thereby discovers the inherent relationships hidden in the original logs, finds problems and potential security risks in the network, and assists the network administrator in making decisions and adjusting the security policy. At the monitoring end, the log server is on the one hand configured and managed remotely through the standard SOAP protocol, which greatly eases the administrator's work; on the other hand, the monitoring end reads audit information from the data warehouse for centralized management and audit analysis.
An application example of the easily extensible security auditing system based on a data warehouse is described below:
In a high-tech enterprise, employees often need to look up material on the Internet, which also increases the chance of virus infection. The enterprise removes viruses with anti-virus software (Norton anti-virus) on the one hand, and on the other hand manages employee access to the Internet through a firewall (the OLM firewall). With the easily extensible security auditing system based on a data warehouse, the administrator can perform associated audit analysis on the virus infection records and the records of employee Internet access, and find potential vulnerabilities in the network.
(1) Create the information security data warehouse, which contains the firewall multidimensional model and the anti-virus multidimensional model. The measures in the firewall model are the access duration, the traffic sent and the traffic received. Its analysis dimensions are: the time dimension, the firewall action dimension (deny, allow, etc.), the access protocol dimension (http, ftp, telnet, smtp, pop3, etc.), the source address dimension and the destination address dimension. The measure in the anti-virus model is the location (path) where the virus was found. Its audit analysis dimensions are: the infection time dimension, the infected machine dimension, the system user dimension, the virus name dimension, the virus type dimension (file virus, mail virus, macro virus, etc.), the scan type dimension (automatic scan, manual scan) and the virus handling result dimension (clean, quarantine, delete, etc.).
(2) Start the log server of the auditing system, which listens on UDP port 514.
(3) Configure the OLM (Orient LengendMaker) firewall so that the address to which it sends logs is the address of the log server.
(4) Configure the Norton anti-virus software to send its logs to the log server via Syslog.
(5) When a user accesses the Internet or a machine is infected by a virus, the corresponding application produces a log entry and sends it to the log server in real time.
(6) After the log server receives a log message on UDP port 514, it matches the message against the configured regular expressions. The log pattern for the OLM firewall is as follows:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*?kernel:id=\"firewall\"[\s]+time=\"(.*?)\"[\s]+.*?proto=\"(.*?)\"[\s]+src=\"(.*?)\"[\s]+srcport=\"(.*?)\"[\s]+dst=\"(.*?)\"[\s]+dstport=\"(.*?)\"[\s]+action=\"(.*?)\"
The parenthesized groups capture the information needed for auditing. After extracting this information according to the regular expression pattern, the log server integrates it and imports it into the data warehouse. In the same way, the log server extracts the information from the anti-virus logs, filters out the virus log entries produced by "manual scanning", and imports the entries produced by "real-time scanning" into the data warehouse.
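Applying the OLM firewall pattern could be sketched as follows with Python's re module. The sketch assumes the quote characters in the pattern are plain ASCII double quotes. The sample log line, the field names in FIELDS and the function parse_olm are hypothetical; the real OLM log layout may differ.

```python
import re

# The OLM firewall pattern from the text, split across lines for readability.
OLM_PATTERN = re.compile(
    r'([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*?kernel:id=\"firewall\"'
    r'[\s]+time=\"(.*?)\"[\s]+.*?proto=\"(.*?)\"[\s]+src=\"(.*?)\"'
    r'[\s]+srcport=\"(.*?)\"[\s]+dst=\"(.*?)\"[\s]+dstport=\"(.*?)\"'
    r'[\s]+action=\"(.*?)\"'
)

# Hypothetical names for the eight captured groups, in order.
FIELDS = ("device_ip", "time", "proto", "src", "srcport",
          "dst", "dstport", "action")

def parse_olm(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = OLM_PATTERN.search(line)
    return dict(zip(FIELDS, match.groups())) if match else None

# Made-up log line shaped to fit the pattern; not a real OLM log entry.
SAMPLE = ('192.168.1.1 Jan 1 09:00:00 kernel:id="firewall" '
          'time="2024-01-01 09:00:00" fw="olm" proto="tcp" '
          'src="10.0.0.5" srcport="51234" dst="203.0.113.9" '
          'dstport="80" action="accept"')
```

Calling `parse_olm(SAMPLE)` yields one record per log line whose keys can be mapped onto the dimensions of the firewall model before loading.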
(7) The anti-virus log contains the machine name and user name of the infected machine, and the firewall log contains the source address used when an employee accesses the Internet, that is, the address of the accessing machine. Using the one-to-one correspondence between machine and IP address in the user dimension, OLAP can be used to query which websites the infected machine visited during the same time period; the destination addresses in the firewall log then show which website the employee was visiting when the virus was contracted.
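The association in step (7) amounts to joining the two fact tables through the shared user and time dimensions. The following sketch shows one way to express that with sqlite3; the table layout, column names and sample rows are hypothetical, and the time dimension is simplified to a day string.

```python
import sqlite3

def build_demo():
    """Create a tiny constellation: two fact tables sharing a user dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY,
                           user_name TEXT, machine TEXT, ip TEXT);
    CREATE TABLE fact_antivirus (user_id INTEGER, day TEXT, virus TEXT);
    CREATE TABLE fact_firewall  (user_id INTEGER, day TEXT, dst TEXT, action TEXT);
    """)
    conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?, ?)",
                     [(1, "alice", "pc-17", "10.0.0.5"),
                      (2, "bob", "pc-22", "10.0.0.8")])
    conn.execute("INSERT INTO fact_antivirus VALUES (1, '2024-01-01', 'Example.Virus')")
    conn.executemany("INSERT INTO fact_firewall VALUES (?, ?, ?, ?)", [
        (1, "2024-01-01", "203.0.113.9", "allow"),
        (1, "2024-01-01", "198.51.100.7", "allow"),
        (1, "2024-01-02", "192.0.2.44", "allow"),
        (2, "2024-01-01", "203.0.113.9", "allow"),
    ])
    return conn

def destinations_of_infected(conn):
    """List destinations visited by infected machines on the day of infection."""
    return conn.execute("""
        SELECT u.machine, av.virus, fw.dst
        FROM fact_antivirus av
        JOIN dim_user u       ON u.user_id = av.user_id
        JOIN fact_firewall fw ON fw.user_id = av.user_id AND fw.day = av.day
        ORDER BY fw.dst
    """).fetchall()
```

The query returns only the destinations reached from the infected machine on the infection day, which gives the administrator a short list of candidate websites to examine.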
(8) Based on this audit information, the administrator can add a rule to the firewall that blocks packets coming from the destination addresses obtained by the query above, so that other machines are not infected by the virus from that website.