CN102164050B - Log parsing method and log parsing node device - Google Patents


Info

Publication number
CN102164050B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110125560.6A
Other languages
Chinese (zh)
Other versions
CN102164050A (en)
Inventor
丁兆杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ruishan Network Co., Ltd
Original Assignee
Beijing Star Net Ruijie Networks Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Star Net Ruijie Networks Co Ltd
Priority to CN201110125560.6A
Publication of CN102164050A
Application granted
Publication of CN102164050B

Abstract

The invention discloses a log parsing method and a log parsing node device that provide an efficient and highly adaptable log parsing scheme. In the method, a parsing node obtains a log and an offset to be matched that indicates the unparsed content in the log; the parsing node parses the unparsed content indicated by the offset to be matched using a stored first regular expression, obtaining the field information matched by the first regular expression; and the parsing node judges whether a subordinate parsing node exists. If the judgment result shows that no subordinate parsing node exists and event type information is pre-stored in the parsing node, the parsing node takes that event type information as the event type information of the event recorded in the log. The event type information is determined according to the field information that can be parsed from the log by the regular expression stored in at least one parsing node on the path along which the log is passed to the parsing node.

Description

Log parsing method and log parsing node device
Technical field
The present invention relates to the fields of network information security and network management, and in particular to a log parsing method and a log parsing node device.
Background
System log (Syslog) is a log format widely used in network environments and is officially supported by various operating systems, network devices, and security devices. In practice, logs of other types are often converted to Syslog format by a log converter so that they can be collected, managed, and analyzed uniformly.
Syslog is a very loose, free-form log format. In fact, apart from the requirement that a Syslog message must not exceed 1024 bytes, there are almost no other mandatory requirements. Because the format is so loose, Syslog messages generated by different vendors, or even by different products of the same vendor, differ greatly in format. Tables 1 and 2 below show the Syslog messages that two different devices produce for the same type of event:
Table 1:
Figure BDA0000061416600000011
Table 2:
As Tables 1 and 2 show, Syslog messages for the same type of event can differ enormously in format. Faced with such chaotic formats, the messages can be analyzed and processed uniformly and effectively only after the useful information in them has been parsed and extracted automatically. This process is called log parsing, and it has two main purposes:
The first is to determine the event type of the log, that is, what the log is actually reporting. For example, the two Syslog messages above mean essentially the same thing: a TCP connection was allowed through. Both can therefore be assigned the event type "TCPAccept".
The second is to obtain the field information related to the event type of the log. For the "TCPAccept" event above, one usually also needs to know which two endpoints the TCP connection was between. Parsing must therefore at least extract the source IP, source port, destination IP, and destination port from the log. For example, after parsing, both logs above should yield the field extraction result set shown in Table 3a:
Table 3a:
Field name          Field extraction result
Source IP           192.168.65.65
Source port         1355
Destination IP      10.10.10.10
Destination port    80
In summary, the purpose of parsing a Syslog message is to obtain two pieces of information from it: the "event type" and the "field extraction result set".
In a normally operating network, hosts, network devices, security devices, and application software produce a very large volume of Syslog messages. In the network of a medium or large organization, tens of thousands of logs are typically generated per second. These logs need to be collected, parsed, and stored in real time, which places very demanding performance requirements on log parsing.
At present, almost all Syslog parsing schemes are implemented with regular expressions. A regular expression is a single string that describes or matches a set of strings conforming to a given syntactic rule. Put simply, parsing Syslog with regular expressions is a means of pattern matching and content extraction on text. A regular expression defines a pattern, and only text that conforms to the pattern is matched by it. Extracting the content matched by the regular expression yields the specific content wanted from the log.
Ideally, applying the appropriate regular expression to a log yields the parsing result in one pass. The problem is how to determine, on receiving a log, which regular expression is the appropriate one. Quickly determining the format of a log and selecting the corresponding regular expression is therefore a key difficulty in log parsing. The main existing approaches to this problem are as follows:
One, the method for exhaustion
In simple terms, the method for exhaustion is exactly that all known journal formats (regular expression) are all followed to daily record coupling to be resolved one time, and what wherein can match is exactly suitable naturally.
Obviously this is a kind of mode of very poor efficiency: because much equipment all may generate daily record of a great variety, form differs.A kind of typical network equipment just may generate the diverse daily record of hundreds of forms.If therefore all must attempt resolving with hundreds of regular expressions to every daily record, its performance waste is just very surprising: even if daily record parse node equipment can carry out canonical coupling 100,000 times in 1 second, if yet daily record of its every parsing on average to attempt 100 times, the daily record analytic ability finally obtaining also not enough 1000 times per second.
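The exhaustive method and its cost can be illustrated with a minimal Python sketch. The three patterns and the sample log line are hypothetical and are not taken from the patent; only the figures 100,000 and 100 come from the paragraph above.

```python
import re

# Hypothetical known formats; a real device may define hundreds of these.
KNOWN_FORMATS = [
    re.compile(r"user=(?P<user>\w+) action=login"),
    re.compile(r"disk (?P<disk>\w+) usage (?P<pct>\d+)%"),
    re.compile(r"accept tcp (?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+)"),
]

def parse_exhaustively(log):
    """Try every known format in order; return (fields, attempts made)."""
    for attempts, pattern in enumerate(KNOWN_FORMATS, start=1):
        m = pattern.search(log)
        if m:
            return m.groupdict(), attempts
    return None, len(KNOWN_FORMATS)

fields, attempts = parse_exhaustively("accept tcp 192.168.65.65:1355 -> 10.10.10.10:80")

# Throughput estimate using the figures from the text:
# 100,000 regex matches per second, 100 attempts per log on average.
logs_per_second = 100_000 / 100
```

In this sketch the matching format happens to be the last one tried, so every pattern is evaluated before a result is obtained, which is the cost the text describes.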
2. Keyword method
The keyword method is a log parsing method proposed to make up for the shortcomings of the exhaustive method, and it is the log parsing approach most widely used at present. It places requirements on both the log source (the device that produces the logs) and the log parsing node device.
The prerequisite for applying the keyword method is that all logs sent by the same device share a common header with a uniform format, which contains at least the following information: a log type identifier (which uniquely identifies one specific log format) and the log priority.
For example, consider the two logs shown in Tables 3b and 4:
Table 3b:
Figure BDA0000061416600000041
Table 4:
The formats of these two logs are completely different, but their headers are the same, and the header takes the form "log type:". For two logs that share such a common header, the log parsing node device can extract the common header information of each log and parse the log type (the keyword) from it, and then select the corresponding regular expression according to this keyword to complete the final parsing. In the example above, the corresponding regular expressions can be determined from the two keywords "SESLOG" and "FWLOG".
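The keyword dispatch described above might look like the following Python sketch. The keywords "SESLOG" and "FWLOG" come from the text; the header syntax, the body patterns, and the field names are assumptions made for illustration, since the content of Tables 3b and 4 is not available here.

```python
import re

# Assumed header shape: "<LOGTYPE>: <body>". This is a guess at the format.
HEADER = re.compile(r"(?P<log_type>[A-Z]+): ")

# Hypothetical body formats, one per keyword.
BODY_BY_KEYWORD = {
    "SESLOG": re.compile(r"session (?P<session_id>\d+) opened"),
    "FWLOG": re.compile(r"(?P<action>permit|deny) (?P<proto>\w+)"),
}

def parse_by_keyword(log):
    """Extract the keyword from the common header, then apply its regex."""
    header = HEADER.match(log)
    if header is None:
        return None  # no common header: the keyword method cannot be applied
    body_pattern = BODY_BY_KEYWORD.get(header.group("log_type"))
    if body_pattern is None:
        return None  # unknown keyword
    body = body_pattern.match(log, header.end())
    return body.groupdict() if body else None
```

Only one body expression is tried per log, but the approach depends entirely on the header being present and uniform.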
The limitation of the keyword method is obvious: if different logs do not share a common header, the keyword method cannot be applied. Moreover, in practice, different products from different vendors also differ greatly in how they implement keywords. Some vendors can uniquely determine a log format from a single keyword, while others, because of the complexity of their devices' functions, often need a multi-level classification mechanism such as "log type + subtype" to determine the log format. Devices that parse logs with the keyword method therefore often have to implement keyword extraction logic tailored to each mainstream vendor's classification scheme, which causes problems in both extensibility and adaptability.
Summary of the invention
The embodiments of the present invention provide a log parsing method and device, in order to provide an efficient and highly adaptable log parsing scheme.
The embodiments of the present invention adopt the following technical solutions:
A log parsing method, comprising:
A parsing node obtains a log and an offset to be matched that indicates the unparsed content in the log; parses the unparsed content indicated by the offset to be matched using a stored first regular expression, obtaining the field information matched by the first regular expression; and judges whether a subordinate parsing node exists. When the judgment result is no and event type information is pre-stored in the parsing node, the parsing node determines that event type information to be the event type information of the event recorded in the log, where the event type information is determined according to the field information that can be parsed from the log by the regular expression stored in at least one parsing node on the path along which the log is passed to the parsing node.
Optionally, the method further comprises:
When the judgment result is yes, the parsing node determines the offset to be matched according to an offset determination mode, and provides the log and the determined offset to be matched to a subordinate parsing node that can parse the unparsed content in the log according to a specified parsing mode, where the offset determination mode is set according to the format of the log and the field information that is predetermined to be extracted from the log.
Optionally, the parsing node providing the log and the determined offset to be matched to a subordinate parsing node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises:
Providing the log and the determined offset to be matched to the subordinate parsing nodes in turn, one subordinate parsing node at a time, until an acknowledgment message sent by a subordinate parsing node is received, and then ceasing to provide the log and the determined offset to be matched; where the acknowledgment message is sent by that subordinate parsing node after it determines, according to the second regular expression it stores, that it can parse the unparsed content in the log.
Optionally, the parsing node providing the log and the determined offset to be matched to a subordinate parsing node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises: the parsing node obtains the key field in the first regular expression and parses key field information from the log according to the obtained key field; the parsing node determines, from the pre-stored second regular expressions used by the respective subordinate parsing nodes, a second regular expression that matches the key field information; and the parsing node selects one subordinate parsing node from the subordinate parsing nodes that use a second regular expression matching the key field information, and provides the log and the determined offset to be matched to the selected subordinate parsing node.
Optionally, the parsing node providing the log and the determined offset to be matched to a subordinate parsing node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises:
The parsing node obtains the key field in the first regular expression and parses key field information from the log according to the obtained key field; the parsing node determines, from the field information allocated in advance to each subordinate parsing node, the field information that matches the key field information; and the parsing node selects one subordinate parsing node from the subordinate parsing nodes corresponding to the determined field information, and provides the log and the determined offset to be matched to the selected subordinate parsing node.
A log parsing node device, comprising:
A first obtaining unit, configured to obtain a log and an offset to be matched that indicates the unparsed content in the log; a second obtaining unit, configured to use a stored first regular expression to parse the unparsed content, indicated by the offset to be matched obtained by the first obtaining unit, in the log obtained by the first obtaining unit, and to obtain the field information matched by the first regular expression; a judging unit, configured to judge whether a subordinate log parsing node device exists; and an event type information determining unit, configured to, when the judgment result of the judging unit is no and event type information is pre-stored in the log parsing node device, determine that event type information to be the event type information of the event recorded in the log, where the event type information is determined according to the field information that can be parsed from the log by the regular expression stored in at least one log parsing node device on the path along which the log is passed to the log parsing node device.
The beneficial effects of the embodiments of the present invention are as follows:
The scheme provided by the embodiments of the present invention sets an offset to be matched that indicates the unparsed content in the log, so that a parsing node parses only the unparsed content and does not parse the already-parsed part again. This reduces problems such as backtracking and repeated matching during log parsing and thereby achieves higher parsing efficiency. In addition, the method does not require the different logs being parsed to share a common header, and is therefore more adaptable.
Brief description of the drawings
Fig. 1 is a flow diagram of a log parsing method provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of a parse tree;
Fig. 3 is a diagram of the internal structure of any parsing node;
Fig. 4 is a diagram of a subordinate parsing node selector that implements the hybrid type;
Fig. 5 is a diagram of the process of parsing a Syslog message;
Fig. 6 is a structural diagram of a log parsing node device provided by an embodiment of the present invention.
Detailed description
To address the problems of the prior art, an efficient and highly adaptable log parsing scheme is proposed here. The scheme reduces problems such as backtracking and repeated matching during log parsing and thereby achieves higher parsing efficiency; it also does not require the different logs being parsed to share a common header, and is therefore more adaptable.
The scheme provided by the embodiments of the present invention is described in detail below with reference to the accompanying drawings.
First, an embodiment of the present invention provides a log parsing method. The flow of the method is shown in Fig. 1 and comprises the following steps:
Step 11: a parsing node obtains a log and an offset to be matched that indicates the unparsed content in the log.
The parsing node may be a virtual device whose functions are implemented in software, or a physical device whose functions are implemented by software and hardware together; the embodiments of the present invention place no limitation on this. An example of implementing the parsing node in software is given in a specific embodiment below and is not repeated here.
The log obtained by the parsing node may be produced by a device and input to the parsing node, or may be sent to this parsing node by its superior parsing node. The offset to be matched may be an agreed-upon marker that acts like a pointer, indicating the unparsed content in the log.
Step 12: the parsing node uses the stored first regular expression to parse the unparsed content indicated by the offset to be matched, obtaining the field information matched by the first regular expression.
This step is implemented in a way similar to the regular-expression log parsing of the prior art. After the field information matched by the first regular expression is obtained, it can be stored in a pre-designated storage space for subsequent reading, or it can be output to the terminal of the party that requires the field information.
Step 13: the parsing node judges whether a subordinate parsing node exists. If the judgment result is yes, step 14 is performed; otherwise, step 15 is performed.
A subordinate parsing node of this parsing node is, on the tree-structured parse tree made up of multiple parsing nodes, a parsing node that is closer to the top layer of the parse tree than this parsing node. This parsing node and its subordinate parsing node must also lie on the same log-passing path in the parse tree.
When the parsing node is configured in advance, a subordinate parsing node mapping table (see Table 11 below) that stores information about its subordinate parsing nodes can be set up in the parsing node. Whether the parsing node has a subordinate parsing node can then be judged by checking whether the table stores a subordinate parsing node identifier (such as a subordinate parsing node name).
Step 14: the parsing node determines the offset to be matched according to an offset determination mode, and provides the log and the determined offset to be matched to a subordinate parsing node that can parse the unparsed content in the log according to a specified parsing mode, where the offset determination mode is set according to the format of the log and the field information that is predetermined to be extracted from the log.
Depending on the field information to be extracted from the log, the preset offset determination mode may require the parsing node, after parsing the log, to change the offset to be matched so that it points to the end of the part of the log parsed by this node. The subordinate parsing node can then use the offset to be matched provided by this node to parse only the part after that end. Alternatively, the mode may require the parsing node to leave the offset to be matched unmodified after parsing the log, in which case the subordinate parsing node parses the unparsed content indicated by the unmodified offset.
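The two offset determination modes can be sketched in Python as follows. The function name `parse_from_offset` and the `advance` flag are illustrative assumptions; the sketch relies on the standard `re.Pattern.match(string, pos)` call, which starts matching at `pos` without copying or re-scanning the earlier part of the string.

```python
import re

def parse_from_offset(pattern, log, offset, advance=True):
    """Match `pattern` against the unparsed part of `log` starting at `offset`.

    Returns (fields, next_offset). With advance=True the next offset points
    just past the text this node matched; with advance=False the offset is
    handed on unchanged, so the next node sees the same unparsed content.
    """
    m = pattern.match(log, offset)
    if m is None:
        return None, offset
    return m.groupdict(), (m.end() if advance else offset)

log = "Tom sold a car to Bill"
subject = re.compile(r"(?P<subject>\w+) ")

fields, next_offset = parse_from_offset(subject, log, 0)
same_fields, same_offset = parse_from_offset(subject, log, 0, advance=False)
```

With the offset advanced, a subordinate node receiving `next_offset` starts at "sold a car to Bill" and never re-matches "Tom".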
A specific implementation of the above process is described in detail later and is not repeated here.
Step 15: when the parsing node further judges that event type information is pre-stored in itself, it determines this pre-stored event type information to be the event type information of the event recorded in the log, where the event type information is determined according to the field information that can be parsed from the log by the regular expression stored in at least one parsing node on the path along which the log is passed to this parsing node.
In step 15, the event type information may be set in advance when the parse tree containing multiple parsing nodes is constructed. The event type information generally depends on the requirements of log parsing. For example, for the two logs "Tom sold a car to Bill" and "Tom bought a house from Kate", the regular expression used by the first parsing node that parses them can be configured to extract the field information sold/bought (that is, the field information corresponding to the verb). It can further be arranged that, when "sold" is parsed, the log "Tom sold a car to Bill" is passed along path I, which comprises the first parsing node and some other parsing nodes, and finally reaches parsing node a, the last parsing node of path I (for example, the path can be determined by selecting subordinate parsing nodes with a string selector, as described below). Similarly, it can be arranged that, when "bought" is parsed, the log "Tom bought a house from Kate" is passed along path II, which comprises the first parsing node and some other parsing nodes, and finally reaches parsing node b, the last parsing node of path II.
In this case, corresponding event type information can be pre-stored in parsing node a according to the "sold" that the first parsing node can parse, for example "sale transaction log". Similarly, corresponding event type information can be pre-stored in parsing node b according to the "bought" that the first parsing node can parse, for example "purchase transaction log". Optionally, if the second parsing node on path I and path II can parse a name such as "Tom" from the log using its regular expression, the event type information pre-stored in parsing node a can be set to "sale transaction log related to Tom" according to this name, and the event type information pre-stored in parsing node b can be set to "purchase transaction log related to Tom".
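Steps 11 to 15 can be tied together with the sold/bought example in a minimal Python sketch. The class `ParseNode`, its attributes, the regular expressions, and the two-level tree are assumptions made for illustration; the patent's paths I and II may contain more intermediate nodes.

```python
import re

class ParseNode:
    def __init__(self, pattern, event_type=None, children=None, key_field=None):
        self.pattern = re.compile(pattern)  # the node's stored regular expression
        self.event_type = event_type        # pre-stored only where a path ends
        self.children = children or {}      # key field value -> subordinate node
        self.key_field = key_field

    def parse(self, log, offset=0, fields=None):
        fields = dict(fields or {})
        m = self.pattern.match(log, offset)            # step 12: parse from offset
        if m is None:
            return None
        fields.update(m.groupdict())
        if not self.children:                          # step 13: no subordinate
            return self.event_type, fields             # step 15: stored event type
        child = self.children.get(m.group(self.key_field))
        if child is None:
            return None
        return child.parse(log, m.end(), fields)       # step 14: pass log + offset

node_a = ParseNode(r" ?(?P<object>a \w+) to (?P<buyer>\w+)",
                   event_type="sale transaction log")
node_b = ParseNode(r" ?(?P<object>a \w+) from (?P<seller>\w+)",
                   event_type="purchase transaction log")
first = ParseNode(r"(?P<subject>\w+) (?P<verb>sold|bought)",
                  children={"sold": node_a, "bought": node_b}, key_field="verb")

sale = first.parse("Tom sold a car to Bill")
purchase = first.parse("Tom bought a house from Kate")
```

The event type is never computed at parse time in this sketch: it is fixed when the tree is built, and reaching node a or node b is what identifies the event.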
Note that after the parsing node provides the log and the determined offset to be matched to a subordinate parsing node in step 14, steps 14 to 15 above can be repeated for that subordinate parsing node until the event type information of the event recorded in the log is determined. In practice, because the event type information of the event recorded in the log can already be determined when the judgment result is no and the parsing node further judges that event type information is pre-stored in itself, step 14 is in fact an optional step of the log parsing method provided by the embodiments of the present invention.
It should also be emphasized that step 14 can be implemented in, but is not limited to, the following three modes: the exhaustive mode, the regular expression mode, and the string mode. Each is introduced below:
1. Exhaustive mode.
In this mode, to provide the log and the determined offset to be matched to a subordinate parsing node, the parsing node provides them to the subordinate parsing nodes in turn, one subordinate parsing node at a time, until it receives an acknowledgment message sent by a subordinate parsing node, and then stops providing the log and the determined offset to be matched.
The acknowledgment message is sent by a subordinate parsing node after it determines, according to the second regular expression it stores, that it can parse the unparsed content in the log according to the specified parsing mode.
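A minimal Python sketch of the exhaustive mode follows. The acknowledgment message is modeled as a boolean return value, which is a simplification; the class and function names are hypothetical.

```python
import re

class ChildNode:
    def __init__(self, name, pattern):
        self.name = name
        self.pattern = re.compile(pattern)  # the node's "second regular expression"

    def offer(self, log, offset):
        """Return True (the acknowledgment) if this node can parse log[offset:]."""
        return self.pattern.match(log, offset) is not None

def select_exhaustively(children, log, offset):
    """Offer the log to one subordinate node at a time; stop at the first ack."""
    for child in children:
        if child.offer(log, offset):
            return child
    return None

children = [ChildNode("a", r"sold"), ChildNode("b", r"bought")]
chosen = select_exhaustively(children, "Tom bought a house from Kate", 4)
```

Unlike the prior-art exhaustive method, the candidates here are only the subordinate nodes of one parsing node, and each attempt starts at the offset rather than at the beginning of the log.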
2. Regular expression mode.
In this mode, to provide the log and the determined offset to be matched to a subordinate parsing node, the parsing node can perform the following steps:
First, the parsing node obtains the key field information parsed from the log according to the key field in the first regular expression. The key field may be predetermined according to the log format, or determined according to the parts of speech of the words in the log, and so on; the embodiments of the present invention place no limitation on this.
For example, from logs such as "Tom sold a car to Bill" and "Tom bought a house from Kate", the format can be summarized as subject (Tom) + verb (sold/bought) + object (car, house). A key field that extracts the verb in this format can be defined in the first regular expression, so that whenever the first regular expression is used to parse a log of this format, the key field information matched by this key field (the verb sold/bought above) is obtained.
Then, the parsing node determines, from the pre-stored second regular expressions used by the subordinate parsing nodes, a second regular expression that matches the key field information.
In general, the regular expression selector is mainly suited to cases where the extraction result obtained according to the "keyword" contains variable content. Suppose the extraction result obtained from a log according to the "keyword" is: logtype=16 timestamp=123321323 subtype="nat", where the value of timestamp differs in every log. If a subordinate parsing node uses a regular expression such as logtype=16.*?subtype="nat" (a regular expression that ignores the variable content in the extraction result), the regular expression used by that subordinate parsing node can be regarded as a second regular expression that matches the key field information.
Finally, the parsing node selects one subordinate parsing node from the subordinate parsing nodes that use a second regular expression matching the key field information, and provides the log and the determined offset to be matched to the selected subordinate parsing node.
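The regular expression mode can be sketched in Python as follows. The first selector pattern is the one quoted in the text; the "acl" selector, the node names, and the spacing inside the key field string are assumptions added for illustration.

```python
import re

# (selector pattern, subordinate node name) pairs stored in the parsing node.
CHILD_SELECTORS = [
    (re.compile(r'logtype=16.*?subtype="nat"'), "nat_node"),
    (re.compile(r'logtype=16.*?subtype="acl"'), "acl_node"),  # hypothetical
]

def select_by_regex(key_field_info):
    """Return the first subordinate node whose selector matches the key field info."""
    for pattern, child_name in CHILD_SELECTORS:
        if pattern.fullmatch(key_field_info):
            return child_name
    return None

first_log = select_by_regex('logtype=16 timestamp=123321323 subtype="nat"')
second_log = select_by_regex('logtype=16 timestamp=999000111 subtype="nat"')
```

Because `.*?` absorbs the timestamp, logs that differ only in that variable part are routed to the same subordinate node.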
3. String mode.
In this mode, to provide the log and the determined offset to be matched to a subordinate parsing node, the parsing node can perform the following steps:
First, the parsing node obtains the key field information parsed from the log according to the key field in the first regular expression.
Then, the parsing node determines, from the pre-stored field information allocated in advance to each subordinate parsing node, the field information that matches the key field information.
Finally, the parsing node selects one subordinate parsing node from the subordinate parsing nodes corresponding to the determined field information, and provides the log and the determined offset to be matched to the selected subordinate parsing node.
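In the string mode the selection reduces to an exact lookup, as in this minimal Python sketch. The first regular expression, the mapping, and the node names are hypothetical.

```python
import re

# Hypothetical first regular expression with "verb" as its key field.
FIRST_REGEX = re.compile(r"(?P<subject>\w+) (?P<verb>\w+) ")

# Field information allocated in advance to each subordinate node.
CHILD_BY_FIELD_VALUE = {"sold": "node_a", "bought": "node_b"}

def select_by_string(log, offset=0):
    """Extract the key field, then look up the subordinate node by exact match."""
    m = FIRST_REGEX.match(log, offset)
    if m is None:
        return None
    return CHILD_BY_FIELD_VALUE.get(m.group("verb"))
```

A dictionary lookup costs the same regardless of how many subordinate nodes exist, which makes this the cheapest of the three modes whenever the key field information contains no variable content.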
When the scheme provided by the embodiments of the present invention is applied in practice, it may happen that no subordinate parsing node can parse the unparsed content in the log according to the specified parsing mode. For this case, a default parsing node can be set in advance, and the log and the determined offset to be matched are provided to this default parsing node for further processing. Optionally, the default parsing node does not parse the log itself, but selects one of its own subordinate parsing nodes to parse the log.
The default parsing node can also use any one of the three modes above to select a subordinate parsing node. The specific mechanisms are as follows:
1. Mechanism by which the default parsing node selects a subordinate parsing node in the exhaustive mode:
When no subordinate parsing node can parse the unparsed content in the log according to the specified parsing mode, the parsing node provides the log and the determined offset to be matched to the pre-designated default parsing node;
and instructs the default parsing node to use its stored fourth regular expression to parse the unparsed content indicated by the received offset to be matched, obtaining the field information matched by the fourth regular expression. When the default parsing node has subordinate parsing nodes, it determines the offset to be matched according to the offset determination mode, and provides the log and the offset to be matched that it has determined to each of its subordinate parsing nodes in turn, one subordinate parsing node at a time, until it receives an acknowledgment message sent by a subordinate parsing node, and then stops providing the log and the offset to be matched determined by the default parsing node.
The acknowledgment message is sent by a subordinate parsing node after it determines, according to the third regular expression it stores, that it can parse the unparsed content in the log.
2. Mechanism by which the default parse node chooses a subordinate parse node in regular-expression mode:
Similarly, when no subordinate parse node exists that can parse the unparsed content of the log in the specified parsing mode, the parse node provides the log and the determined offset to be matched to the pre-designated default parse node; and
instructs the default parse node to parse key field information out of the unparsed content indicated by the received offset to be matched, according to the key field in the stored fourth regular expression. When the default parse node has subordinate parse nodes, it determines the offset to be matched according to the offset determination mode, finds, among the pre-stored third regular expressions used by its subordinate parse nodes, the third regular expression that matches the key field information, chooses one subordinate parse node from those that use this third regular expression, and provides the log and the offset determined by the default node to the chosen subordinate parse node.
The key field can be, but is not limited to being, determined according to the log format, the part of speech of the words contained in the log, and so on.
3. Mechanism by which the default parse node chooses a subordinate parse node in string mode:
When no subordinate parse node exists that can parse the unparsed content of the log in the specified parsing mode, the parse node provides the log and the determined offset to be matched to the pre-designated default parse node; and
instructs the default parse node to parse key field information out of the unparsed content indicated by the received offset to be matched, according to the key field in the stored fourth regular expression. When the default parse node has subordinate parse nodes, it determines the offset to be matched according to the offset determination mode, finds, among the field information assigned in advance to each of its subordinate parse nodes, the field information that matches the key field information, chooses one subordinate parse node from those corresponding to the found field information, and provides the log and the offset determined by the default node to the chosen subordinate parse node.
The default parse node greatly increases the flexibility of log parsing. For example, take a purchase-transaction log of the form: <buyer> bought <sth> from <seller>. After the root parse node parses it, if sth contains rifle, pistol, shotgun or the like, the log is passed to the weapon transaction parse node for further processing; if sth contains heroin, cocaine, marijuana or the like, the log is passed to the drug transaction parse node; and if sth is any other item, the log is passed to the default parse node, the normal transaction node. The root parse node can then be constructed as follows:
[Configuration listing of the root parse node; the original images are not reproduced in this text.]
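Because the original configuration listing is not available in this text, the routing it describes can only be illustrated with a minimal Python sketch. The pattern `ROOT`, the table `NODE_MAP` and the function `route` are hypothetical names introduced for illustration and are not taken from the patent.

```python
import re

# Hypothetical root pattern for "<buyer> bought <sth> from <seller>".
# The item is captured as "keyword" because it drives node selection.
ROOT = re.compile(r"^(?P<buyer>\w+) bought (?P<keyword>.+) from (?P<seller>\w+)$")

# Hypothetical mapping table: (condition, subordinate parse node name).
NODE_MAP = [
    (re.compile(r"rifle|pistol|shotgun"), "weapon transaction"),
    (re.compile(r"heroin|cocaine|marijuana"), "drug transaction"),
]
DEFAULT_NODE = "normal transaction"  # the default parse node


def route(log):
    """Return the name of the node that should process the log next."""
    m = ROOT.match(log)
    if m is None:
        return None  # the root parse node does not accept this log
    item = m.group("keyword")
    for condition, node_name in NODE_MAP:
        if condition.search(item):
            return node_name
    return DEFAULT_NODE  # no condition matched: fall back to the default
```

For instance, `route("Tom bought a pistol from Kate")` selects the weapon transaction node, while a log about any other item falls through to the default node.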
The concrete application flow of the scheme provided by the embodiment of the present invention is described in detail below with reference to a practical case.
Taking the parsing of Syslog as an example: in practice, the scheme can perform hierarchical parsing through a tree network composed of parse nodes. This tree network can be called a "parse tree". A schematic of one concrete parse tree structure is shown in Fig. 2. Each box in Fig. 2 represents a parse node on the parse tree; the parse node at the bottom layer of the parse tree is generally called the root parse node, and a parse node at the top layer can be called a leaf parse node. Along one transmission path of a Syslog, parse nodes are ranked by their distance from the bottom of the parse tree: the node nearer the bottom is the superior parse node of the node farther from the bottom, and the farther node is the subordinate parse node of the nearer one. Each parse node can connect to multiple subordinate parse nodes, forming the hierarchical parse tree shown in Fig. 2.
When a Syslog is parsed with the parse tree shown in Fig. 2, the Syslog first enters the tree at the root parse node and is then passed between parse nodes on the tree. Each parse node parses only part of the Syslog and, depending on the outcome, decides where the Syslog goes next (which subordinate parse node it should be passed to, or whether parsing ends).
The internal structure of an arbitrary parse node is shown in Fig. 3. The labels ① to ⑥ in Fig. 3 have the following meanings:
① is the information input to this parse node. It comprises the original Syslog received from the device and the offset to be matched, which points to the unparsed content of the Syslog (if this parse node is the root parse node, the offset can be regarded as 0). Consider the Syslog shown in Table 5 below:
Table 5:
Feb 5 18:27:38.269: %LINK-4-ERROR: FastEthernet0/4 is experiencing errors
Suppose the timestamp part (i.e. "Feb 5 18:27:38.269: ") has already been parsed by the superior parse node of this parse node. The superior parse node can then set the offset to be matched at the "%" in Table 5, so that this parse node, as its subordinate, skips the part of the log already parsed and directly parses the content after the timestamp. This avoids, as far as possible, repeated scanning and parsing of the Syslog by each parse node and greatly improves parsing efficiency.
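The role of the offset can be sketched in Python as follows. The timestamp pattern `TIMESTAMP` is an assumption made for this illustration (the patent does not give the superior node's expression), and the spacing of the sample log is reconstructed from Table 5.

```python
import re

log = "Feb 5 18:27:38.269: %LINK-4-ERROR: FastEthernet0/4 is experiencing errors"

# Hypothetical pattern of the superior parse node: it consumes the timestamp.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d+ \d{2}:\d{2}:\d{2}\.\d+: ")

offset = 0                           # the root parse node starts at offset 0
m = TIMESTAMP.match(log[offset:])    # only unparsed content is examined
offset += m.end()                    # move the offset past the parsed part

# The subordinate parse node now starts directly at the "%" character.
unparsed = log[offset:]
```

Slicing at the offset is used here so that a pattern beginning with "^" still anchors at the start of the unparsed content.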
② is the regular expression used by this parse node. It serves two main purposes in the operation of the parse node:
1. Access control: only a Syslog that matches the regular expression is accepted by this parse node for subsequent processing. The parse node can check the format and integrity of the Syslog against the regular expression, thereby excluding Syslogs that are malformed or incomplete.
2. Field extraction: named groups in the regular expression define the strings to be extracted from the original log. A named group consists of at least a field name and the regular expression for that field. For example, a named group written as "(?P<field name>regular expression)" means that the string matched by the regular expression inside the parentheses "()" is extracted as the value of the field named in "<>". As another example, the expression "^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-" contains two field names: "keyword" and "severity".
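Both purposes can be seen in a short Python sketch, since Python's `re` module uses the same `(?P<name>...)` named-group syntax. The expression is the one quoted above, with the backslash of `\d` restored; the sample inputs are illustrative.

```python
import re

PATTERN = re.compile(r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-")

# Field extraction: each named group becomes one field.
accepted = PATTERN.match("%LINK-4-ERROR: FastEthernet0/4 is experiencing errors")
fields = accepted.groupdict()

# Access control: a log that does not match is not accepted by the node.
rejected = PATTERN.match("not a syslog line")
```

Here `fields` holds the values of "keyword" and "severity", while `rejected` is `None` because the second input does not fit the expected format.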
After this parse node has parsed the Syslog with its regular expression, it sets the offset to be matched in the Syslog so that its subordinate parse nodes can skip the part already parsed. Subsequent parse nodes therefore do not need to (and cannot) go back to content before the offset, which effectively avoids repeated matching on the Syslog.
③ is the log parsing result table. It can be bound to the original Syslog and passed between the parse nodes on the parse tree together with it. After parsing its part of the Syslog, a parse node appends the extraction results of all named groups in its regular expression except "keyword" to the end of this table.
In this way, each time the Syslog passes through a parse node, that node's regular expression may extract some fields, which are recorded in the result table. Once the parse nodes have parsed the entire content of the Syslog, the result table holds the complete parsing result. One concrete structure of the result table is shown in Table 6 below:
Table 6:
Field name    Extraction result
severity      4
interface     FastEthernet0/4
...           ...
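A single parse node's work on the result table can be sketched as below. The function `parse_step` and its return convention are assumptions made for illustration; the sketch also assumes the node always moves the offset past the part it parsed.

```python
import re


def parse_step(pattern, log, offset, results):
    """Apply one node's pattern to the unparsed content.

    Returns (keyword, new_offset), or None if the node rejects the log.
    Every named group except "keyword" is appended to the result table.
    """
    m = pattern.match(log[offset:])
    if m is None:
        return None
    fields = m.groupdict()
    keyword = fields.pop("keyword", None)  # used only inside the node
    results.update(fields)                 # dicts keep insertion order
    return keyword, offset + m.end()


PATTERN = re.compile(r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-")
LOG = "Feb 5 18:27:38.269: %LINK-4-ERROR: FastEthernet0/4 is experiencing errors"

table = {}
keyword, new_offset = parse_step(PATTERN, LOG, 0, table)
```

After the call, `table` contains only the "severity" field, and `new_offset` points just past "%LINK-4-".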
④ is the extraction result obtained with the "keyword" group of the regular expression in ②. "keyword" is a field name reserved by the parse node and is used to decide how to choose a subordinate parse node. The "keyword" field is used only inside the parse node; its extraction result is not output to the result table ③ and serves only as input to ⑤ below.
⑤ is the subordinate parse node selector. Based on the extraction result of the "keyword" field (or, depending on the case, without it) and on the content of the subordinate parse node mapping table ⑥, it decides which subordinate parse node should process the Syslog next. In the embodiment of the present invention, subordinate parse node selectors can include, but are not limited to, the following types:
1. Exhaustive selector
This type of selector does not need the extraction result of the "keyword" field when selecting a subordinate parse node.
The exhaustive selector of a parse node works as follows: it tries to input the original Syslog and the offset to be matched to each subordinate parse node of this parse node in turn, until one subordinate parse node accepts the Syslog or all subordinate parse nodes have been tried.
The exhaustive selector suits cases where the Syslogs to be parsed differ greatly in content and the Syslog type is hard to determine by extracting some characteristic field, such as the two logs shown in Tables 7 and 8 below:
Table 7:
[Syslog example; the original image is not reproduced in this text.]
Table 8:
[Syslog example; the original image is not reproduced in this text.]
The two Syslogs shown in Tables 7 and 8 clearly differ greatly in format, and their log types cannot be determined simply from a fixed field. Trying them with an exhaustive selector is the easier approach here.
In practice, however, the exhaustive selector is rarely used, because in most cases it can be replaced by the better-performing regular-expression (regex) selector or string selector, which are introduced below.
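A minimal Python sketch of the exhaustive mechanism follows. The child node names, their patterns and the sample logs are hypothetical; acceptance by a child is modeled simply as its own pattern matching the unparsed content.

```python
import re

# Hypothetical subordinate parse nodes: (name, the node's own pattern).
CHILDREN = [
    ("login_node", re.compile(r"User \w+ logged in")),
    ("link_node", re.compile(r"Interface \S+ changed state")),
]


def exhaustive_select(children, log, offset):
    """Offer the log to each child in turn; return the first that accepts."""
    for name, pattern in children:
        if pattern.match(log[offset:]):
            return name
    return None  # every child was tried and none accepted the log
```

Each attempt runs a child's full pattern against the log, which is the cost that the other selector types aim to avoid.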
2. Regex selector
This type of selector checks the extraction result of "keyword" against a regular expression associated with each subordinate parse node. For example, the regex selector can try to match the extraction result of the "keyword" field against the regular expressions of all subordinate parse nodes in turn and select the first parse node that matches.
The regex selector mainly suits cases where the extraction result of "keyword" contains variable content. Suppose the extraction result of "keyword" from a log is: logtype=16 timestamp=123321323 subtype="nat", where the value of timestamp differs in every log. A regular expression such as "logtype=16.*?subtype="nat"" ignores the variable content, and the parse node whose regular expression matches both "logtype=16" and "subtype="nat"" is selected (if several parse nodes match, one of them is chosen).
The regex selector bears some similarity to the exhaustive selector: it still has to try several regular expressions against the log. Two points should be noted, however. First, the regex selector only needs to match against the extraction result of the "keyword" field, so it has far less content to match than the exhaustive selector. Second, the regular expressions used for selecting a subordinate parse node can be very simple: they only have to distinguish the Syslog formats that the different subordinate parse nodes can handle, and need no complete access-check or field-extraction capability. In practice, the time needed to match a Syslog with a simple regular expression and with a complex one may differ by a factor of several hundred. Therefore, although the regex selector still involves repeated matching attempts, it strictly limits the matching range and uses very simple regular expressions, so its performance cost remains far below that of the exhaustive selector.
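The logtype example can be sketched in Python as follows. The node names `nat_node` and `other_node`, and the second condition, are hypothetical; only the first expression comes from the text.

```python
import re

# Mapping table for a regex selector: (condition, subordinate node name).
NODE_MAP = [
    (re.compile(r'logtype=16.*?subtype="nat"'), "nat_node"),
    (re.compile(r"logtype=17"), "other_node"),
]


def regex_select(node_map, keyword):
    """Return the first node whose condition matches the keyword result."""
    for condition, node_name in node_map:
        if condition.search(keyword):
            return node_name
    return None


chosen = regex_select(NODE_MAP, 'logtype=16 timestamp=123321323 subtype="nat"')
```

Only the short "keyword" extraction result is matched here, not the whole log, and a different timestamp value leads to the same choice.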
3. String selector
This type of selector requires each subordinate parse node to correspond to a unique string. The string selector selects, as the subordinate parse node that processes the log next, the parse node whose string is identical to the extraction result of "keyword".
Functionally, the regex selector covers everything the string selector can do. However, a string selector can be implemented with mechanisms such as hash tables and thereby achieve much higher performance than a regex selector.
The string selector mainly suits cases where the "keyword" extraction result contains no variable content.
The mechanism of the string selector resembles the keyword method described in the background section, but it is far more flexible. The keyword method presupposes that a common header determines the format of the whole log, whereas the string selector only requires that "keyword" be able to distinguish which subordinate parse node to select. The choice of "keyword" is therefore very free. For example, suppose the current parse node has only two subordinate parse nodes, which parse the two Syslogs shown in Tables 9 and 10 below, respectively:
Table 9:
[Syslog example; the original image is not reproduced in this text.]
Table 10:
[Syslog example; the original image is not reproduced in this text.]
In this case, examining the first character of the Syslog (i.e. extracting the first character as the content of the "keyword" field) is enough to decide which subordinate parse node to select. Thus, even if the original Syslog has no fixed common header (i.e. no common "keyword" field), the string selector can still choose "keyword" flexibly and select the subordinate parse node efficiently.
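A string selector backed by a hash table can be sketched with a Python dict. The keys and node names below are hypothetical examples, not values from the patent.

```python
# Mapping table for a string selector: each fixed string is unique because
# it is a dict key, and lookup takes constant time on average.
NODE_MAP = {
    "LINK": "link_node",
    "SYS": "sys_node",
}


def string_select(node_map, keyword, default=None):
    """Return the node whose fixed string equals the keyword result."""
    return node_map.get(keyword, default)
```

Unlike the regex selector, no pattern is tried at selection time: the "keyword" extraction result is used directly as the lookup key.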
Note that the three selectors above are merely the selectors available in one concrete embodiment. New selector types can be added as needed. Because each parse node selects its subordinate parse node independently, adding a new selector type is easy: adding a selector to a parse node, or changing its selector type, does not affect how other parse nodes select their subordinate parse nodes. The way of choosing subordinate parse nodes provided by the embodiment of the present invention is therefore highly extensible.
⑥ is the subordinate parse node mapping table. The mapping table stored in any parse node holds information about that node's subordinate parse nodes. One concrete structure of the mapping table is shown in Table 11 below:
Table 11:
[Subordinate parse node mapping table with the columns "parse node condition" and "subordinate parse node name"; the original image is not reproduced in this text.]
The "subordinate parse node name" in Table 11 uniquely locates the corresponding subordinate parse node. Depending on the implementation, it can be the unique identifier of the subordinate parse node (usually its name), a pointer to the subordinate parse node's data structure, or a reference to the subordinate parse node object. In a concrete implementation, any characteristic information that can locate the subordinate parse node may be used.
The "parse node condition" in Table 11 is determined by the type of the subordinate parse node selector, and includes, but is not limited to, the following cases:
When the subordinate parse node selector is an exhaustive selector, the corresponding "parse node condition" is empty, and nothing needs to be stored in that part of Table 11. Because the exhaustive selector relies on each subordinate parse node to judge for itself whether it can process the log, no parse node condition is needed.
When the subordinate parse node selector is a regex selector, the corresponding "parse node condition" is a regular expression to be matched against the extraction result of "keyword". If the extraction result matches the regular expression, the log can be passed to the parse node indicated by the "subordinate parse node name" corresponding to this "parse node condition" for subsequent processing.
When the subordinate parse node selector is a string selector, the corresponding "parse node condition" is a fixed string, which must be unique within the mapping table shown in Table 11. If the extraction result of "keyword" is identical to the string stored as the "parse node condition" of some subordinate parse node, the log can be passed to that subordinate parse node for subsequent processing.
Note that the last row of the mapping table shown in Table 11 stores the information of the "default parse node". When the subordinate parse node selector of a parse node cannot select a suitable subordinate parse node name from the names listed before the default subordinate parse node name in Table 11 (that is, when none of the parse nodes indicated by those names can process the log further), the selector chooses the default subordinate parse node name (which, in general, can be preset to empty). If the chosen subordinate parse node name is empty (or is another agreed marker indicating that the parse node has no further subordinate parse node), the log cannot be parsed further by any subordinate parse node, so its transmission on the parse tree ends here and parsing stops.
It should be stressed that each parse node generally contains a subordinate parse node selector of only one type. In practice, however, the "default parse node" information in the mapping table can be used to combine selectors of different types into a mixed-type selector. Fig. 4 shows how such a mixed-type selector is realized: it combines a string selector with an exhaustive selector by way of the "default parse node" information in the mapping table. The mechanism is as follows: a first parse node containing a string selector first performs string-based selection and passes the log to the corresponding subordinate parse node. Logs that none of the first parse node's subordinate parse nodes can process are all passed to a default parse node containing an exhaustive selector, which then uses its exhaustive selector to select one of its own subordinate parse nodes to parse the log. Other mixed-type selectors can be realized in the same way as in Fig. 4.
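The mixed-type arrangement can be sketched in Python as below. All node names and patterns are hypothetical; the sketch collapses the first parse node and its default parse node into one function purely to show the order of the two selection stages.

```python
import re

# First parse node: string selector over fixed keywords.
STRING_MAP = {"LINK": "link_node"}

# Default parse node: exhaustive selector over its own subordinate nodes.
DEFAULT_CHILDREN = [
    ("login_node", re.compile(r"User \w+ logged in")),
    ("reboot_node", re.compile(r"System rebooted")),
]


def mixed_select(keyword, unparsed):
    """String selection first; otherwise exhaustive trial under the default."""
    node = STRING_MAP.get(keyword)
    if node is not None:
        return node
    for name, pattern in DEFAULT_CHILDREN:
        if pattern.match(unparsed):
            return name
    return None  # nothing can process the log: parsing stops
```

Logs with a known keyword take the fast dict lookup, and only the remainder pays for the exhaustive trial.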
Throughout the parsing of a log, the embodiment of the present invention performs no write operation on the data held by the parse tree and its parse nodes, so the scheme readily supports concurrent multi-threaded processing. That is, the same parse tree can be used directly by any number of log-parsing threads without protection such as read/write mutual exclusion, so performance can be raised simply by increasing the number of processors and threads.
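The lock-free sharing described here can be sketched in Python with a thread pool. The sketch only illustrates that a read-only pattern can be shared without locks; in CPython the global interpreter lock limits CPU-bound speed-up from threads, so it does not demonstrate the performance gain itself. The function `parse` and the sample logs are illustrative.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Shared, read-only "parse tree" (a single compiled pattern in this sketch).
ROOT_PATTERN = re.compile(r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-")


def parse(log):
    """Each call builds its own result; the shared pattern is never written."""
    m = ROOT_PATTERN.match(log)
    return m.groupdict() if m else None


LOGS = [
    "%LINK-4-ERROR: FastEthernet0/4 is experiencing errors",
    "not a syslog line",
]

with ThreadPoolExecutor(max_workers=4) as pool:
    parsed = list(pool.map(parse, LOGS))  # results keep the input order
```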
In a concrete implementation of the scheme, the parse tree and parse nodes can be constructed in any form a computer can understand: for example, they can be coded directly in a programming language and built into the software, or imported from an external configuration file. For flexibility, the embodiment of the present invention optionally defines the parse tree and parse nodes in a configuration file based on the Extensible Markup Language (XML). A log parsing node device that implements the scheme can then have its parse tree modified or extended by editing the configuration file, without any change to the software code.
The concrete configuration file format of the parse tree is shown in Table 12 below:
Table 12:
<ParseTree model="RGOS" vendor="Ruijie" version="1.0">......</ParseTree>
The code in Table 12 defines a parse tree (ParseTree) for parsing the Syslogs generated by a specified device (here, a device is anything that can generate Syslogs, including hardware such as switches and routers and software such as server operating systems and application services). In the code, model is the model of the device, vendor is its manufacturer, and version is its version number. The triple of model, vendor and version uniquely identifies a kind of device.
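Reading the three identifying attributes can be sketched with Python's standard `xml.etree.ElementTree` module. The elided body of the element is left empty here, and the name `device_key` is introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

CONFIG = '<ParseTree model="RGOS" vendor="Ruijie" version="1.0"></ParseTree>'

root = ET.fromstring(CONFIG)

# The (model, vendor, version) triple identifies the kind of device, so it
# can serve as a key when several parse trees are loaded side by side.
device_key = (root.get("model"), root.get("vendor"), root.get("version"))
```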
A parse tree can contain multiple parse nodes; the relationship between parse nodes is expressed by nesting. Table 13 below defines a root parse node with two subordinate parse nodes:
Table 13:
[XML listing of a root parse node with two subordinate parse nodes; the original image is not reproduced in this text.]
The important parameters in the code shown in Table 13 have the following meanings:
action determines whether the parse node changes the offset to be matched of the original Syslog. Its value can be "strip" (the offset is moved, so the part of the Syslog parsed by the current parse node cannot be accessed by subsequent parse nodes) or "match" (the offset is not modified, so the part parsed by the current parse node remains accessible to subsequent parse nodes). The default value of action is usually "match".
The following concrete example illustrates how action is set:
Suppose there are two log formats. One is <buyer> bought <sth> from <seller>, e.g. Tom bought a house from Kate; the other is <seller> sold <sth> to <buyer>, e.g. Tom sold a car to Bill.
Suppose the parse tree must parse both kinds of log and extract buyer, sth and seller from them. Clearly, until the verb in the middle is known to be sold or bought, it is impossible to tell whether the first name is the buyer or the seller. The parse nodes can therefore be constructed like this:
[XML listing of the parse nodes for the bought/sold example; the original image is not reproduced in this text.]
The first parse node to parse the log must be one whose action is set to match; its only task is to determine whether the log contains the verb "sold" or the verb "bought". The action of this first parse node cannot be set to strip, because the name that precedes the keyword bought/sold still has to be parsed and extracted by a subordinate node of the first parse node.
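The difference between the two action values can be sketched in Python for the bought/sold example. The patterns `VERB` and `CHILDREN` and the function `parse` are assumptions made for illustration, since the original node listing is not available in this text.

```python
import re

# First node: only identifies the verb (captured as "keyword").
VERB = re.compile(r"^\w+ (?P<keyword>bought|sold) ")

# Subordinate nodes, chosen by the verb; each parses the whole sentence.
CHILDREN = {
    "bought": re.compile(r"^(?P<buyer>\w+) bought (?P<sth>.+) from (?P<seller>\w+)$"),
    "sold": re.compile(r"^(?P<seller>\w+) sold (?P<sth>.+) to (?P<buyer>\w+)$"),
}


def parse(log, action="match"):
    offset = 0
    m = VERB.match(log[offset:])
    if m is None:
        return None
    if action == "strip":
        offset += m.end()  # the parsed part becomes inaccessible
    # With action="match" the offset is left unchanged.
    child = CHILDREN[m.group("keyword")].match(log[offset:])
    return child.groupdict() if child else None
```

With `action="match"` the subordinate node still sees the leading name and extracts all three fields; with `action="strip"` the name has been skipped and the subordinate pattern fails.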
The meanings of the parameters other than action are as follows:
pattern defines the regular expression of the parse node, which can contain named groups defined with "(?P<field name>regular expression)". Note in particular that, according to the XML standard, every "<" must be written as "&lt;" and every ">" as "&gt;". Thus the regular expression containing the two named groups "keyword" and "severity", ^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-, takes the following form in the XML configuration file: ^.*?(?P&lt;keyword&gt;[A-Z0-9_]+)-(?P&lt;severity&gt;\d)-. This is a consequence of the XML standard.
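The escaping round trip can be checked with Python's standard library. The element name `ParseNode`, and the choice to store pattern as an attribute, are assumptions for this sketch, because the listing in Table 13 is not available in this text.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-"

# Writing the configuration: "<" and ">" become "&lt;" and "&gt;".
escaped = escape(raw)

# Reading the configuration: the XML parser restores the original text.
node = ET.fromstring('<ParseNode pattern="%s"/>' % escaped)
restored = node.get("pattern")

pattern = re.compile(restored)  # usable as a normal regular expression
```

The escaped form is needed only inside the file; the program that loads the configuration sees the original expression again.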
name defines the name of the parse node. Among sibling parse nodes, each name is unique. A subordinate parse node name needs to be defined only when a regex selector or string selector is used; otherwise the node can be anonymous (no name attribute).
event defines the event type corresponding to the parse node. When a log finishes its transmission on the parse tree, it is given the event type of the last parse node that accepted it, which uniquely identifies the meaning of the log. A parse node without an event attribute is generally regarded as an intermediate parse node, and parsing of a log should not end at such a node. If parsing does end at a parse node without an event attribute, the log is probably malformed or of a kind the tree cannot parse, and it can be discarded as an illegal log.
Keyword defines the subordinate parse node mapping table of the parse node, where matchby defines the type of the subordinate parse node selector; its value can be "null" (exhaustive selector), "string" (string selector) or "regex" (regex selector).
NodeMap is an entry of the subordinate parse node mapping table, where: condition is the parse node condition, which, depending on the selector type, can be empty, a regular expression or a string. When a string selector is used, pattern can be the variable "keyword", meaning that its value is the extraction result of the "keyword" field of the parse node's regular expression. node is the subordinate parse node name. Its value can be the variable "keyword", meaning that the subordinate parse node name is the extraction result of the "keyword" field of the parse node's regular expression. This design avoids defining a large number of mapping table entries, because in practice most subordinate parse nodes are named after the extraction result of the "keyword" field. The default attribute marks the default parse node in the mapping table. At most one NodeMap in the table may have default set to "true".
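The constraints on mapping table entries can be sketched in Python as a small validation step. The class `NodeMap` mirrors the attribute names mentioned above, but its exact shape and the function `validate` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeMap:
    node: str                        # subordinate parse node name
    condition: Optional[str] = None  # empty, a regex, or a fixed string
    default: bool = False            # marks the default parse node


def validate(entries: List[NodeMap], matchby: str) -> None:
    """Raise ValueError if the mapping table breaks the stated rules."""
    if matchby not in ("null", "string", "regex"):
        raise ValueError("unknown selector type: %r" % matchby)
    if sum(1 for e in entries if e.default) > 1:
        raise ValueError("at most one NodeMap may be the default")
    if matchby == "string":
        conditions = [e.condition for e in entries if not e.default]
        if len(conditions) != len(set(conditions)):
            raise ValueError("string conditions must be unique")
```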
With the parse node configuration code shown in Table 13, a complete parse tree can be defined. The code in Table 14 below defines a complete parse tree:
Table 14:
[XML listing of a complete parse tree; the original image is not reproduced in this text.]
Based on the above, the working of the parse tree defined by the configuration file shown in Table 14 is illustrated below by parsing one Syslog. The flow of the parsing process is shown in Fig. 5.
Step 51: the Syslog is first passed to the root parse node A of the parse tree.
In the embodiment of the present invention, suppose the content of this Syslog is as shown in Table 15 below.
Table 15:
[Syslog to be parsed; the original image is not reproduced in this text.]
Step 52: root parse node A matches the Syslog in Table 15 against its own regular expression and extracts from the Syslog the fields that match the regular expression.
In this embodiment, suppose the regular expression of the root parse node is:
"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-". With this regular expression, "keyword" is extracted from Table 15 as "LINK" and "severity" as "5".
Step 53: root parse node A outputs the extraction results other than "keyword" to the result table shown in Table 6, giving the result shown in Table 16 below:
Table 16:
Field name    Extraction result
severity      5
Step 54: because the action of root parse node A is set to "strip" (the offset to be matched is moved, so the part of the Syslog parsed by the current parse node cannot be accessed by subsequent parse nodes), root parse node A, after parsing the Syslog, modifies the offset to be matched according to the position in the whole Syslog of the part it parsed, so that the offset points to the end of that part. The content left for the subordinate parse nodes of root parse node A to match is then as shown in Table 17 below:
Table 17:
CHANGED: Interface Serial3/0.1/2/4/2:0 changed state to administratively down
Step 55: suppose the subordinate parse node selector of root parse node A is a string selector (matchby="string"). This string selector uses the value of the keyword extracted in step 52, namely keyword="LINK", to look up the subordinate parse node mapping table, and selects subordinate parse node B, whose parse node condition is the fixed string "LINK";
Step 56: root parse node A sends the Syslog and the modified offset to be matched to subordinate parse node B;
Step 57: parse node B uses the modified offset to be matched to locate the content shown in Table 17 within the Syslog, and parses it;
Suppose the regular expression used by parse node B is "^(?P<keyword>[A-Z0-9_]+):\s*". Matching the Syslog content shown in Table 17 against this regular expression yields "keyword" = "CHANGED".
Note that the regular expression used by parse node B contains no named group other than "keyword", so parse node B has no extraction result to output.
Step 58: the action of parse node B is also set to "strip". Therefore, after it finishes parsing the Syslog, parse node B modifies the offset to be matched according to the position, within the whole Syslog, of the part it parsed, so that the offset points to the end of that part.
The content left for the subordinate parse nodes of parse node B to match then becomes as shown in Table 18 below:
Table 18:
Interface Serial3/0.1/2/4/2:0 changed state to administratively down
Step 59: suppose the subordinate parse node selector of parse node B is also a string selector (matchby="string"). This string selector uses the value of the keyword extracted in step 57, namely keyword="CHANGED", to look up the subordinate parse node mapping table in parse node B, and selects subordinate parse node C, whose parse node condition is the fixed string "CHANGED";
Step 510: parse node B sends the Syslog and the offset to be matched, as modified by parse node B, to parse node C;
Step 511: parse node C uses the modified offset to be matched received from parse node B to locate the content shown in Table 18 within the Syslog, and parses it;
Suppose the regular expression used by parse node C is "^Interface (?P<intf>[A-Za-z]+\s?[0-9][0-9/.:]+) changed state to (?P<stat>[\w]+)". From the content shown in Table 18, the field matching "intf" is extracted as "Serial3/0.1/2/4/2:0" and the field matching "stat" as "down", and both are output to the parse result table.
Step 512: suppose parse node C is a leaf parse node of the parse tree (a parse node that has no subordinate parse node is called a leaf parse node). The matching of the Syslog along the parse tree therefore ends at parse node C. The event type information set when parse node C was configured (the event type defined by "event" in Table 13) can then be determined as the event type information of this Syslog. For example, if the event type defined by "event" is "LINK_CHANGED", the event type of this Syslog is "LINK_CHANGED". Further, the field information related to the event type of the Syslog is obtained from the parse result table recorded while the Syslog was being parsed, shown in Table 19 below, which achieves the purpose of parsing the Syslog.
Table 19:
Field name    Extraction result
severity      5
intf          Serial3/0.1/2/4/2:0
stat          down
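Steps 51 to 512 can be put together in a short Python sketch. Several things here are assumptions rather than part of the source: the `ParseNode` class and `parse` function, the dictionary used as the string selector's mapping table, and the sample Syslog line (Table 15 is not reproduced in this passage). Node C's pattern also carries an optional `administratively ` prefix, added so that `stat` captures `down` as in Table 19; without it, `\w+` would stop at the first word after `changed state to`.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class ParseNode:
    pattern: str                      # the node's regular expression
    strip: bool = True                # "strip" action: advance the offset
    children: Dict[str, "ParseNode"] = field(default_factory=dict)  # string selector table
    event: Optional[str] = None       # event type, configured on leaf nodes

def parse(root: ParseNode, log: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Walk the tree from the root; return (event type, parse result table)."""
    node, offset, table = root, 0, {}
    while True:
        match = re.match(node.pattern, log[offset:])
        if match is None:
            return None, table                 # this node cannot parse the content
        groups = match.groupdict()
        keyword = groups.pop("keyword", None)  # keyword only drives child selection
        table.update(groups)                   # other named groups are output
        if node.strip:
            offset += match.end()              # hide the parsed part from children
        if not node.children:
            return node.event, table           # leaf: its event type is the result
        node = node.children.get(keyword)
        if node is None:
            return None, table                 # no subordinate node for this keyword

node_c = ParseNode(
    r"^Interface (?P<intf>[A-Za-z]+\s?[0-9][0-9/.:]+) changed state to "
    r"(?:administratively )?(?P<stat>\w+)",    # optional prefix is an assumption
    event="LINK_CHANGED",
)
node_b = ParseNode(r"^(?P<keyword>[A-Z0-9_]+):\s*", children={"CHANGED": node_c})
node_a = ParseNode(r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-", children={"LINK": node_b})

# Hypothetical Syslog line (assumed for illustration).
log = "%LINK-5-CHANGED: Interface Serial3/0.1/2/4/2:0 changed state to administratively down"
event, table = parse(node_a, log)
print(event, table)
# LINK_CHANGED {'severity': '5', 'intf': 'Serial3/0.1/2/4/2:0', 'stat': 'down'}
```

Each node sees only the slice of the log beyond the current offset, so no node re-reads what an ancestor has already consumed.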
Corresponding to the log parsing method provided by the embodiment of the present invention, the embodiment of the present invention also provides a log parse node device. Its structure is shown schematically in Figure 6, and it comprises the following functional units:
A first obtaining unit 61, configured to obtain a log and an offset to be matched that indicates the unparsed content in the log;
A second obtaining unit 62, configured to use a stored first regular expression to parse the unparsed content, in the log obtained by the first obtaining unit 61, that is indicated by the offset to be matched obtained by the first obtaining unit 61, and to obtain the field information matched by the first regular expression;
A judging unit 63, configured to judge whether a subordinate parse node device exists;
An event type information determining unit 64, configured to, when the judgment result obtained by the judging unit 63 is no and the log parse node device shown in Figure 6 has pre-stored event type information, determine that event type information as the event type information of the event recorded in the log, wherein the event type information is determined according to the field information that can be parsed from the log by the regular expressions stored in at least one log parse node device on the path along which the log is delivered to this log parse node device.
In an optional embodiment, the log parse node device shown in Figure 6 may further comprise:
An offset determining unit, configured to, when the judgment result obtained by the judging unit 63 is yes, determine the offset to be matched according to an offset determination mode;
A providing unit, configured to provide the log obtained by the first obtaining unit 61 and the offset to be matched determined by the offset determining unit to one subordinate parse node device that can parse the unparsed content in the log according to a specified parsing mode, wherein the offset determination mode is set according to the format of the log and the predetermined field information that needs to be determined from the log.
Optionally, in the embodiment of the present invention, the providing unit may specifically be configured to provide the log and the determined offset to be matched to each subordinate parse node device in turn, one subordinate parse node device at a time, and to stop providing them once an acknowledgment message sent by one subordinate parse node device is received;
wherein the acknowledgment message is sent by that subordinate parse node device after it determines, according to the second regular expression it stores, that it can parse the unparsed content in the log according to the specified parsing mode.
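The one-at-a-time provision with acknowledgment can be sketched as a loop over the subordinate nodes. This is an illustrative model only: each subordinate node is represented by a hypothetical name and its second regular expression, and the acknowledgment is modelled as a boolean return value rather than a message exchanged between devices.

```python
import re
from typing import List, Optional, Tuple

def can_parse(second_regex: str, log: str, offset: int) -> bool:
    """Models the acknowledgment: True if the child's regex matches the unparsed content."""
    return re.match(second_regex, log[offset:]) is not None

def offer_in_turn(
    children: List[Tuple[str, str]], log: str, offset: int
) -> Tuple[Optional[str], int]:
    """Offer (log, offset) to one child at a time; stop at the first acknowledgment.

    Returns the chosen child's name (or None) and the number of attempts made.
    """
    attempts = 0
    for name, second_regex in children:
        attempts += 1
        if can_parse(second_regex, log, offset):
            return name, attempts
    return None, attempts

# Hypothetical children and log content (assumed for illustration).
children = [("ospf", r"^OSPF"), ("link", r"^CHANGED:"), ("sys", r"^RESTART:")]
log = "%LINK-5-CHANGED: Interface Serial3/0 changed state to down"
chosen, attempts = offer_in_turn(children, log, offset=8)
print(chosen, attempts)  # link 2
```

Because the loop visits each subordinate node at most once, the number of attempts cannot exceed the number of subordinate nodes.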
In addition, depending on how the function of the providing unit is implemented, the structure of the providing unit may be divided in either of the following two ways.
In the first division, the providing unit is divided into a key field obtaining module, a key field information parsing module, a regular expression determining module, a choosing module, and a providing module, whose functions are as follows:
Key field obtaining module, configured to obtain the key field in the first regular expression;
Key field information parsing module, configured to parse key field information from the log according to the key field obtained by the key field obtaining module;
Regular expression determining module, configured to determine, from the pre-stored second regular expressions used respectively by the subordinate parse node devices, the second regular expression that matches the key field information parsed by the key field information parsing module;
Choosing module, configured to choose one subordinate parse node device from the subordinate parse node devices that use the second regular expression determined by the regular expression determining module;
Providing module, configured to provide the log and the determined offset to be matched to the subordinate parse node device chosen by the choosing module.
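In this division the selection is driven by the key field information rather than by offering the whole log to each subordinate node. A minimal sketch, assuming the key field is the named group `keyword` and that the stored second regular expressions are kept in a list alongside hypothetical child names:

```python
import re
from typing import List, Optional, Tuple

FIRST_REGEX = re.compile(r"^.*?(?P<keyword>[A-Z0-9_]+)-(?P<severity>\d)-")

# Assumed store: (second regular expression, subordinate parse node name).
SECOND_REGEXES: List[Tuple[str, str]] = [
    (r"^OSPF|^BGP", "routing_node"),
    (r"^LINK|^LINEPROTO", "interface_node"),
]

def choose_by_regex(log: str) -> Optional[str]:
    # Key field information parsing: extract the value of the key field from the log.
    match = FIRST_REGEX.match(log)
    if match is None:
        return None
    key_info = match.group("keyword")
    # Regular expression determination: find a second regex matching the key field information.
    for second_regex, child in SECOND_REGEXES:
        if re.match(second_regex, key_info):
            return child  # choosing: take the first child that uses that regex
    return None

print(choose_by_regex("%LINK-5-CHANGED: Interface Serial3/0 changed state to down"))
# interface_node
```

Matching against the short key field value instead of the full unparsed content keeps each selection attempt cheap.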
In the second division, the providing unit is divided into a key field obtaining module, a key field information parsing module, a determining module, a choosing module, and a providing module, whose main functions are as follows:
Key field obtaining module, configured to obtain the key field in the first regular expression;
Key field information parsing module, configured to parse key field information from the log according to the key field obtained by the key field obtaining module;
Determining module, configured to determine, from the field information allocated in advance to each subordinate parse node device, the field information that matches the key field information parsed by the key field information parsing module;
Choosing module, configured to choose one subordinate parse node device from the subordinate parse node devices corresponding to the field information determined by the determining module;
Providing module, configured to provide the log and the determined offset to be matched to the subordinate parse node device chosen by the choosing module.
Optionally, the providing unit may also be configured to provide the log and the determined offset to be matched to a pre-designated default parse node when no subordinate parse node device exists that can parse the unparsed content in the log according to the specified parsing mode.
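Selection by pre-allocated field information, with the default parse node as a fallback, reduces to a dictionary lookup with a default value. The mapping contents and node names below are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict

# Assumed mapping from pre-allocated field information to subordinate node names.
CHILDREN: Dict[str, str] = {"LINK": "interface_node", "OSPF": "routing_node"}
DEFAULT_NODE = "default_node"  # hypothetical pre-designated default parse node

def choose_child(key_info: str) -> str:
    """Pick the child allocated to key_info, falling back to the default node."""
    return CHILDREN.get(key_info, DEFAULT_NODE)

print(choose_child("LINK"))     # interface_node
print(choose_child("UNKNOWN"))  # default_node
```

With a fallback in place, a log whose keyword is not in the mapping is still handed to some node rather than dropped.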
In the log parsing scheme provided by the embodiment of the present invention, each parse node, after completing the parsing of the log content it is responsible for, can redefine the offset to be matched so as to "cut off" the part that has been parsed, which marks that content as already parsed. Subordinate parse nodes therefore do not need to parse the already parsed log content again, which avoids meaningless repeated parsing and improves parsing efficiency.
The scheme provided by the embodiment of the present invention works as a heuristic mechanism of level-by-level refinement: except for the leaf parse nodes at the ends of the parse tree, no parse node knows the event type information of the event recorded in the log while it is parsing the log. However, each level of parse node can use its built-in subordinate parse node selection mechanism, together with the log feature information (keyword) obtained so far, to pass the log to a subordinate parse node that can parse it further. By the time the event type information of the log is finally determined, the field contents that need to be extracted have also been extracted by the parse nodes at each level. This mechanism is also the key to the high log processing performance of the scheme: it reduces to a minimum the trial and guessing needed to determine the log type. Even in the few extreme cases where an exhaustive selector or a regular-expression selector has to try guessing at part of the log content, the number of attempts will, in the worst case, not exceed the number of subordinate parse nodes of that parse node.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (4)

1. A log parsing method, characterized by comprising:
a parse node obtaining a log and an offset to be matched that indicates the unparsed content in the log; and
parsing, using a stored first regular expression, the unparsed content indicated by the offset to be matched, to obtain the field information matched by the first regular expression; and
judging whether a subordinate parse node exists;
when the judgment result is no and the parse node has pre-stored event type information, the parse node determining the event type information as the event type information of the event recorded in the log, wherein the event type information is determined according to the field information that can be parsed from the log by the regular expressions stored in at least one parse node on the path along which the log is delivered to the parse node;
when the judgment result is yes, the parse node determining the offset to be matched according to an offset determination mode, and providing the log and the determined offset to be matched to one subordinate parse node that can parse the unparsed content in the log according to a specified parsing mode, wherein the offset determination mode is set according to the format of the log and the predetermined field information that needs to be determined from the log;
wherein the parse node providing the log and the determined offset to be matched to one subordinate parse node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises:
providing the log and the determined offset to be matched to each subordinate parse node in turn, one subordinate parse node at a time, and stopping providing the log and the determined offset to be matched once an acknowledgment message sent by one subordinate parse node is received;
wherein the acknowledgment message is sent by that subordinate parse node after it determines, according to the second regular expression it stores, that it can parse the unparsed content in the log; or
the parse node providing the log and the determined offset to be matched to one subordinate parse node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises:
the parse node obtaining the key field in the first regular expression, and parsing key field information from the log according to the obtained key field;
the parse node determining, from the pre-stored second regular expressions used respectively by the subordinate parse nodes, the second regular expression that matches the key field information; and
the parse node choosing one subordinate parse node from the subordinate parse nodes that use the second regular expression matching the key field information, and providing the log and the determined offset to be matched to the chosen subordinate parse node; or
the parse node providing the log and the determined offset to be matched to one subordinate parse node that can parse the unparsed content in the log according to the specified parsing mode specifically comprises:
the parse node obtaining the key field in the first regular expression, and parsing key field information from the log according to the obtained key field;
the parse node determining, from the field information allocated in advance to each subordinate parse node, the field information that matches the key field information; and
the parse node choosing one subordinate parse node from the subordinate parse nodes corresponding to the determined field information, and providing the log and the determined offset to be matched to the chosen subordinate parse node.
2. the method for claim 1, is characterized in that, also comprises:
Can be when specifying analysis mode to resolve not resolve in daily record the subordinate parse node of content when not existing, described parse node offers preassigned acquiescence parse node by described daily record and described definite skew to be matched.
3. A log parse node device, characterized by comprising:
a first obtaining unit, configured to obtain a log and an offset to be matched that indicates the unparsed content in the log;
a second obtaining unit, configured to use a stored first regular expression to parse the unparsed content, in the log obtained by the first obtaining unit, that is indicated by the offset to be matched obtained by the first obtaining unit, and to obtain the field information matched by the first regular expression;
a judging unit, configured to judge whether a subordinate parse node device exists;
an event type information determining unit, configured to, when the judgment result obtained by the judging unit is no and the log parse node device has pre-stored event type information, determine the event type information as the event type information of the event recorded in the log, wherein the event type information is determined according to the field information that can be parsed from the log by the regular expressions stored in at least one log parse node device on the path along which the log is delivered to the log parse node device;
an offset determining unit, configured to, when the judgment result obtained by the judging unit is yes, determine the offset to be matched according to an offset determination mode;
a providing unit, configured to provide the log obtained by the first obtaining unit and the offset to be matched determined by the offset determining unit to one subordinate parse node device that can parse the unparsed content in the log according to a specified parsing mode, wherein the offset determination mode is set according to the format of the log and the predetermined field information that needs to be determined from the log;
wherein the providing unit is specifically configured to:
provide the log and the determined offset to be matched to each subordinate parse node device in turn, one subordinate parse node device at a time, and stop providing the log and the determined offset to be matched once an acknowledgment message sent by one subordinate parse node device is received;
wherein the acknowledgment message is sent by that subordinate parse node device after it determines, according to the second regular expression it stores, that it can parse the unparsed content in the log; or
the providing unit specifically comprises:
a key field obtaining module, configured to obtain the key field in the first regular expression;
a key field information parsing module, configured to parse key field information from the log according to the key field obtained by the key field obtaining module;
a regular expression determining module, configured to determine, from the pre-stored second regular expressions used respectively by the subordinate parse node devices, the second regular expression that matches the key field information parsed by the key field information parsing module;
a choosing module, configured to choose one subordinate parse node device from the subordinate parse node devices that use the second regular expression determined by the regular expression determining module;
a providing module, configured to provide the log and the determined offset to be matched to the subordinate parse node device chosen by the choosing module; or
the providing unit specifically comprises:
a key field obtaining module, configured to obtain the key field in the first regular expression;
a key field information parsing module, configured to parse key field information from the log according to the key field obtained by the key field obtaining module;
a determining module, configured to determine, from the field information allocated in advance to each subordinate parse node device, the field information that matches the key field information parsed by the key field information parsing module;
a choosing module, configured to choose one subordinate parse node device from the subordinate parse node devices corresponding to the field information determined by the determining module;
a providing module, configured to provide the log and the determined offset to be matched to the subordinate parse node device chosen by the choosing module.
4. The device of claim 3, characterized in that the providing unit is further configured to provide the log and the determined offset to be matched to a pre-designated default parse node when no subordinate parse node device exists that can parse the unparsed content in the log according to the specified parsing mode.
CN201110125560.6A 2011-05-16 2011-05-16 Log parsing method and log parsing node device Active CN102164050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110125560.6A CN102164050B (en) 2011-05-16 2011-05-16 Log parsing method and log parsing node device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110125560.6A CN102164050B (en) 2011-05-16 2011-05-16 Log parsing method and log parsing node device

Publications (2)

Publication Number Publication Date
CN102164050A CN102164050A (en) 2011-08-24
CN102164050B true CN102164050B (en) 2014-01-22

Family

ID=44465038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110125560.6A Active CN102164050B (en) 2011-05-16 2011-05-16 Log parsing method and log parsing node device

Country Status (1)

Country Link
CN (1) CN102164050B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101651679A (en) * 2009-09-16 2010-02-17 清华大学 Data frame analyzing and processing system and method based on tree structure
CN101789174A (en) * 2009-12-29 2010-07-28 北京世纪高通科技有限公司 Journal monitoring method and device
CN101931562A (en) * 2010-09-29 2010-12-29 杭州华三通信技术有限公司 Web log processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100357900C (en) * 2005-01-20 2007-12-26 上海复旦光华信息科技股份有限公司 Automatic extraction and analysis for formwork based on heterogenerous logbook
US7860881B2 (en) * 2006-03-09 2010-12-28 Microsoft Corporation Data parsing with annotated patterns

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201216

Address after: 200030 full floor, 4 / F, 190 Guyi Road, Xuhui District, Shanghai

Patentee after: Shanghai Ruishan Network Co., Ltd

Address before: 100036 11 / F, East Building, Zhongyi pengao building, 29 Fuxing Road, Haidian District, Beijing

Patentee before: Beijing Star-Net Ruijie Networks Co.,Ltd.
