CN109144964A - log analysis method and device based on machine learning - Google Patents
- Publication number
- CN109144964A (application number CN201810957288.XA)
- Authority
- CN
- China
- Prior art keywords
- log
- group
- item
- information
- functional value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Debugging And Monitoring (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a log parsing method and device based on machine learning. The method comprises: obtaining original log information; grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2; forming M item pairs from the N character strings, wherein M is greater than or equal to 1; clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm; selecting the most frequent log event of each group from the log event classification groups; and generating a log template based on the selected log events, thereby improving log parsing efficiency and accuracy.
Description
Technical field
The present invention relates to the field of computer technology, and more particularly to a log parsing method and device based on machine learning.
Background
To guarantee system information security, logs are an indispensable part of almost every system. Logs mainly record information generated while the system runs, such as system exceptions, routine operations, and the attributes and details of user behavior events. This information is very important for understanding the operating state of the system and the behavior habits of its users, so it is commonly used for system anomaly monitoring, user behavior analysis, and similar tasks.
As the number of users and the complexity of a system grow, the volume of log data grows with them. Developers or operators of the system rely on this rich log information to monitor the running state of the system and the behavior of its users, to trace the source of system anomalies, and to predict how users will use the system. A common log parsing technique extracts patterns with regular expressions and then performs a simple classification based on the extracted patterns. The main drawback of this technique is that its accuracy on diverse log formats is very low, and its performance is also comparatively poor.
Summary of the invention
In view of this, the purpose of the present invention is to provide a log parsing method and device based on machine learning that improve log parsing efficiency and accuracy.
In a first aspect, an embodiment of the present invention provides a log parsing method based on machine learning, the method comprising:
Obtaining original log information;
Grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2;
Forming M item pairs from the N character strings, wherein M is greater than or equal to 1;
Clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm;
Selecting the most frequent log event of each group from the log event classification groups;
Generating a log template based on the selected log events.
Further, clustering the multiple groups of log information and the item pairs into log event classification groups according to the clustering algorithm includes repeating the following iterative processing until every log text message has been traversed:
Based on the item pairs, calculating a first potential function value of the log text message in the current group, and marking the current group;
Calculating a second potential function value of the log text message in an unmarked group;
Comparing the first potential function value with the second potential function value;
If the second potential function value is greater than the first potential function value, updating the record that the log text message has moved from the current group to the unmarked group;
If the second potential function value is equal to the first potential function value, taking the current group as the log event classification group.
Further, calculating, based on the item pairs, the first potential function value of the log text message in the current group comprises:
Calculating the first potential function value according to the following formula:
Wherein ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that contain the item pair r, p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that contain the item pair r, and the second potential function value is calculated by the same formula.
Further, selecting the most frequent log event of each group from the log event classification groups comprises:
Counting, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
Taking the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
Assembling the candidate item pairs selected in each group of log information into log event candidates;
Selecting, from the log event candidates, the log event with the highest frequency of occurrence in each group.
Further, forming M item pairs from the N character strings comprises:
Calculating the item pairs according to the following formula:
Wherein M is the number of item pairs and N is the number of character strings.
In a second aspect, an embodiment of the present invention provides a log parsing device based on machine learning, the device comprising:
An acquisition unit, for obtaining original log information;
A grouping unit, for grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2;
A composition unit, for forming M item pairs from the N character strings, wherein M is greater than or equal to 1;
A clustering unit, for clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm;
A selection unit, for selecting the most frequent log event of each group from the log event classification groups;
A generation unit, for generating a log template based on the selected log events.
Further, the clustering unit is used for repeating the following iterative processing until every log text message has been traversed:
Based on the item pairs, calculating a first potential function value of the log text message in the current group, and marking the current group;
Calculating a second potential function value of the log text message in an unmarked group;
Comparing the first potential function value with the second potential function value;
If the second potential function value is greater than the first potential function value, updating the record that the log text message has moved from the current group to the unmarked group;
If the second potential function value is equal to the first potential function value, taking the current group as the log event classification group.
Further, the clustering unit is used for:
Calculating the first potential function value according to the following formula:
Wherein ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that contain the item pair r, p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that contain the item pair r, and the second potential function value is calculated by the same formula.
Further, the selection unit is used for:
Counting, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
Taking the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
Assembling the candidate item pairs selected in each group of log information into log event candidates;
Selecting, from the log event candidates, the log event with the highest frequency of occurrence in each group.
Further, the composition unit is used for:
Calculating the item pairs according to the following formula:
Wherein M is the number of item pairs and N is the number of character strings.
The embodiments of the present invention provide a log parsing method and device based on machine learning, comprising: obtaining original log information; grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2; forming M item pairs from the N character strings, wherein M is greater than or equal to 1; clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm; selecting the most frequent log event of each group from the log event classification groups; and generating a log template based on the selected log events, so that log parsing efficiency and accuracy can be improved.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood by implementing the invention. The objectives and other advantages of the invention are realized and obtained by the structures particularly pointed out in the description, the claims, and the drawings.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To describe the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the log parsing method based on machine learning provided by embodiment one of the present invention;
Fig. 2 is a flowchart of step S104 in the log parsing method based on machine learning provided by embodiment one of the present invention;
Fig. 3 is a flowchart of step S105 in the log parsing method based on machine learning provided by embodiment one of the present invention;
Fig. 4 is a schematic diagram of the log parsing device based on machine learning provided by embodiment two of the present invention.
Reference numerals:
10 - acquisition unit; 20 - grouping unit; 30 - composition unit; 40 - clustering unit; 50 - selection unit; 60 - generation unit.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
The most common general log parsing method is rule-based matching: patterns are extracted from the logs with regular expressions, and the logs are then simply classified by the extracted patterns. Compared with parsing the text content directly, this improves parsing efficiency, but traversing the logs considerably affects the timeliness of parsing, and if the logs contain a large number of irrelevant entries, the accuracy of log classification suffers and the accuracy of log parsing drops markedly.
In this application, log information includes variable text and constant text, and most log information is unstructured text. Log parsing separates the constant part of an original log from the variable part and converts it into a structured log event. In log parsing, clustering is performed by a clustering algorithm. Clustering divides similar objects into different groups or subsets by static classification, so that the member objects of the same subset share some similar attributes.
To facilitate understanding, the embodiments of the present invention are described in detail below.
Embodiment one:
Fig. 1 is a flowchart of the log parsing method based on machine learning provided by embodiment one of the present invention.
Referring to Fig. 1, the method includes the following steps:
Step S101: obtain original log information.
Here, the original log information is log information from which irrelevant items have been removed. Because raw log information contains items that are irrelevant to classification, those items need to be removed before log parsing is performed, which improves the accuracy of log parsing.
Original log information usually contains some fixed items. In the first case, the position of the item in the log information is fixed, for example the timestamp in a log data set that records when the log was generated: its content changes, but its attribute does not. Such items not only fail to help log classification but also increase processing cost, so they need to be removed. In the second case, the position of the item in the log information varies, for example IP addresses and ports in a log data set; these can be removed with regular expressions.
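The two removal cases described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the timestamp layout, the IPv4 and port pattern, and the function name `strip_irrelevant_items` are all assumptions chosen for the example, and real log formats would need their own patterns.

```python
import re

# Assumed layout: a leading "YYYY-MM-DD HH:MM:SS" timestamp (fixed position).
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+")
# Assumed pattern: an IPv4 address with an optional ":port" (variable position).
IP_PORT = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b")


def strip_irrelevant_items(line: str) -> str:
    """Remove the leading timestamp and any IP[:port] tokens from one log line."""
    line = TIMESTAMP.sub("", line)   # case 1: item at a fixed position
    line = IP_PORT.sub("", line)     # case 2: item at a varying position
    return " ".join(line.split())    # collapse the whitespace left behind


if __name__ == "__main__":
    raw = "2018-08-21 10:15:02 connection from 10.0.0.7:5432 refused"
    print(strip_irrelevant_items(raw))
```

Keeping the patterns as module-level constants makes it straightforward to add further hypothetical patterns (host names, request IDs) without touching the function body.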
Log parsing uses a specific potential function value as the measure for evaluating similar events and improves classification accuracy through continuous iteration. The main steps are the generation of item pairs, the clustering of log information, and the generation of log classification templates.
Step S102: group the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2.
Step S103: form M item pairs from the N character strings, wherein M is greater than or equal to 1.
Here, because each log text message includes N character strings, each character string is one item of the log, and every two items can form an item pair; the relationship between M and N is given by formula (2). For example, if the N character strings are "12", "34" and "AB", then "12" and "34" form one item pair, "AB" and "12" form one item pair, and "34" and "AB" form one item pair; that is, 3 character strings can form 3 item pairs.
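As a small illustration of step S103, the sketch below enumerates the item pairs of one log text message. It assumes that items are whitespace-separated tokens and that a pair is unordered, which matches the "12", "34", "AB" example; the function name `item_pairs` is hypothetical.

```python
from itertools import combinations


def item_pairs(message: str) -> list[tuple[str, str]]:
    """Return every pair of items (whitespace-separated strings) in a message."""
    items = message.split()
    # combinations() yields each unordered pair once, in order of appearance.
    return list(combinations(items, 2))


if __name__ == "__main__":
    for pair in item_pairs("12 34 AB"):
        print(pair)
```

For the three strings of the example this yields three pairs. A message that repeats a token would produce repeated pairs, so a set may be preferable where only presence matters.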
Step S104: cluster the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm.
Here, based on the item pairs, the potential function values of each log text message in one group and in another group are calculated separately, and the two values are compared to determine whether the log text message should move. If the potential function value increases, the log text message is moved from the one group to the other and the log grouping information is updated. The iteration continues, always selecting the larger potential function value, until an iteration in which the potential function value of no log text message increases; the current grouping is then determined to be the log event classification groups.
Step S105: select the most frequent log event of each group from the log event classification groups.
Step S106: generate a log template based on the selected log events.
Further, referring to Fig. 2, step S104 includes repeating the following iterative processing until every log text message has been traversed:
Step S201: based on the item pairs, calculate a first potential function value of the log text message in the current group, and mark the current group;
Step S202: calculate a second potential function value of the log text message in an unmarked group;
Step S203: compare the first potential function value with the second potential function value;
Step S204: if the second potential function value is greater than the first potential function value, update the record that the log text message has moved from the current group to the unmarked group;
Step S205: if the second potential function value is equal to the first potential function value, take the current group as the log event classification group.
Specifically, based on the item pairs, the first potential function value of the log text message in the current group can be calculated according to formula (1); the first potential function value is a sum over all item pairs in the log text information. After the first potential function value has been calculated, the current group is marked so that it can be distinguished from the other groups, which determines the unmarked groups the log text message may move to. The second potential function value is then calculated, again by formula (1), and compared with the first potential function value to determine whether the log text message should move. If the value increases, the log text message is moved from the marked current group to the unmarked group and the log grouping information is updated. The iteration continues, always selecting the larger potential function value, until an iteration in which the potential function value of no log text message increases; the current grouping is then determined to be the log event classification groups.
Further, step S201 includes:
Calculating the first potential function value according to formula (1):
Wherein ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that contain the item pair r, p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that contain the item pair r, and the second potential function value is calculated by the same formula.
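The iteration of steps S201 to S205 can be sketched as a local search. Two points in the sketch are assumptions and not statements of the patent's method. First, the text defines N(r, B) and p(r, B) but does not reproduce formula (1), so the sketch assumes ω(B) = Σ N(r, B)·p(r, B)² over r ∈ R(B), one form consistent with those definitions. Second, it interprets the comparison as "move the message if the combined potential of the two groups involved increases". The names `potential` and `local_search` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairs(message):
    """Set of item pairs in one message (items are whitespace-separated)."""
    return set(combinations(message.split(), 2))


def potential(group):
    """ASSUMED form of formula (1): sum over pairs r of N(r, B) * p(r, B)**2."""
    if not group:
        return 0.0
    counts = Counter(r for msg in group for r in pairs(msg))  # N(r, B)
    size = len(group)                                         # |B|
    return sum(n * (n / size) ** 2 for n in counts.values())


def local_search(groups, max_rounds=100):
    """Move messages between groups while some move raises the potential."""
    groups = [list(g) for g in groups]
    for _ in range(max_rounds):
        moved = False
        for src in range(len(groups)):
            for msg in list(groups[src]):  # iterate over a snapshot
                best, best_gain = src, 1e-9
                rest = [m for m in groups[src] if m is not msg]
                for dst in range(len(groups)):
                    if dst == src:
                        continue
                    before = potential(groups[src]) + potential(groups[dst])
                    after = potential(rest) + potential(groups[dst] + [msg])
                    if after - before > best_gain:
                        best, best_gain = dst, after - before
                if best != src:
                    groups[src] = rest
                    groups[best].append(msg)
                    moved = True
        if not moved:  # no message improved: grouping is final
            break
    return groups


if __name__ == "__main__":
    start = [["open file a", "send packet x"], ["open file b", "send packet y"]]
    for g in local_search(start):
        print(g)
```

Because every accepted move strictly raises the total potential, the search cannot revisit a grouping; `max_rounds` is only a safety bound. Under the assumed potential, messages sharing many item pairs end up in the same group.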
Further, referring to Fig. 3, step S105 includes the following steps:
Step S301: count, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
Step S302: take the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
Here, an item pair that reaches the predetermined number is one whose number of occurrences exceeds half.
Step S303: assemble the candidate item pairs selected in each group of log information into log event candidates;
Step S304: select, from the log event candidates, the log event with the highest frequency of occurrence in each group.
Specifically, the log text messages in each group share an item sequence with a high matching score. During log template generation, the log message label is constructed first: for each log event classification group, the frequency with which each item pair appears in the log text messages is saved, and the item pairs whose number of occurrences in the group exceeds half are selected as candidate item pairs, i.e. the message label. Then the candidate item pairs contained in each group's log information are assembled into log event candidates, and the log event candidate with the highest frequency of occurrence in each group is output as the final log template of that group.
Further, step S103 includes:
Calculating the item pairs according to formula (2):
Wherein M is the number of item pairs and N is the number of character strings.
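A form of formula (2) that is consistent with the rest of the description is the number of unordered pairs that can be drawn from N items. This is an inference from the statement that every two items can form an item pair, from the example in which 3 character strings form 3 item pairs, and from the bounds N ≥ 2 and M ≥ 1; it is not a quotation of the formula:

```latex
M = \binom{N}{2} = \frac{N(N-1)}{2}, \qquad N \ge 2 \;\Rightarrow\; M \ge 1
```

With N = 3 this gives M = 3, and with the minimum N = 2 it gives M = 1, matching both the example and the stated lower bound.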
The embodiment of the present invention provides a log parsing method based on machine learning, comprising: obtaining original log information; grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2; forming M item pairs from the N character strings, wherein M is greater than or equal to 1; clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm; selecting the most frequent log event of each group from the log event classification groups; and generating a log template based on the selected log events, so that log parsing efficiency and accuracy can be improved.
Embodiment two:
Fig. 4 is a schematic diagram of the log parsing device based on machine learning provided by embodiment two of the present invention.
Referring to Fig. 4, the device includes an acquisition unit 10, a grouping unit 20, a composition unit 30, a clustering unit 40, a selection unit 50, and a generation unit 60.
Acquisition unit 10, for obtaining original log information;
Grouping unit 20, for grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2;
Composition unit 30, for forming M item pairs from the N character strings, wherein M is greater than or equal to 1;
Clustering unit 40, for clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm;
Selection unit 50, for selecting the most frequent log event of each group from the log event classification groups;
Generation unit 60, for generating a log template based on the selected log events.
Further, the clustering unit 40 is used for repeating the following iterative processing until every log text message has been traversed:
Based on the item pairs, calculating a first potential function value of the log text message in the current group, and marking the current group;
Calculating a second potential function value of the log text message in an unmarked group;
Comparing the first potential function value with the second potential function value;
If the second potential function value is greater than the first potential function value, updating the record that the log text message has moved from the current group to the unmarked group;
If the second potential function value is equal to the first potential function value, taking the current group as the log event classification group.
Further, the clustering unit 40 is used for:
Calculating the first potential function value according to formula (1):
Wherein ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that contain the item pair r, p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that contain the item pair r, and the second potential function value is calculated by the same formula.
Further, the selection unit 50 is used for:
Counting, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
Taking the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
Assembling the candidate item pairs selected in each group of log information into log event candidates;
Selecting, from the log event candidates, the log event with the highest frequency of occurrence in each group.
Further, the composition unit 30 is used for:
Calculating the item pairs according to formula (2):
Wherein M is the number of item pairs and N is the number of character strings.
The embodiment of the present invention provides a log parsing device based on machine learning, which: obtains original log information; groups the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2; forms M item pairs from the N character strings, wherein M is greater than or equal to 1; clusters the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm; selects the most frequent log event of each group from the log event classification groups; and generates a log template based on the selected log events, so that log parsing efficiency and accuracy can be improved.
The embodiment of the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the log parsing method based on machine learning provided by the above embodiment.
The embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when run by a processor, executes the steps of the log parsing method based on machine learning of the above embodiment.
The computer program product provided by the embodiment of the present invention includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the method described in the preceding method embodiment. For the specific implementation, refer to the method embodiment; details are not repeated here.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above can refer to the corresponding processes in the preceding method embodiment; details are not repeated here.
In addition, in the description of the embodiments of the present invention, unless otherwise expressly specified and limited, the terms "installed", "linked" and "connected" shall be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediary, or internal between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific circumstances.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
In the description of the present invention, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore shall not be construed as limiting the present invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance.
Finally, it should be noted that the embodiments described above are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A log parsing method based on machine learning, characterized in that the method comprises:
Obtaining original log information;
Grouping the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple log text messages, each log text message includes N character strings, and N is greater than or equal to 2;
Forming M item pairs from the N character strings, wherein M is greater than or equal to 1;
Clustering the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm;
Selecting the most frequent log event of each group from the log event classification groups;
Generating a log template based on the selected log events.
2. The log analysis method based on machine learning according to claim 1, characterized in that clustering the multiple groups of log information and the item pairs into log event classification groups according to the clustering algorithm comprises repeating the following iterative processing until every piece of log text information has been traversed:
based on the item pairs, calculating a first potential function value of the log text information in its current group, and marking the current group;
calculating a second potential function value of the log text information in an unmarked group;
comparing the first potential function value with the second potential function value;
if the second potential function value is greater than the first potential function value, moving the log text information from the current group to the unmarked group;
if the second potential function value is equal to the first potential function value, taking the current group as the log event classification group.
3. The log analysis method based on machine learning according to claim 2, characterized in that calculating, based on the item pairs, the first potential function value of the log text information in the current group comprises:
calculating the first potential function value according to the following formula:
where ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that include the item pair r, and p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that include the item pair r; the second potential function value is calculated by the same formula.
4. The log analysis method based on machine learning according to claim 1, characterized in that selecting the highest-frequency log event of each group from the log event classification groups comprises:
counting, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
taking the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
forming log event candidates from the candidate item pairs selected in each group of log information;
selecting, from the log event candidates, the log event with the highest frequency of occurrence in each group.
5. The log analysis method based on machine learning according to claim 1, characterized in that forming M item pairs from the N character strings comprises:
calculating the item pairs according to the following formula:
where M is the number of item pairs and N is the number of character strings.
6. A log analysis device based on machine learning, characterized in that the device comprises:
an acquiring unit, configured to obtain original log information;
a grouping unit, configured to group the original log information by dimension to obtain multiple groups of log information, wherein each group of log information includes multiple pieces of log text information, each piece of log text information includes N character strings, and N is greater than or equal to 2;
a composition unit, configured to form M item pairs from the N character strings, where M is greater than or equal to 1;
a clustering unit, configured to cluster the multiple groups of log information and the item pairs into log event classification groups according to a clustering algorithm;
a selection unit, configured to select the highest-frequency log event of each group from the log event classification groups;
a generation unit, configured to generate a log template based on the selected log events.
7. The log analysis device based on machine learning according to claim 6, characterized in that the clustering unit is configured to repeat the following iterative processing until every piece of log text information has been traversed:
based on the item pairs, calculating a first potential function value of the log text information in its current group, and marking the current group;
calculating a second potential function value of the log text information in an unmarked group;
comparing the first potential function value with the second potential function value;
if the second potential function value is greater than the first potential function value, moving the log text information from the current group to the unmarked group;
if the second potential function value is equal to the first potential function value, taking the current group as the log event classification group.
8. The log analysis device based on machine learning according to claim 7, characterized in that the clustering unit is configured to:
calculate the first potential function value according to the following formula:
where ω(B) is the first potential function value, r ∈ R(B) is an item pair, N(r, B) is the number of logs in the log text information B that include the item pair r, and p(r, B) = N(r, B)/|B| is the proportion of logs in the log text information B that include the item pair r; the second potential function value is calculated by the same formula.
9. The log analysis device based on machine learning according to claim 6, characterized in that the selection unit is configured to:
count, for each group of log information in the log event classification groups, the frequency with which each item pair appears;
take the item pairs whose frequency of appearance in each group of log information reaches a predetermined number as candidate item pairs;
form log event candidates from the candidate item pairs selected in each group of log information;
select, from the log event candidates, the log event with the highest frequency of occurrence in each group.
10. The log analysis device based on machine learning according to claim 6, characterized in that the composition unit is configured to:
calculate the item pairs according to the following formula:
where M is the number of item pairs and N is the number of character strings.
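The claimed steps (forming item pairs, computing a potential function value, moving log text between groups, and selecting a template) can be illustrated with a minimal Python sketch. It is not the patented implementation. The two formulas referenced in the claims are not reproduced in this text, so the sketch assumes order-preserving pairs of strings (giving M = N(N-1)/2 for N distinct strings) and a potential of the form Σ N(r, B)·p(r, B)², which is only one form consistent with the symbol definitions. The function names, the parameters `k`, `max_rounds`, and `min_ratio`, the comparison of total potential across groups, and the sample lines are all hypothetical. The initial grouping by dimension is not shown.

```python
from collections import Counter
from itertools import combinations


def item_pairs(line):
    """Return the order-preserving pairs of strings found in one log line."""
    # Assumption: N distinct strings yield N * (N - 1) / 2 pairs.
    return set(combinations(line.split(), 2))


def potential(group):
    """Assumed potential: sum over item pairs r of N(r, B) * p(r, B) ** 2."""
    if not group:
        return 0.0
    # N(r, B): number of lines in the group that contain pair r.
    counts = Counter(pair for line in group for pair in item_pairs(line))
    return sum(n * (n / len(group)) ** 2 for n in counts.values())


def cluster(lines, k, max_rounds=20):
    """Local search: move a line to another group when total potential rises."""
    assign = [i % k for i in range(len(lines))]  # arbitrary starting partition

    def total(a):
        return sum(
            potential([ln for ln, g in zip(lines, a) if g == j]) for j in range(k)
        )

    for _ in range(max_rounds):
        moved = False
        for i in range(len(lines)):
            best_group, best_value = assign[i], total(assign)
            for j in range(k):
                if j == assign[i]:
                    continue
                value = total(assign[:i] + [j] + assign[i + 1:])
                if value > best_value:  # strictly better: candidate move
                    best_group, best_value = j, value
            if best_group != assign[i]:
                assign[i] = best_group
                moved = True
        if not moved:  # no line prefers another group, so stop
            break
    return [[ln for ln, g in zip(lines, assign) if g == j] for j in range(k)]


def template(group, min_ratio=0.5):
    """Keep strings that occur in frequent item pairs; mask the rest with '*'."""
    if not group:
        return ""
    counts = Counter(pair for line in group for pair in item_pairs(line))
    threshold = min_ratio * len(group)  # stands in for the predetermined number
    keep = {s for pair, n in counts.items() if n >= threshold for s in pair}
    masked = Counter(
        " ".join(s if s in keep else "*" for s in line.split()) for line in group
    )
    return masked.most_common(1)[0][0]  # most frequent masked form


SAMPLE = [
    "user alice login ok",
    "user bob login ok",
    "user carol login ok",
    "disk sda usage high",
    "disk sdb usage high",
    "disk sdc usage high",
]
```

Calling `cluster(SAMPLE, 2)` and then `template` on each resulting group separates the login lines from the disk lines and yields the templates `user * login ok` and `disk * usage high`. Recomputing the total potential for every trial move keeps the sketch short but is quadratic in the number of lines; an incremental update of the pair counts would be the natural refinement for real log volumes.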
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810957288.XA CN109144964A (en) | 2018-08-21 | 2018-08-21 | log analysis method and device based on machine learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810957288.XA CN109144964A (en) | 2018-08-21 | 2018-08-21 | log analysis method and device based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109144964A (en) | 2019-01-04 |
Family
ID=64790971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810957288.XA Pending CN109144964A (en) | 2018-08-21 | 2018-08-21 | log analysis method and device based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109144964A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110321457A (en) * | 2019-04-19 | 2019-10-11 | 杭州玳数科技有限公司 | Access log resolution rules generation method and device, log analytic method and system |
CN111160021A (en) * | 2019-10-12 | 2020-05-15 | 华为技术有限公司 | Log template extraction method and device |
CN111258975A (en) * | 2020-04-26 | 2020-06-09 | 中国人民解放军总医院 | Method, apparatus, device and medium for locating abnormality in image archiving communication system |
CN111462826A (en) * | 2020-04-09 | 2020-07-28 | 合肥本源量子计算科技有限责任公司 | Method for prompting quantum chemical simulation calculation progress, electronic equipment and storage medium |
WO2021088385A1 (en) * | 2019-11-06 | 2021-05-14 | 国网上海市电力公司 | Online log analysis method, system, and electronic terminal device thereof |
CN114745452A (en) * | 2022-03-29 | 2022-07-12 | 烽台科技(北京)有限公司 | Equipment management method and device and electronic equipment |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110321457A (en) * | 2019-04-19 | 2019-10-11 | 杭州玳数科技有限公司 | Access log resolution rules generation method and device, log analytic method and system |
CN111160021A (en) * | 2019-10-12 | 2020-05-15 | 华为技术有限公司 | Log template extraction method and device |
WO2021088385A1 (en) * | 2019-11-06 | 2021-05-14 | 国网上海市电力公司 | Online log analysis method, system, and electronic terminal device thereof |
CN111462826A (en) * | 2020-04-09 | 2020-07-28 | 合肥本源量子计算科技有限责任公司 | Method for prompting quantum chemical simulation calculation progress, electronic equipment and storage medium |
CN111462826B (en) * | 2020-04-09 | 2023-04-28 | 合肥本源量子计算科技有限责任公司 | Method for prompting quantum chemistry simulation calculation progress, electronic equipment and storage medium |
CN111258975A (en) * | 2020-04-26 | 2020-06-09 | 中国人民解放军总医院 | Method, apparatus, device and medium for locating abnormality in image archiving communication system |
CN114745452A (en) * | 2022-03-29 | 2022-07-12 | 烽台科技(北京)有限公司 | Equipment management method and device and electronic equipment |
CN114745452B (en) * | 2022-03-29 | 2023-05-16 | 烽台科技(北京)有限公司 | Equipment management method and device and electronic equipment |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109144964A (en) | log analysis method and device based on machine learning | |
US10237295B2 (en) | Automated event ID field analysis on heterogeneous logs | |
CN104298679B (en) | Applied business recommended method and device | |
JP6233411B2 (en) | Fault analysis apparatus, fault analysis method, and computer program | |
CN111797210A (en) | Information recommendation method, device and equipment based on user portrait and storage medium | |
CN103678702A (en) | Video duplicate removal method and device | |
CN111160021A (en) | Log template extraction method and device | |
CN110928957A (en) | Data clustering method and device | |
CN110134845A (en) | Project public sentiment monitoring method, device, computer equipment and storage medium | |
CN112860685A (en) | Automatic recommendation of analysis of data sets | |
CN110263121B (en) | Table data processing method, apparatus, electronic apparatus and computer readable storage medium | |
WO2016093839A1 (en) | Structuring of semi-structured log messages | |
CN113806492A (en) | Record generation method, device and equipment based on semantic recognition and storage medium | |
CN117420998A (en) | Client UI interaction component generation method, device, terminal and medium | |
CN113760891A (en) | Data table generation method, device, equipment and storage medium | |
CN113468866B (en) | Method and device for analyzing non-standard JSON string | |
CN114610955A (en) | Intelligent retrieval method and device, electronic equipment and storage medium | |
CN115905630A (en) | Graph database query method, device, equipment and storage medium | |
CN116822491A (en) | Log analysis method and device, equipment and storage medium | |
CN115291931A (en) | Version change processing method and device, electronic equipment and storage medium | |
CN117501275A (en) | Method, computer program product and computer system for analyzing data consisting of a large number of individual messages | |
CN115051863A (en) | Abnormal flow detection method and device, electronic equipment and readable storage medium | |
CN109947891B (en) | Document analysis method and device | |
CN108846103A (en) | A kind of data query method and device | |
CN106469086B (en) | Event processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190104 |
|
RJ01 | Rejection of invention patent application after publication |