CN108108288A - Log data parsing method, apparatus, and device - Google Patents

Log data parsing method, apparatus, and device

Info

Publication number
CN108108288A
Authority
CN
China
Prior art keywords
log
log data
to be parsed
data
configuration information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810019078.6A
Other languages
Chinese (zh)
Inventor
胡嘉伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810019078.6A
Publication of CN108108288A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data

Abstract

An embodiment of the present invention provides a log data parsing method, apparatus, and device. The method includes: obtaining configuration information of a log parsing task, where the configuration information includes the source of the log data to be parsed and an identifier of the parsing rule used to parse that log data; looking up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules; and, based on the source of the log data to be parsed, calling the parsing function found to parse the log data and obtain a structured parsing result. With this technical solution, a parsing task can be generated from the configuration information that a programmer sets for the log parsing task: data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse it, yielding a structured parsing result. This reduces the programmer's workload and improves work efficiency.

Description

Log data parsing method, apparatus, and device
Technical field
The present invention relates to the technical field of data processing, and in particular to a log data parsing method, apparatus, and device.
Background
A running program generates a large amount of log data. Log data usually records the program's runtime information so that developers can debug during development and understand how the program behaves in the production environment. By analyzing log data, developers can debug the program according to the analysis results, and they can also parse the log data to obtain the content they need. Log data is usually unstructured or semi-structured. To analyze it more conveniently, it has to be parsed into structured data, which is then saved to a data warehouse. Here, unstructured data means data whose structure is irregular or incomplete, that has no predefined data model, and that is inconvenient to represent with two-dimensional logical tables in a database.
A common real-time log collection and parsing flow is as follows. A program writes log data to a log file. A distributed log collection and transmission system collects the log data from the log file and sends it to a distributed messaging system. When a need to analyze the log data arises in practice, a stream processing engine such as Spark Streaming obtains the log data from the distributed messaging system, runs a data parsing program that a programmer has developed for that analysis need, parses the obtained log data into structured log data, and stores the result in a data warehouse.
However, in the course of implementing the present invention, the inventor found that the prior art has at least the following problem:
Because the needs for analyzing log data differ, a programmer has to write a corresponding parsing program each time for the stream processing engine to call when parsing log data, which makes the programmer's workload heavy.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a log data parsing method, apparatus, and device that reduce the programmer's workload and improve work efficiency. The specific technical solutions are as follows:
One aspect of the embodiments of the present invention provides a log data parsing method applied to a stream processing engine. The method includes:
obtaining configuration information of a log parsing task, where the configuration information includes the source of the log data to be parsed and an identifier of the parsing rule used to parse that log data;
looking up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules;
based on the source of the log data to be parsed, calling the parsing function found to parse the log data and obtain a structured parsing result.
Optionally, after the step of generating a log data parsing task according to the received configuration information, the method further includes:
storing the identifier of the log parsing task and the configuration information in a database.
Optionally, while the log data parsing task is running, the method further includes:
detecting whether the log parsing task has exited because of a failure during execution;
if so, reading the configuration information from the database according to the identifier of the log parsing task, and returning to the step of looking up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules.
Optionally, the configuration information further includes an identifier of a verification rule for the parsing result. After the step of calling the parsing function found to parse the log data based on its source and obtaining a structured parsing result, the method further includes:
looking up, according to the verification rule identifier, the verification function for verifying the structured parsing result among pre-stored verification functions corresponding to verification rules;
calling the verification function found to verify the structured parsing result.
Optionally, the configuration information further includes the storage location of the parsing result.
After the step of calling the parsing function found to parse the log data based on its source and obtaining a structured parsing result, the method further includes:
storing the structured parsing result in the storage location of the parsing result.
Optionally, the parsing rules include at least one of the following rules:
extracting data that matches a preset regular expression from the log data;
extracting the data of a specified field from log data in JSON format;
extracting from the log data the data generated by accessing a target URL, where the target URL is a URL in a preset URL parameter list;
converting a timestamp in the log data into time text;
converting time text in the log data into a timestamp;
converting an IP address in the log data into region information;
extracting a character string from the log data.
Another aspect of the embodiments of the present invention further provides a log data parsing apparatus applied to a stream processing engine. The apparatus includes:
an acquisition module, configured to obtain configuration information of a log parsing task, where the configuration information includes the source of the log data to be parsed and an identifier of the parsing rule used to parse that log data;
a lookup module, configured to look up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules;
a parsing module, configured to call, based on the source of the log data to be parsed, the parsing function found to parse the log data and obtain a structured parsing result.
Optionally, the apparatus further includes:
a storage module, configured to store the identifier of the log parsing task and the configuration information in a database.
Optionally, the apparatus further includes:
a detection module, configured to detect whether the log parsing task has exited because of a failure during execution and, if so, to trigger a reading module;
the reading module, configured to read the configuration information from the database according to the identifier of the log parsing task and to trigger the lookup module.
Optionally, the configuration information further includes an identifier of a verification rule for the parsing result, and correspondingly the apparatus further includes:
a verification function lookup module, configured to look up, according to the verification rule identifier, the verification function for verifying the structured parsing result among pre-stored verification functions corresponding to verification rules;
a verification module, configured to call the verification function found to verify the structured parsing result.
Optionally, the configuration information further includes the storage location of the parsing result;
the storage module is further configured to store the structured parsing result in the storage location of the parsing result.
Optionally, the parsing rules include at least one of the following rules:
extracting data that matches a preset regular expression from the log data;
extracting the data of a specified field from log data in JSON format;
extracting from the log data the data generated by accessing a target URL, where the target URL is a URL in a preset URL parameter list;
converting a timestamp in the log data into time text;
converting time text in the log data into a timestamp;
converting an IP address in the log data into region information;
extracting a character string from the log data.
Another aspect of the embodiments of the present invention further provides a log data parsing device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 6 when executing the program stored in the memory.
Another aspect of the embodiments of the present invention further provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform any of the log data parsing methods described above.
Another aspect of the embodiments of the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the log data parsing methods described above.
With the log data parsing method, apparatus, and device provided by the embodiments of the present invention, a parsing task can be generated from the obtained configuration information of the log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, yielding a structured parsing result. As a result, when the need for analyzing log data changes, the programmer only has to set the corresponding configuration information, which reduces workload and improves work efficiency.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below.
Fig. 1 is a schematic flowchart of a log data parsing method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a log data parsing apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a log data parsing device according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention.
In the prior art, when log data needs to be analyzed, a programmer has to write a corresponding parsing program each time the analysis need changes, for the stream processing engine to call when parsing the log data. This makes the programmer's workload heavy and work efficiency low.
To solve the above technical problem, one embodiment of the present invention provides a log data parsing method, including:
obtaining configuration information of a log parsing task, where the configuration information includes the source of the log data to be parsed and an identifier of the parsing rule used to parse that log data;
looking up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules;
based on the source of the log data to be parsed, calling the parsing function found to parse the log data and obtain a structured parsing result.
It can be seen that, in the technical solution provided by the embodiments of the present invention, a parsing task can be generated from the obtained configuration information of the log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, yielding a structured parsing result. As a result, when the need for analyzing log data changes, the programmer only has to set the corresponding configuration information, which reduces workload and improves work efficiency.
The log data parsing method provided by the embodiments of the present invention is described in detail below through specific embodiments.
Fig. 1 shows a schematic flowchart of a log data parsing method provided by an embodiment of the present invention, applied to a stream processing engine.
In one implementation, the stream processing engine can be Spark Streaming.
Specifically, the log data parsing method includes:
S100: obtain configuration information of a log parsing task, where the configuration information includes the source of the log data to be parsed and an identifier of the parsing rule used to parse that log data.
The source of the log data to be parsed can be understood as the location where that log data is stored. For example, suppose the log data to be parsed is stored in the topic topicA of the distributed messaging system Kafka. When the log data stored in the Kafka system is read, it is read from topicA, so the source of the log data to be parsed is the topic topicA in the Kafka system.
A parsing rule can be understood as the function to be performed when parsing the log data. Depending on the parsing need, each parsing rule corresponds to one parsing function, and the one-to-one correspondence between parsing rules and parsing functions is established through the parsing rule identifier. The parsing rule identifier can be a randomly assigned string of digits.
S200: look up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules.
The parsing function corresponding to a parsing rule can be understood as the function that implements that parsing rule.
In one implementation, the parsing rules can be displayed as a list for the programmer to choose from. When a parsing rule is detected as selected, the corresponding parsing function can be looked up by the parsing rule identifier of that rule. Specifically, the parsing function corresponding to each parsing rule is stored in a function library in advance, and the corresponding parsing function can be looked up in the function library by the identifier of the selected parsing rule.
For example, if the programmer selects the parsing rule "convert timestamp to time text", the parsing function that converts a timestamp to time text is looked up in the function library by the identifier of the selected parsing rule.
S300: based on the source of the log data to be parsed, call the parsing function found to parse the log data and obtain a structured parsing result.
Specifically, the storage address from which to read the log data to be parsed is determined from its source, the log data is read from that storage address, and the parsing function corresponding to the parsing rule found is called to parse the log data that was read.
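Steps S100 to S300 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `FUNCTION_LIBRARY`, `FAKE_SOURCES`, `read_source`, and `run_parsing_task`, the rule identifier `"rule-004"`, and the log line layout are all hypothetical, an in-memory list stands in for a Kafka topic, and timestamps are assumed to be rendered in UTC+8.

```python
from datetime import datetime, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))  # assumption: time text is rendered in UTC+8


def timestamp_to_text(line):
    """Parsing function for the rule 'convert timestamp to time text'.

    Assumes a hypothetical line layout: '<unix timestamp> <message>'.
    """
    ts, _, message = line.partition(" ")
    text = datetime.fromtimestamp(int(ts), UTC8).strftime("%Y-%m-%d %H:%M:%S")
    return {"time": text, "message": message}


# Hypothetical function library: parsing rule identifier -> parsing function.
FUNCTION_LIBRARY = {"rule-004": timestamp_to_text}

# Hypothetical stand-in for a message-system topic holding raw log lines.
FAKE_SOURCES = {"kafka:topicA": ["1513404776 play video"]}


def read_source(source):
    """Read the raw log lines stored at the configured source."""
    return FAKE_SOURCES[source]


def run_parsing_task(config):
    # S100: the configuration carries the source and the parsing rule identifier.
    # S200: look up the parsing function by the rule identifier.
    parse = FUNCTION_LIBRARY[config["rule_id"]]
    # S300: read from the source and apply the parsing function to each line.
    return [parse(line) for line in read_source(config["source"])]


config = {"source": "kafka:topicA", "rule_id": "rule-004"}
structured = run_parsing_task(config)
print(structured)
```

In this sketch, changing the analysis need only means changing `config`; the lookup and call path stay the same.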
In one implementation of the embodiments of the present invention, a front-end display interface and a generic function framework for log parsing tasks based on Spark Streaming can be provided. The front-end display interface provides an input box for each item of configuration information. The programmer can enter configuration information by typing directly into an input box or by selecting one of the options offered in the display interface. In the generic function framework, the parts that depend on configuration information are written as variables, each corresponding to an input box in the front-end display interface. Specifically, the part of the generic function framework that holds the storage address from which the log data is read corresponds to the source input box in the front-end display interface, and the parsing function part corresponds to the parsing rule input box. For example, let the source part in the generic function framework be A. When the programmer enters "kafka, topic A" in the source input box of the front-end display interface, that value is assigned to A, which determines that this log data parsing task reads data from topic A in the Kafka system. Let the parsing function part in the generic function framework be B. When the programmer selects the parsing rule "convert timestamp to time text" in the parsing rule input box, the parsing function corresponding to that rule is first looked up in the function library by the parsing rule identifier, and the function found is then assigned to B. On this basis, a log data parsing task can be generated that reads the log data to be parsed from topic A in the distributed messaging system Kafka and then converts the timestamps contained in that data into time text.
When the stream processing engine runs a log parsing task, it needs resources to run on. These resources can be obtained on the local device, or the log data parsing task can be submitted to a resource manager, such as a YARN cluster, which allocates the resources needed to run the task.
When log data is parsed with the solution provided by the embodiments of the present invention, a parsing task can be generated from the obtained configuration information of the log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, yielding a structured parsing result. As a result, when the need for analyzing log data changes, the programmer only has to set the corresponding configuration information, which reduces workload and improves work efficiency.
In one implementation of the embodiments of the present invention, after the step of generating a log data parsing task according to the received configuration information, the method further includes:
storing the identifier of the log parsing task and the configuration information in a database.
In one implementation, the identifier of the log parsing task can be its task name, which can be set by the programmer or assigned automatically by the stream processing engine. The identifier can also be a unique ID (identity) number that the stream processing engine assigns when it generates the log data parsing task. Specifically, the identifier of the log parsing task and the obtained configuration information can be stored in the relational database MySQL.
In the embodiments of the present invention, the identifier of the log parsing task and the configuration information can be stored in a database. When the log parsing task needs to be generated again, the corresponding configuration information stored in the database can be used directly without setting the configuration information again, which improves work efficiency.
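Storing and reloading a task's configuration by its identifier might look like the following minimal sketch. The text above names MySQL; an in-memory SQLite database from the Python standard library is used here purely as a stand-in so the example runs on its own. The table name `parse_task`, its columns, and the functions `save_task` and `load_task` are hypothetical.

```python
import json
import sqlite3


def save_task(conn, task_id, config):
    """Store the task identifier together with its configuration as JSON text."""
    conn.execute(
        "INSERT OR REPLACE INTO parse_task (task_id, config) VALUES (?, ?)",
        (task_id, json.dumps(config)),
    )
    conn.commit()


def load_task(conn, task_id):
    """Return the stored configuration for task_id, or None if it is unknown."""
    row = conn.execute(
        "SELECT config FROM parse_task WHERE task_id = ?", (task_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# SQLite stands in for the relational database; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parse_task (task_id TEXT PRIMARY KEY, config TEXT)")

config = {"source": "kafka:topicA", "rule_id": "rule-004"}
save_task(conn, "task-001", config)
print(load_task(conn, "task-001"))
```

Serializing the configuration as one JSON column keeps the sketch short; a real schema could just as well use one column per configuration item.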
In one implementation of the embodiments of the present invention, while the log data parsing task is running, the method further includes:
detecting whether the log parsing task has exited because of a failure during execution;
if so, reading the configuration information from the database according to the identifier of the log parsing task, and returning to the step of looking up, according to the parsing rule identifier, the parsing function for parsing the log data among pre-stored parsing functions corresponding to parsing rules.
In the embodiments of the present invention, the running state of a running log data parsing task can be monitored in real time. When the log parsing task is detected to have exited because of a failure, the configuration information stored in the database for that task is read according to the task's identifier, and the log data parsing task is regenerated from the configuration information read. This helps keep the log data parsing task running stably and thereby improves the robustness of the system.
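The detect-and-regenerate behaviour can be illustrated with a small supervisor loop. This is a sketch under simplifying assumptions: a failed exit is modelled as a raised exception, a plain dictionary stands in for the database, and the names `supervise`, `build_task`, and `max_restarts` are hypothetical. The restart cap is an addition for the sketch and is not described in the text above.

```python
def supervise(task_id, config_store, build_task, max_restarts=3):
    """Run a task; on a failed exit, re-read its configuration and rebuild it.

    Returns the task's result and the number of restarts that were needed.
    """
    restarts = 0
    config = config_store[task_id]
    while True:
        task = build_task(config)
        try:
            return task(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # give up after too many failed exits
            restarts += 1
            # Re-read the stored configuration by task identifier, then loop
            # back to rebuild the task (lookup of the parsing function, etc.).
            config = config_store[task_id]


# Demo: a task that exits with a failure twice before succeeding.
attempts = {"count": 0}


def build_task(config):
    def task():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RuntimeError("simulated failed exit")
        return "parsed " + config["source"]

    return task


store = {"task-001": {"source": "kafka:topicA", "rule_id": "rule-004"}}
outcome, restarts = supervise("task-001", store, build_task)
print(outcome, restarts)
```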
In one implementation of the embodiments of the present invention, the configuration information further includes an identifier of a verification rule for the parsing result. After the step of calling the parsing function found to parse the log data based on its source and obtaining a structured parsing result, the method further includes:
looking up, according to the verification rule identifier, the verification function for verifying the structured parsing result among pre-stored verification functions corresponding to verification rules;
calling the verification function found to verify the structured parsing result.
The verification function corresponding to a verification rule can be understood as the function that implements that verification rule.
In one implementation, the verification rules can include: verifying the parsing result against a regular expression; verifying the parsing result against a dictionary; and so on. For example, the regular expression /^[0-9]{1,20}$/ can be used to verify whether the parsing result consists entirely of digits.
In one implementation, the verification result can be displayed in real time so that the programmer can promptly learn whether the log data parsing result matches the expected structured log data. It can also be configured that parsing results that do not match expectations are deleted directly; that is, when the verification function determines that an obtained parsing result does not match the expected structured log data, that parsing result is deleted directly.
In the embodiments of the present invention, a verification rule for the parsing result can be set so that the structured parsing result obtained is verified in real time. This lets the programmer promptly learn whether the log data parsing result matches the expected structured log data, and parsing results that do not match expectations can be deleted directly, which improves efficiency when the structured log data is analyzed later.
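Verification by rule identifier can be sketched in the same lookup style as the parsing functions. The library `VERIFY_LIBRARY`, its identifiers, the field names `user_id` and `action`, and the allowed action set are all hypothetical; the digit pattern follows the example above. Results that fail are returned separately here so they can be either displayed or dropped.

```python
import re


def verify_digits(value):
    """True if value consists of 1 to 20 digits, as in the pattern ^[0-9]{1,20}$."""
    return re.fullmatch(r"[0-9]{1,20}", value) is not None


def verify_in_dictionary(allowed):
    """Build a verification function that checks membership in a dictionary."""
    return lambda value: value in allowed


# Hypothetical library: verification rule identifier -> (field, function).
VERIFY_LIBRARY = {
    "verify-001": ("user_id", verify_digits),
    "verify-002": ("action", verify_in_dictionary({"play", "pause"})),
}


def verify_results(results, verify_rule_id):
    """Split structured results into those that pass and those that fail."""
    field, check = VERIFY_LIBRARY[verify_rule_id]
    kept, rejected = [], []
    for record in results:
        (kept if check(record[field]) else rejected).append(record)
    return kept, rejected


results = [
    {"user_id": "12345", "action": "play"},
    {"user_id": "12a45", "action": "seek"},
]
kept, rejected = verify_results(results, "verify-001")
print(kept, rejected)
```

Deleting non-conforming results corresponds to keeping only `kept`; displaying the verification outcome corresponds to reporting `rejected`.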
In one implementation of the embodiments of the present invention, the configuration information further includes the storage location of the parsing result.
After the step of calling the parsing function found to parse the log data based on its source and obtaining a structured parsing result, the method further includes:
storing the structured parsing result in the storage location of the parsing result.
In one implementation, the structured parsing result can be stored in Hive. Hive is a data warehouse based on a distributed architecture that can map a structured data file to a database table and provides simple SQL (Structured Query Language) query functions. Hive contains multiple databases (Database), and each database contains multiple tables (Table). Specifically, the storage location of the structured parsing result can be determined by specifying the Database information and the Table information in Hive. For example, "Database1;Table2" indicates that the parsing result is stored in table 2 under database 1.
In the embodiments of the present invention, the structured parsing result obtained can be stored in a preset storage location for parsing results. When the log data needs to be analyzed, the structured parsing result can be obtained directly from the preset storage location, which makes it easy to find.
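As a small illustration of the "Database1;Table2" convention, the storage location string can be split into its database and table parts before any write takes place. The function name `parse_storage_location` is hypothetical, and the sketch deliberately stops short of writing to Hive.

```python
def parse_storage_location(location):
    """Split a 'Database;Table' storage location into (database, table)."""
    database, separator, table = location.partition(";")
    database, table = database.strip(), table.strip()
    if not separator or not database or not table:
        raise ValueError("expected 'Database;Table', got %r" % location)
    return database, table


database, table = parse_storage_location("Database1;Table2")
qualified_name = database + "." + table
print(qualified_name)
```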
In one implementation, the parsing rules can include at least one of the following rules:
The first: extracting data that matches a preset regular expression from the log data;
For example, the regular expression .+\.bmp can be used to extract files with the suffix bmp.
The second: extracting the data of a specified field from log data in JSON format;
For example, extracting the content of "href" from the JSON-format log data {"href": "http://iqiyi/", "rel": "some"}.
The third: extracting from the log data the data generated by accessing a target URL (Uniform Resource Locator), where the target URL is a URL in a preset URL parameter list;
For example, the data generated by accessing http://www.iqiyi.com/ can be extracted.
The fourth: converting a timestamp in the log data into time text;
For example, the timestamp 1513404776 is converted into the corresponding time text 14:12:56 on December 16, 2017.
The fifth: converting time text in the log data into a timestamp;
For example, the time text 14:12:56 on December 16, 2017 is converted into the corresponding timestamp 1513404776.
The sixth: converting an IP address in the log data into region information;
For example, the IP address 115.239.210.26 is converted into the region information Hangzhou, Zhejiang Province.
The seventh: extracting a character string from the log data.
It should be noted that the present invention is only illustrated with the above examples and does not limit the specific types of parsing rules.
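Several of the rules listed above can be sketched as small functions. This is only an illustration under stated assumptions: the function names and the time text format are hypothetical, and the time text is assumed to be rendered in UTC+8, which is the offset under which the timestamp 1513404776 corresponds to 14:12:56 on December 16, 2017.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Assumption: time text is expressed in UTC+8, matching the worked example.
UTC8 = timezone(timedelta(hours=8))
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # hypothetical time text format


def extract_by_regex(line: str, pattern: str) -> list:
    """First rule: return every substring of the log line that matches the pattern."""
    return re.findall(pattern, line)


def extract_json_field(line: str, field: str):
    """Second rule: return the value of one field of a JSON-format log line."""
    return json.loads(line).get(field)


def timestamp_to_text(ts: int) -> str:
    """Fourth rule: convert a Unix timestamp in seconds into time text."""
    return datetime.fromtimestamp(ts, tz=UTC8).strftime(TIME_FORMAT)


def text_to_timestamp(text: str) -> int:
    """Fifth rule: convert time text back into a Unix timestamp in seconds."""
    parsed = datetime.strptime(text, TIME_FORMAT).replace(tzinfo=UTC8)
    return int(parsed.timestamp())
```

The two time functions are inverses of each other for whole seconds, so a value converted by the fourth rule can be restored by the fifth.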
When log data is parsed using the scheme provided by the embodiment of the present invention, a parsing task can be generated from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
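The configuration-driven flow described above might look like the following sketch. The registry PARSERS, the configuration keys source, rule_id, and rule_arg, and the pull callable that stands in for the data source are all assumptions made for illustration; the invention does not specify them.

```python
import json
import re

# Hypothetical registry: parsing rule identifier -> pre-stored parsing function.
PARSERS = {
    "json_field": lambda line, arg: json.loads(line).get(arg),
    "regex": lambda line, arg: re.findall(arg, line),
}


def run_parse_task(config: dict, pull) -> list:
    """Pull lines from the configured source and parse each with the chosen rule."""
    try:
        parse = PARSERS[config["rule_id"]]
    except KeyError:
        raise ValueError(f"unknown parsing rule: {config.get('rule_id')!r}") from None
    results = []
    for line in pull(config["source"]):
        # Each structured result keeps the raw line next to the parsed value.
        results.append({"raw": line, "value": parse(line, config.get("rule_arg"))})
    return results
```

Under this design, supporting a new analysis requirement means changing the configuration dictionary rather than the task code, as long as the required parsing function is already registered.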
Referring to Fig. 2, a structural diagram of a log data parsing apparatus according to an embodiment of the present invention is shown. The apparatus is applied to a stream processing computing engine and includes:
An acquisition module 400, configured to obtain the configuration information of a log parsing task, the configuration information including: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
A lookup module 500, configured to look up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
A parsing module 600, configured to call the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
When log data is parsed using the scheme provided by the embodiment of the present invention, the log data parsing apparatus can generate a parsing task from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
In one implementation of the embodiment of the present invention, the apparatus further includes:
A storage module, configured to store the identifier of the log parsing task and the configuration information into a database.
In one implementation of the embodiment of the present invention, the apparatus further includes:
A detection module, configured to detect whether the log parsing task has exited due to a failure during execution and, if so, to trigger a read module;
The read module, configured to read the configuration information from the database according to the identifier of the log parsing task and to trigger the lookup module.
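The interplay of the storage, detection, and read modules can be sketched as follows. This is an assumption-laden illustration: it uses SQLite as a stand-in for the unspecified database, treats a raised exception as the failed exit, and introduces the hypothetical names save_config, load_config, run_with_recovery, and max_restarts.

```python
import json
import sqlite3


def save_config(conn, task_id: str, config: dict) -> None:
    """Storage module: store the task identifier together with its configuration."""
    conn.execute("CREATE TABLE IF NOT EXISTS task_config "
                 "(task_id TEXT PRIMARY KEY, config TEXT NOT NULL)")
    conn.execute("INSERT OR REPLACE INTO task_config VALUES (?, ?)",
                 (task_id, json.dumps(config)))
    conn.commit()


def load_config(conn, task_id: str) -> dict:
    """Read module: read a task's configuration back by its identifier."""
    row = conn.execute("SELECT config FROM task_config WHERE task_id = ?",
                       (task_id,)).fetchone()
    if row is None:
        raise KeyError(task_id)
    return json.loads(row[0])


def run_with_recovery(conn, task_id: str, run, max_restarts: int = 3):
    """Detection module: rerun the task from its stored configuration after a failure."""
    for attempt in range(max_restarts + 1):
        config = load_config(conn, task_id)
        try:
            return run(config)
        except Exception:
            # Assumption: the restart count is bounded; the last failure is re-raised.
            if attempt == max_restarts:
                raise
```

Because the configuration is read back by task identifier on every restart, the restarted task does not depend on any state held by the failed run.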
In one implementation of the embodiment of the present invention, the configuration information further includes: the identifier of the verification rule for the parsing result; correspondingly, the apparatus further includes:
A verification function lookup module, configured to look up, according to the verification rule identifier, among the pre-stored verification functions corresponding to verification rules, the verification function used to verify the structured parsing result;
A verification module, configured to call the found verification function to verify the structured parsing result.
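Verification can follow the same lookup-by-identifier pattern as parsing. The sketch below is hypothetical: the invention does not list any concrete verification rules, so the rules non_empty and http_url, the registry VALIDATORS, and the function verify are assumptions chosen only to show the mechanism.

```python
def is_non_empty(record: dict, field: str) -> bool:
    """Hypothetical rule: the field must be present and not an empty string."""
    value = record.get(field)
    return value is not None and value != ""


def is_http_url(record: dict, field: str) -> bool:
    """Hypothetical rule: the field must be a string starting with http:// or https://."""
    value = record.get(field)
    return isinstance(value, str) and value.startswith(("http://", "https://"))


# Hypothetical registry: verification rule identifier -> verification function.
VALIDATORS = {"non_empty": is_non_empty, "http_url": is_http_url}


def verify(records: list, rule_id: str, field: str) -> tuple:
    """Split structured results into those that pass and those that fail the rule."""
    check = VALIDATORS[rule_id]
    passed, failed = [], []
    for record in records:
        (passed if check(record, field) else failed).append(record)
    return passed, failed
```

Returning the failed records separately, rather than discarding them, is a design choice of this sketch that leaves the handling of invalid results to the caller.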
In one implementation of the embodiment of the present invention, the configuration information further includes: the storage location of the parsing result;
The storage module is further configured to store the structured parsing result to the storage location of the parsing result.
In one implementation, the parsing rules include at least one of the following rules:
Extracting data that matches a preset regular expression from the log data;
Extracting the data of a specified field from log data in JSON format;
Extracting from the log data the data generated by accessing a target URL, where the target URL is a URL in a preset URL parameter list;
Converting a timestamp in the log data into time text;
Converting time text in the log data into a timestamp;
Converting an IP address in the log data into region information;
Extracting a character string from the log data.
When log data is parsed using the scheme provided by the embodiment of the present invention, the log data parsing apparatus can generate a parsing task from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
The embodiment of the present invention further provides a log data parsing device which, as shown in Fig. 3, includes a processor 001, a communication interface 002, a memory 003, and a communication bus 004, where the processor 001, the communication interface 002, and the memory 003 communicate with one another through the communication bus 004,
The memory 003 is configured to store a computer program;
The processor 001 is configured to implement the log data parsing method described in the embodiment of the present invention when executing the program stored in the memory 003.
Specifically, the above log data parsing method includes:
Obtaining the configuration information of a log parsing task, the configuration information including: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
Looking up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
Calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
It should be noted that the other embodiments in which the processor 001 implements the log data parsing method by executing the program stored in the memory 003 are the same as the embodiments provided in the foregoing method embodiment section and are not repeated here.
When log data is parsed using the scheme provided by the embodiment of the present invention, the log data parsing device can generate a parsing task from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
The communication bus mentioned for the above electronic device can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory can include random access memory (Random Access Memory, RAM) and can also include non-volatile memory, for example at least one disk memory. Optionally, the memory can also be at least one storage apparatus located remotely from the aforementioned processor.
The above processor can be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it can also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided. Instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, the log data parsing method described in the embodiment of the present invention is implemented.
Specifically, the above log data parsing method includes:
Obtaining the configuration information of a log parsing task, the configuration information including: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
Looking up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
Calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
It should be noted that the other embodiments in which the log data parsing method is implemented through the above computer-readable storage medium are the same as the embodiments provided in the foregoing method embodiment section and are not repeated here.
When log data is parsed using the scheme provided by the embodiment of the present invention, a parsing task can be generated from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
In another embodiment provided by the present invention, a computer program product containing instructions is further provided. When it is run on a computer, the log data parsing method described in the embodiment of the present invention is implemented.
Specifically, the above log data parsing method includes:
Obtaining the configuration information of a log parsing task, the configuration information including: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
Looking up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
Calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
It should be noted that the other embodiments in which the log data parsing method is implemented through the above computer program product are the same as the embodiments provided in the foregoing method embodiment section and are not repeated here.
When log data is parsed using the scheme provided by the embodiment of the present invention, a parsing task can be generated from the configuration information of the obtained log parsing task, namely the source of the log data to be parsed and the identifier of the parsing rule used to parse it. Data is pulled from the source of the log data to be parsed, and the pre-stored parsing function corresponding to the parsing rule identifier is called to parse the log data, so as to obtain a structured parsing result. On this basis, when the requirements for analyzing log data differ, a programmer only needs to set the corresponding configuration information, which reduces workload and improves work efficiency.
The above embodiments can be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiment of the present invention are generated wholly or partly. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium can be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a Solid State Disk (SSD)).
It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the embodiments of the apparatus, the log data parsing device, the computer program product, and the computer-readable storage medium are substantially similar to the method embodiment, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The above are merely preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims (10)

1. A log data parsing method, characterized in that it is applied to a stream processing computing engine, the method comprising:
Obtaining the configuration information of a log parsing task, the configuration information comprising: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
Looking up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
Calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
2. The method according to claim 1, characterized in that, after the step of generating a log data parsing task according to the received configuration information, the method further comprises:
Storing the identifier of the log parsing task and the configuration information into a database.
3. The method according to claim 2, characterized in that, while the log data parsing task is running, the method further comprises:
Detecting whether the log parsing task has exited due to a failure during execution;
If so, reading the configuration information from the database according to the identifier of the log parsing task, and returning to the step of looking up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed.
4. The method according to claim 1, characterized in that the configuration information further comprises: the identifier of the verification rule for the parsing result; and after the step of calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result, the method further comprises:
Looking up, according to the verification rule identifier, among the pre-stored verification functions corresponding to verification rules, the verification function used to verify the structured parsing result;
Calling the found verification function to verify the structured parsing result.
5. The method according to claim 1, characterized in that the configuration information further comprises: the storage location of the parsing result;
After the step of calling the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result, the method further comprises:
Storing the structured parsing result to the storage location of the parsing result.
6. The method according to any one of claims 1-5, characterized in that the parsing rules comprise at least one of the following rules:
Extracting data that matches a preset regular expression from the log data;
Extracting the data of a specified field from log data in JSON format;
Extracting from the log data the data generated by accessing a target URL, where the target URL is a URL in a preset URL parameter list;
Converting a timestamp in the log data into time text;
Converting time text in the log data into a timestamp;
Converting an IP address in the log data into region information;
Extracting a character string from the log data.
7. A log data parsing apparatus, characterized in that it is applied to a stream processing computing engine, the apparatus comprising:
An acquisition module, configured to obtain the configuration information of a log parsing task, the configuration information comprising: the source of the log data to be parsed, and the identifier of the parsing rule used to parse the log data to be parsed;
A lookup module, configured to look up, according to the parsing rule identifier, among the pre-stored parsing functions corresponding to parsing rules, the parsing function used to parse the log data to be parsed;
A parsing module, configured to call the found parsing function, in combination with the source of the log data to be parsed, to parse the log data to be parsed and obtain a structured parsing result.
8. The apparatus according to claim 7, characterized in that the apparatus further comprises:
A storage module, configured to store the identifier of the log parsing task and the configuration information into a database.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
A detection module, configured to detect whether the log parsing task has exited due to a failure during execution and, if so, to trigger a read module;
The read module, configured to read the configuration information from the database according to the identifier of the log parsing task and to trigger the lookup module.
10. A log data parsing device, characterized in that it comprises a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
The memory is configured to store a computer program;
The processor is configured to implement the method steps of any one of claims 1-6 when executing the program stored in the memory.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810019078.6A CN108108288A (en) 2018-01-09 2018-01-09 A kind of daily record data analytic method, device and equipment

Publications (1)

Publication Number Publication Date
CN108108288A (en) 2018-06-01

Family

ID=62219827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810019078.6A Pending CN108108288A (en) 2018-01-09 2018-01-09 A kind of daily record data analytic method, device and equipment

Country Status (1)

Country Link
CN (1) CN108108288A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180601