CN107451034A - A big data cluster log management apparatus, method and system

Info

Publication number
CN107451034A
Authority
CN
China
Prior art keywords
daily record
log
server
acquisition module
rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710706639.5A
Other languages
Chinese (zh)
Inventor
崔俊珩
李国涛
石皓轩
张栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Co Ltd
Priority to CN201710706639.5A
Publication of CN107451034A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/875 Monitoring of systems including the internet

Abstract

The invention provides a big data cluster log management apparatus, method and system. The apparatus includes a log storage module, a log management module and at least one log acquisition module. The at least one log acquisition module is located on the at least one server included in the big data cluster, with at least one log acquisition module provided on each server. Each log acquisition module collects logs from the server on which it resides according to a preset log collection rule and sends the collected logs to the log storage module. The log storage module stores the received logs. The log management module obtains the corresponding target log from the log storage module according to an externally input query instruction and outputs it. This scheme can improve the efficiency of managing big data cluster logs.

Description

A big data cluster log management apparatus, method and system
Technical Field
The present invention relates to the field of communication technology, and in particular to a big data cluster log management apparatus, method and system.
Background
Big data refers to a data set that cannot be captured, managed and processed with conventional software tools within a certain time frame; it is a massive, high-growth and diversified information asset that requires new processing modes to deliver stronger decision-making power, insight discovery and process optimization capability. To store and analyze big data effectively, a big data cluster is generally used to manage it. A big data cluster usually includes multiple components, such as the distributed file system (Hadoop Distributed File System, HDFS), the open-source database (Hadoop Database, HBase), the resource manager (Yet Another Resource Negotiator, YARN) and the data warehouse tool (HIVE), and each component is deployed on one or more servers. To understand the running status of the big data cluster, its logs need to be managed.
At present, managing the logs of a big data cluster requires checking the running log of each component separately from the command line.
Because a big data cluster includes multiple components, checking the running log of each component from the command line takes a long time, which makes managing big data cluster logs inefficient.
Summary of the Invention
Embodiments of the present invention provide a big data cluster log management apparatus, method and system that can improve the efficiency of managing big data cluster logs.
In a first aspect, an embodiment of the present invention provides a big data cluster log management apparatus, including: a log storage module, a log management module and at least one log acquisition module;
The at least one log acquisition module is located on the at least one server included in the big data cluster, wherein at least one log acquisition module is provided on each server;
Each log acquisition module is configured to collect logs from the server on which it resides according to a preset log collection rule, and to send the collected logs to the log storage module;
The log storage module is configured to store the received logs;
The log management module is configured to obtain the corresponding target log from the log storage module according to an externally input query instruction, and to output it.
Optionally,
The log acquisition module is configured to use the log filtering plugin Grok to collect logs from the server on which it resides according to the storage path, log format and log level defined in the log collection rule; wherein the log format corresponds to the log format of the component running on the server where the log acquisition module resides, and the log level is any one of the normal level Debug, the trace level Trace, the potential-error level Warn, the error level Error and the severe-error level Fatal.
Optionally,
The log storage module is configured to use the enterprise search application server Solr to create an index for the received logs and then store them;
The log management module is configured to query the index created by the log storage module according to at least one of the keyword, log level, log generation time and log format carried in the query instruction, so as to obtain and output the target log corresponding to the query instruction.
Optionally,
The log management module is further configured to update the log collection rule according to an externally input update instruction and to send the updated log collection rule to each log acquisition module, so that each log acquisition module collects logs according to the updated rule; wherein updating the log collection rule includes adding at least one of a storage path, a log format and a log level to the log collection rule, or modifying or deleting at least one of the existing storage path, log format and log level in the log collection rule.
Optionally,
The log management module is configured to output the target log in the form of a web page.
In a second aspect, an embodiment of the present invention further provides a big data cluster log management method, in which log acquisition modules are set on the at least one server included in the big data cluster, with at least one log acquisition module provided on each server, the method further including:
Using each log acquisition module to collect logs from the server on which it resides according to a preset log collection rule;
Storing the collected logs;
Obtaining the corresponding target log from the stored logs according to an externally input query instruction, and outputting it.
Optionally,
Collecting logs from the server where the log acquisition module resides according to the preset log collection rule includes:
Using the log filtering plugin Grok to collect logs from that server according to the storage path, log format and log level defined in the log collection rule; wherein the log format corresponds to the log format of the component running on the server where the log acquisition module resides, and the log level is any one of the normal level Debug, the trace level Trace, the potential-error level Warn, the error level Error and the severe-error level Fatal.
Optionally,
Before the collected logs are stored, the method further includes:
Using the enterprise search application server Solr to create an index for the collected logs;
Obtaining the corresponding target log from the stored logs according to the externally input query instruction and outputting it includes:
Querying the index according to at least one of the keyword, log level, log generation time and log format carried in the query instruction, so as to obtain and output the target log corresponding to the query instruction.
Optionally,
The log collection rule is updated according to an externally input update instruction, and the updated log collection rule is sent to each log acquisition module, so that each log acquisition module collects logs according to the updated rule; wherein updating the log collection rule includes adding at least one of a storage path, a log format and a log level to the log collection rule, or modifying or deleting at least one of the existing storage path, log format and log level in the log collection rule.
In a third aspect, an embodiment of the present invention further provides a big data cluster log management system, including: a big data cluster that includes at least one server, and any one of the big data cluster log management apparatuses provided in the first aspect above.
In the big data cluster log management apparatus, method and system provided by the embodiments of the present invention, at least one log acquisition module is set on each server included in the big data cluster. Each log acquisition module collects logs from the server on which it resides according to the preset log collection rule, the log storage module stores the logs collected by each log acquisition module, and the log management module obtains the target log from the log storage module according to the received query instruction and outputs it. Thus, after the log acquisition modules collect the logs on each server in the big data cluster, the log storage module stores them centrally. When a user needs to query the logs of the big data cluster, the user only needs to send a query instruction to the log management module, which obtains the target log from the log storage module and sends it to the user. This reduces the time needed to obtain big data cluster logs and thereby improves the efficiency of managing them.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a big data cluster log management apparatus provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of another big data cluster log management apparatus provided by an embodiment of the present invention;
Fig. 3 is a flow chart of a big data cluster log management method provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a big data cluster log management system provided by an embodiment of the present invention;
Fig. 5 is a flow chart of another big data cluster log management method provided by an embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a big data cluster log management apparatus, which can include: a log storage module 101, a log management module 102 and at least one log acquisition module 103;
The at least one log acquisition module 103 is located on the at least one server included in the big data cluster, wherein at least one log acquisition module 103 is provided on each server;
Each log acquisition module 103 is configured to collect logs from the server on which it resides according to a preset log collection rule, and to send the collected logs to the log storage module 101;
The log storage module 101 is configured to store the received logs;
The log management module 102 is configured to obtain the corresponding target log from the log storage module 101 according to an externally input query instruction, and to output it.
In the big data cluster log management apparatus provided by this embodiment, at least one log acquisition module is set on each server included in the big data cluster. Each log acquisition module collects logs from the server on which it resides according to the preset log collection rule, the log storage module stores the logs collected by each log acquisition module, and the log management module obtains the target log from the log storage module according to the received query instruction and outputs it. Thus, after the log acquisition modules collect the logs on each server in the big data cluster, the log storage module stores them centrally. When a user needs to query the logs of the big data cluster, the user only needs to send a query instruction to the log management module, which obtains the target log from the log storage module and sends it to the user. This reduces the time needed to obtain big data cluster logs and thereby improves the efficiency of managing them.
Optionally, as shown in Fig. 1,
The log acquisition module 103 collects logs on the server where it resides according to the log collection rule. Specifically, the log acquisition module uses the log filtering plugin Grok to collect logs from that server according to the storage path, log format and log level defined in the log collection rule. The storage path defined in the rule is the path on the server where the logs to be collected are stored, and is determined by the component running on that server. The log format defined in the rule corresponds to the format of the logs produced by the component running on that server. The log level defined in the rule is any one of the normal level Debug, the trace level Trace, the potential-error level Warn, the error level Error and the severe-error level Fatal.
First, the log collection rule defines the storage path of the logs, so that the log acquisition module collects logs only from the specified storage path on the server and does not collect logs stored under other paths. By defining the storage path in the log collection rule, the user can limit the scope of log collection, which reduces the workload of the log acquisition module while still collecting the required logs.
For example, if server A runs the HDFS component, then to collect the HDFS running logs, the storage path defined in the log collection rule for server A should be the path on server A where the HDFS running logs are stored.
Second, the log collection rule also defines the log format. Because the running logs of different components have different formats, defining the log format in the rule makes it possible to match the logs produced by different components and collect logs selectively. Specifically, the log format can be defined as a regular expression, so that the log filtering plugin Grok can match logs against the regular expression and obtain the logs that conform to the log format.
For example, if server A runs the HDFS component, the log format defined in the log collection rule for server A corresponds to the format of the logs produced by the HDFS component.
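The format matching described above can be sketched with an ordinary regular expression. This is a minimal illustration, not the patent's implementation: Python's `re` module stands in for the Grok plugin, and the pattern, the name `HDFS_FORMAT` and the sample line are assumptions modelled on a common "timestamp level logger: message" layout, since the text does not give the actual HDFS pattern.

```python
import re

# Hypothetical log format for an HDFS-like component:
# "<date> <time>,<ms> <LEVEL> <logger>: <message>".
# The pattern is illustrative only; it is not taken from the patent.
HDFS_FORMAT = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<logger>\S+): "
    r"(?P<message>.*)$"
)

def match_format(line, pattern=HDFS_FORMAT):
    """Return the parsed fields if the line conforms to the format, else None."""
    m = pattern.match(line)
    return m.groupdict() if m else None

# A made-up line that conforms to the format, and one that does not.
sample = ("2017-01-01 10:15:30,123 WARN "
          "org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write")
parsed = match_format(sample)
unrelated = match_format("GET /index.html 200")
```

Lines that do not match the pattern return `None` and would simply be skipped, which is how a format definition makes collection selective.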
Third, the log collection rule also defines the log level, which is any one of Debug, Trace, Warn, Error and Fatal. The priority of the five log levels, from low to high, is Debug, Trace, Warn, Error and Fatal, and the log acquisition module collects logs whose priority is equal to or higher than the log level defined in the rule. Defining the log level thus controls which levels of logs the log acquisition module collects; logs of lower levels are not collected, which reduces the total number of logs collected and the workload of the log acquisition module, the log storage module and the log management module.
For example, if server A runs the HDFS component and the log level defined in the log collection rule for server A is Warn, then the log acquisition module on server A collects only logs whose level is Warn, Error or Fatal.
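The level threshold can be expressed as a simple priority comparison. The sketch below follows the priority order exactly as stated in this document (Debug lowest, Fatal highest), which may differ from the ordering used by a particular logging library; the names `LEVEL_PRIORITY` and `should_collect` and the sample entries are hypothetical.

```python
# Priority order as stated in the text, lowest to highest.
# This follows the document, not any specific logging framework.
LEVEL_PRIORITY = {"Debug": 0, "Trace": 1, "Warn": 2, "Error": 3, "Fatal": 4}

def should_collect(entry_level, rule_level):
    """True if the entry's level is at or above the level defined in the rule."""
    return LEVEL_PRIORITY[entry_level] >= LEVEL_PRIORITY[rule_level]

# Made-up entries: (level, message).
entries = [
    ("Debug", "heartbeat sent"),
    ("Warn", "slow write"),
    ("Fatal", "disk failure"),
]

# With the rule level set to Warn, only Warn, Error and Fatal entries are kept.
collected = [e for e in entries if should_collect(e[0], "Warn")]
```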
It should be noted that the log collection rule can include multiple rules, each corresponding to one server. For example, a big data cluster includes 5 servers, where server 1 and server 2 run the HDFS component, server 3 runs the HBase component, server 4 runs the YARN component and server 5 runs the HIVE component. Correspondingly, the log collection rule includes 5 rules, which correspond to server 1 through server 5 respectively. In this way, the rules that the individual log acquisition modules follow when collecting logs all belong to one overall log collection rule, which makes the log collection rule easier to manage.
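One way to picture the overall log collection rule is as a single table keyed by server, from which each log acquisition module reads its own entry. The sketch below assumes this layout; the class `CollectionRule`, the server names, the paths, the format names and the levels are all placeholders chosen for illustration, not values given in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CollectionRule:
    server: str        # server the rule applies to
    component: str     # component running on that server
    storage_path: str  # hypothetical path where the component writes its logs
    log_format: str    # hypothetical name of the format definition to match
    level: str         # placeholder minimum log level to collect

# One overall rule set holding one rule per server, mirroring the 5-server example.
RULES = {
    "server1": CollectionRule("server1", "HDFS", "/var/log/hdfs", "hdfs", "Warn"),
    "server2": CollectionRule("server2", "HDFS", "/var/log/hdfs", "hdfs", "Warn"),
    "server3": CollectionRule("server3", "HBase", "/var/log/hbase", "hbase", "Warn"),
    "server4": CollectionRule("server4", "YARN", "/var/log/yarn", "yarn", "Warn"),
    "server5": CollectionRule("server5", "HIVE", "/var/log/hive", "hive", "Warn"),
}

def rule_for(server):
    """Return the rule a log acquisition module on the given server should follow."""
    return RULES[server]
```

Keeping every per-server rule in one structure is what lets the rule set be managed in one place.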
Optionally, as shown in Fig. 1,
The log storage module 101 stores the logs collected by each log acquisition module 103, and the log management module 102 obtains the target log from the logs stored in the log storage module 101 according to the query instruction and outputs it. Specifically, the log storage module 101 can use the enterprise search application server Solr to first create an index for the received logs and then store the indexed logs, where the index of a log can be at least one of the log's keywords, log level, log generation time and log format. The log management module 102 can then query the index created by the log storage module 101 according to at least one of the keyword, log level, log generation time and log format carried in the query instruction, and obtain and output the target log corresponding to the query instruction.
Because the log storage module uses Solr to index the received logs before storing them, the log management module can quickly obtain the target log corresponding to a query instruction by querying that index, which speeds up log retrieval and improves the user experience.
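The index-then-query idea can be sketched without Solr. The toy class below is an in-memory stand-in for the Solr-backed log storage module, not Solr's API: it indexes each log by level, format and message keywords and answers queries by intersecting the matching entries. The class `LogStore`, its methods and the sample logs are assumptions; the generation time is stored but, to keep the sketch short, not indexed.

```python
from collections import defaultdict

class LogStore:
    """Toy in-memory store; a stand-in for the Solr-backed storage module."""

    def __init__(self):
        self.logs = []
        self.index = defaultdict(set)  # (field, value) -> set of log ids

    def add(self, level, fmt, timestamp, message):
        """Index a log by level, format and keywords, then store it."""
        log_id = len(self.logs)
        self.logs.append(
            {"level": level, "format": fmt, "time": timestamp, "message": message}
        )
        self.index[("level", level)].add(log_id)
        self.index[("format", fmt)].add(log_id)
        for word in message.lower().split():
            self.index[("keyword", word)].add(log_id)
        return log_id

    def query(self, **criteria):
        """Return logs matching every given criterion (keyword, level, format)."""
        ids = None
        for field, value in criteria.items():
            key = (field, value.lower() if field == "keyword" else value)
            hits = self.index.get(key, set())
            ids = hits if ids is None else ids & hits
        return [self.logs[i] for i in sorted(ids or [])]

# Made-up logs for illustration.
store = LogStore()
store.add("Warn", "hdfs", "2017-01-01T10:15:30", "Slow write to disk")
store.add("Error", "yarn", "2017-01-01T10:16:00", "Container killed")
store.add("Error", "hdfs", "2017-01-01T10:17:00", "Disk failure detected")

# A query instruction carrying a level and a keyword.
result = store.query(level="Error", keyword="disk")
```

The lookup touches only the index entries named in the query rather than scanning every stored log, which is the reason indexing speeds up retrieval.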
Optionally, as shown in Fig. 2,
The log management module 102 can also update the log collection rule according to an externally input update instruction and send the updated log collection rule to each log acquisition module 103, so that each log acquisition module 103 collects logs according to the updated rule. Updating the log collection rule can mean adding at least one of a storage path, a log format and a log level to the log collection rule, or modifying or deleting at least one of the existing storage path, log format and log level in the log collection rule.
As described in the foregoing embodiment, the log collection rule can include multiple rules, each corresponding to one server. After receiving an update instruction, the log management module updates the log collection rule accordingly. The update can add one or more new rules to the log collection rule, so that logs are collected from a new component in the big data cluster (a component whose logs were not collected before); it can modify one or more rules in the original log collection rule, so as to change how logs are collected from an existing component (a component whose logs are already collected); or it can delete one or more rules from the original log collection rule, so as to stop collecting logs from an existing component.
By updating the log collection rule through the log management module, the user can customize which servers or components have their logs managed, which makes log management of the big data cluster more flexible and convenient and further improves the user experience.
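The add, modify and delete operations, followed by pushing the updated rule set to every log acquisition module, can be sketched as follows. The classes `Collector` and `LogManager`, the action names and the rule contents are hypothetical, and the "send" step is modelled as a direct method call rather than a network message.

```python
class Collector:
    """Stand-in for a log acquisition module on one server."""

    def __init__(self, server):
        self.server = server
        self.rule = None  # rule currently in force; None means nothing is collected

    def apply_rules(self, rules):
        # Pick out the rule for this collector's own server, if any.
        self.rule = rules.get(self.server)

class LogManager:
    """Stand-in for the rule-update part of the log management module."""

    def __init__(self, rules, collectors):
        self.rules = dict(rules)
        self.collectors = collectors

    def update(self, action, server, rule=None):
        """Apply an update instruction, then push the rule set to all collectors."""
        if action in ("add", "modify"):
            self.rules[server] = rule
        elif action == "delete":
            self.rules.pop(server, None)
        else:
            raise ValueError("unknown action: " + action)
        for collector in self.collectors:
            collector.apply_rules(self.rules)

# Placeholder rules and servers for illustration.
collectors = [Collector("server1"), Collector("server2")]
manager = LogManager(
    {"server1": {"path": "/var/log/hdfs", "format": "hdfs", "level": "Warn"}},
    collectors,
)
manager.update("add", "server2", {"path": "/var/log/yarn", "format": "yarn", "level": "Error"})
manager.update("modify", "server1", {"path": "/var/log/hdfs", "format": "hdfs", "level": "Error"})
manager.update("delete", "server2")
```

After these three updates, the collector on server1 follows the modified rule and the collector on server2 has no rule, so it stops collecting.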
Optionally, as shown in Fig. 1 or Fig. 2,
After obtaining the target log, the log management module 102 can output it in the form of a web page.
On the one hand, the log management module outputs the obtained target log as a web page, so that big data logs are managed through a visual page, which improves the user experience. On the other hand, because the obtained target log is output as a web page, the user can view the logs of the big data cluster directly in a browser, which makes log management of the big data cluster more convenient.
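Web page output can be as simple as rendering the target logs into an HTML table. The sketch below assumes each log is a dictionary with `time`, `level` and `message` fields; the function `render_logs` and the sample log are hypothetical, and a real deployment would serve the page through a web framework, which the text does not specify.

```python
from html import escape

def render_logs(logs):
    """Render target logs as an HTML table, escaping all log content."""
    rows = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(log["time"]), escape(log["level"]), escape(log["message"])
        )
        for log in logs
    )
    header = "<tr><th>Time</th><th>Level</th><th>Message</th></tr>"
    return "<table>\n" + header + "\n" + rows + "\n</table>"

# A made-up target log whose message contains characters that need escaping.
page = render_logs(
    [{"time": "2017-01-01 10:17:00", "level": "Error", "message": "Disk <sda> failure"}]
)
```

Escaping matters here because log messages are arbitrary text and could otherwise be interpreted as markup by the browser.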
As shown in Fig. 3, an embodiment of the present invention provides a big data cluster log management method, which can include the following steps:
Step 301: Set log acquisition modules on the at least one server included in the big data cluster, wherein at least one log acquisition module is provided on each server;
Step 302: Use each log acquisition module to collect logs from the server on which it resides according to a preset log collection rule;
Step 303: Store the collected logs;
Step 304: Obtain the corresponding target log from the stored logs according to an externally input query instruction, and output it.
In the big data cluster log management method provided by this embodiment, at least one log acquisition module is set on each server included in the big data cluster, the log acquisition modules collect logs from each server according to the log collection rule, the collected logs are stored, and after a query instruction is received, the corresponding target log is obtained from the stored logs and output. Because the log acquisition modules collect logs from each server according to the log collection rule and the logs are stored centrally, the target log can be obtained directly from the stored logs once a query instruction is received. Compared with obtaining logs from the command line, this reduces the time needed to obtain big data cluster logs and thereby improves the efficiency of managing them.
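Steps 302 to 304 can be tied together in a compact end-to-end sketch. Everything here is an assumption made for illustration: servers are modelled as dictionaries mapping a file path to a list of lines, each line starts with its level, a plain list serves as the central store, and the function names, paths and log lines are made up.

```python
LEVELS = ["Debug", "Trace", "Warn", "Error", "Fatal"]  # low to high, as in the text

def collect(server_files, rule):
    """Step 302: read only the rule's path and keep entries at or above its level."""
    threshold = LEVELS.index(rule["level"])
    collected = []
    for line in server_files.get(rule["path"], []):
        level, _, message = line.partition(" ")
        if level in LEVELS and LEVELS.index(level) >= threshold:
            collected.append({"level": level, "message": message})
    return collected

def run_pipeline(cluster, rules):
    """Steps 302-303: collect from every server and store the logs centrally."""
    store = []
    for server, files in cluster.items():
        for entry in collect(files, rules[server]):
            store.append({"server": server, **entry})
    return store

def query(store, level):
    """Step 304: obtain the target logs matching the query instruction."""
    return [entry for entry in store if entry["level"] == level]

# Made-up cluster contents and rules.
cluster = {
    "server1": {
        "/logs/hdfs.log": ["Debug block report", "Error replica missing"],
        "/logs/other.log": ["Fatal not in the rule's path"],
    },
    "server2": {"/logs/yarn.log": ["Warn queue full", "Error container lost"]},
}
rules = {
    "server1": {"path": "/logs/hdfs.log", "level": "Warn"},
    "server2": {"path": "/logs/yarn.log", "level": "Error"},
}

store = run_pipeline(cluster, rules)
errors = query(store, "Error")
```

A single query against the central store replaces a separate command-line inspection on each server, which is the efficiency gain the method claims.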
Optionally, as shown in Fig. 3,
Collecting logs with the log acquisition module in step 302 can specifically include:
Using the log filtering plugin Grok to collect logs from the server on which the module resides according to the storage path, log format and log level defined in the log collection rule; wherein the log format corresponds to the log format of the component running on the server where the log acquisition module resides, and the log level is any one of the normal level Debug, the trace level Trace, the potential-error level Warn, the error level Error and the severe-error level Fatal.
Optionally, as shown in Fig. 3,
Before the logs are stored in step 303, the method can also include: using the enterprise search application server Solr to create an index for the collected logs;
Correspondingly, obtaining and outputting the target log according to the query instruction in step 304 can include: querying the index according to at least one of the keyword, log level, log generation time and log format carried in the query instruction, so as to obtain and output the target log corresponding to the query instruction.
Optionally,
On the basis of the big data cluster log management method shown in Fig. 3, the method can also include:
Updating the log collection rule according to an externally input update instruction, and sending the updated log collection rule to each log acquisition module, so that each log acquisition module collects logs according to the updated rule; wherein updating the log collection rule includes adding at least one of a storage path, a log format and a log level to the log collection rule, or modifying or deleting at least one of the existing storage path, log format and log level in the log collection rule.
Optionally, as shown in Fig. 3,
In step 304, after the target log is obtained, it can be output in the form of a web page.
It should be noted that, because the steps of the above method embodiment are based on the same concept as the apparatus embodiment of the present invention, their details can be found in the description of the apparatus embodiment and are not repeated here.
As shown in Fig. 4, an embodiment of the present invention provides a big data cluster log management system, including: a big data cluster 401 that includes at least one server 4011, and any one of the big data cluster log management apparatuses 402 provided in the above embodiments, wherein
At least one log acquisition module 4021 is provided on each server 4011;
Each log acquisition module 4021 is configured to collect logs from the server 4011 on which it resides according to the log collection rule, and to send the collected logs to the log storage module 4022;
The log storage module 4022 is configured to store the received logs;
The log management module 4023 is configured to receive an externally input query instruction, and to obtain the corresponding target log from the log storage module 4022 according to the query instruction and output it.
The big data cluster log management method provided by the embodiments of the present invention is described in further detail below with reference to the big data cluster log management system shown in Fig. 4. As shown in Fig. 5, the method can include the following steps:
Step 501: Set one log acquisition module on each server included in the big data cluster.
In an embodiment of the present invention, a big data cluster includes multiple components, and each component requires at least one server to run on, so a big data cluster generally includes multiple servers. To manage the logs of the big data cluster in a unified way, one log acquisition module is set on each server included in the big data cluster.
For example, big data cluster A includes 5 servers in total, where server 1 and server 2 run the HDFS component, server 3 runs the HBase component, server 4 runs the YARN component and server 5 runs the HIVE component. Log acquisition module 1 through log acquisition module 5 are then set on server 1 through server 5 respectively.
Step 502: Create the log collection rules.
In an embodiment of the present invention, log collection rules corresponding to each component and server are created according to the log management requirements of each component included in the big data cluster. The log collection rules comprise multiple rules, each corresponding to one component. Each rule defines the storage path of the logs, the format of the logs, and the log level used as the filter parameter, which the log acquisition modules use as a reference when collecting logs.
For example, the log collection rules created for big data cluster A comprise 5 rules, wherein:
Rule 1 corresponds to server 1. Storage path 1 defined in rule 1 is the path on server 1 where the running logs of the component HDFS are stored; log format 1 defined in rule 1 is the same as the log format of the HDFS running logs; log level 1 defined in rule 1 is Warn;
Rule 2 corresponds to server 2. Storage path 2 defined in rule 2 is the path on server 2 where the running logs of the component HDFS are stored; log format 2 defined in rule 2 is the same as the log format of the HDFS running logs; log level 2 defined in rule 2 is Warn;
Rule 3 corresponds to server 3. Storage path 3 defined in rule 3 is the path on server 3 where the running logs of the component HBase are stored; log format 3 defined in rule 3 is the same as the log format of the HBase running logs; log level 3 defined in rule 3 is Debug;
Rule 4 corresponds to server 4. Storage path 4 defined in rule 4 is the path on server 4 where the running logs of the component YARN are stored; log format 4 defined in rule 4 is the same as the log format of the YARN running logs; log level 4 defined in rule 4 is Error;
Rule 5 corresponds to server 5. Storage path 5 defined in rule 5 is the path on server 5 where the running logs of the component Hive are stored; log format 5 defined in rule 5 is the same as the log format of the Hive running logs; log level 5 defined in rule 5 is Trace.
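The five rules of this example could be represented as plain data, as in the following sketch. The embodiment does not give concrete paths or format expressions, so the `/path/to/.../logs` paths and the `COMMON_FORMAT` regular expression are placeholders, and the `CollectionRule` type and `make_rule` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict

# Placeholder format; the real expression would mirror each component's log layout.
COMMON_FORMAT = r"^(?P<time>\S+ \S+) (?P<level>\w+) (?P<message>.*)$"


@dataclass(frozen=True)
class CollectionRule:
    server: str       # server the rule applies to
    component: str    # component whose running logs are collected
    store_path: str   # where that component writes its logs (placeholder)
    log_format: str   # regular expression describing one log line
    min_level: str    # lowest log level that is still collected


def make_rule(server: str, component: str, min_level: str) -> CollectionRule:
    """Build a rule with a placeholder path derived from the component name."""
    return CollectionRule(
        server=server,
        component=component,
        store_path=f"/path/to/{component.lower()}/logs",
        log_format=COMMON_FORMAT,
        min_level=min_level,
    )


# Servers, components and levels follow the example for big data cluster A.
RULES: Dict[str, CollectionRule] = {
    rule.server: rule
    for rule in (
        make_rule("server1", "HDFS", "Warn"),
        make_rule("server2", "HDFS", "Warn"),
        make_rule("server3", "HBase", "Debug"),
        make_rule("server4", "YARN", "Error"),
        make_rule("server5", "Hive", "Trace"),
    )
}
```

Keying the rules by server makes it straightforward to hand each log acquisition module exactly the rule that applies to its own server.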
Step 503: Using each log acquisition module, collect logs from each server according to the log collection rules.
In an embodiment of the present invention, after the log collection rules have been created, they are sent to each log acquisition module. Each log acquisition module then collects logs from the server on which it resides through the log filter plugin Grok.
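Because the log format in a rule is expressed as a regular expression, the matching step performed through Grok can be approximated with named capture groups. The sketch below uses Python's `re` module as a stand-in for Grok; the line layout, the field names and the sample line are assumptions, not taken from the embodiment.

```python
import re
from typing import Dict, Optional

# Assumed layout: "<date> <time>,<millis> <LEVEL> <source>: <message>"
LOG_FORMAT = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<source>\S+):\s+"
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of a line that matches the format, else None."""
    match = LOG_FORMAT.match(line)
    return match.groupdict() if match else None


sample = "2017-08-17 10:15:30,123 WARN org.example.DataNode: Slow block receiver"
parsed = parse_line(sample)
unparsed = parse_line("free-form text that does not follow the format")
```

Lines that do not match the format return `None` and can simply be skipped, which is one way to ensure that only logs conforming to the rule are collected.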
For example, after the created log collection rules are sent to the 5 log acquisition modules, the 5 log acquisition modules collect logs from the 5 servers according to the log collection rules. Specifically:
Log acquisition module 1 collects logs 1 from server 1 through the log filter plugin Grok, according to rule 1 of the log collection rules. The storage path on server 1 of each log 1 collected by log acquisition module 1 is storage path 1, the log format of each log 1 is the same as log format 1, and the log level of each log 1 is Warn, Error or Fatal;
Log acquisition module 2 collects logs 2 from server 2 through the log filter plugin Grok, according to rule 2 of the log collection rules. The storage path on server 2 of each log 2 collected by log acquisition module 2 is storage path 2, the log format of each log 2 is the same as log format 2, and the log level of each log 2 is Warn, Error or Fatal;
Log acquisition module 3 collects logs 3 from server 3 through the log filter plugin Grok, according to rule 3 of the log collection rules. The storage path on server 3 of each log 3 collected by log acquisition module 3 is storage path 3, the log format of each log 3 is the same as log format 3, and the log level of each log 3 is Debug, Trace, Warn, Error or Fatal;
Log acquisition module 4 collects logs 4 from server 4 through the log filter plugin Grok, according to rule 4 of the log collection rules. The storage path on server 4 of each log 4 collected by log acquisition module 4 is storage path 4, the log format of each log 4 is the same as log format 4, and the log level of each log 4 is Error or Fatal;
Log acquisition module 5 collects logs 5 from server 5 through the log filter plugin Grok, according to rule 5 of the log collection rules. The storage path on server 5 of each log 5 collected by log acquisition module 5 is storage path 5, the log format of each log 5 is the same as log format 5, and the log level of each log 5 is Trace, Warn, Error or Fatal.
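The level sets listed above follow from a simple threshold comparison. The sketch below assumes the priority order implied by this example, namely Debug < Trace < Warn < Error < Fatal; note that this differs from frameworks that rank Trace below Debug. The function names are hypothetical.

```python
from typing import Dict, List

# Priority order implied by the example: Debug is lowest, Fatal is highest.
LEVEL_ORDER: List[str] = ["Debug", "Trace", "Warn", "Error", "Fatal"]
PRIORITY: Dict[str, int] = {name: rank for rank, name in enumerate(LEVEL_ORDER)}


def levels_collected(min_level: str) -> List[str]:
    """List every level at or above the level defined in a rule."""
    return LEVEL_ORDER[PRIORITY[min_level]:]


def should_collect(record_level: str, min_level: str) -> bool:
    """Decide whether one log passes the level filter of a rule."""
    return PRIORITY[record_level] >= PRIORITY[min_level]


# Levels defined by rules 1 to 5 in the example.
RULE_LEVELS = {"rule1": "Warn", "rule2": "Warn", "rule3": "Debug",
               "rule4": "Error", "rule5": "Trace"}
collected = {rule: levels_collected(level) for rule, level in RULE_LEVELS.items()}
```

With this ordering, `collected` reproduces the five level sets given in the example, for instance Error and Fatal for rule 4.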
Step 504: Using the enterprise-level search application server Solr, create indexes for the logs collected by each log acquisition module and store the logs.
In an embodiment of the present invention, after collecting logs, each log acquisition module sends the collected logs to the enterprise-level search application server Solr. Solr creates an index for each log it receives and stores the indexed logs in a unified way. The index created for a log may include some or all of the log's keywords, log level, log generation time and log format.
For example, the 5 log acquisition modules send the collected logs to Solr; after creating an index for each log, Solr stores all the logs in a unified way.
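To show what indexing by keyword, level, generation time and format means in practice, the following sketch builds a small in-memory inverted index. It is a stand-in for illustration only and does not use or reproduce Solr; the `LogIndex` class, its field names and the sample logs are assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Set, Tuple


class LogIndex:
    """In-memory stand-in for the index that the embodiment has Solr create."""

    def __init__(self) -> None:
        self.docs: List[Dict[str, str]] = []
        # Maps (field name, value) to the ids of the logs that contain it.
        self.postings: DefaultDict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, doc: Dict[str, str]) -> int:
        """Store one log and index its keywords, level, time and format."""
        doc_id = len(self.docs)
        self.docs.append(doc)
        for word in doc["message"].lower().split():
            self.postings[("keyword", word)].add(doc_id)
        for name in ("level", "time", "format"):
            self.postings[(name, doc[name])].add(doc_id)
        return doc_id

    def lookup(self, name: str, value: str) -> List[Dict[str, str]]:
        """Return the stored logs whose indexed field equals the given value."""
        ids = self.postings.get((name, value), set())
        return [self.docs[i] for i in sorted(ids)]


index = LogIndex()
index.add({"message": "Block report failed", "level": "Error",
           "time": "2017-08-17 10:15:30", "format": "format1"})
index.add({"message": "Slow block receiver", "level": "Warn",
           "time": "2017-08-17 10:16:02", "format": "format2"})
errors = index.lookup("level", "Error")
block_logs = index.lookup("keyword", "block")
```

A lookup touches only the posting set for the requested value rather than scanning every stored log, which is the property the embodiment relies on for fast retrieval.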
Step 505: Receive an externally input query instruction and obtain the target logs according to the query instruction.
In an embodiment of the present invention, query instructions input by the user are received in real time. According to the search parameters carried by a query instruction, the indexes created by Solr are queried to obtain the target logs corresponding to the query instruction. The search parameters carried by the query instruction include some or all of a keyword, a log level, a log generation time and a log format.
For example, if the query instruction input by the user carries the log level Error and the log format "log format 2", Solr is used to query for logs whose level is Error or Fatal and whose format is log format 2. Each log whose level is Error or Fatal and that is a running log of the component HDFS is obtained as a target log.
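One way to turn such a query instruction into a request for Solr's select handler is sketched below. The `q` and `wt` parameters and the `field:(A OR B)` syntax belong to Solr's standard query interface, but the field names `level`, `log_format` and `message` are assumptions about a schema that the embodiment does not specify, and values are not escaped in this sketch.

```python
from typing import Dict, List, Optional
from urllib.parse import parse_qs, urlencode

# Priority order implied by the example: a query for Error also returns Fatal.
LEVEL_ORDER: List[str] = ["Debug", "Trace", "Warn", "Error", "Fatal"]


def expand_level(level: str) -> List[str]:
    """Return the requested level and every level above it."""
    return LEVEL_ORDER[LEVEL_ORDER.index(level):]


def build_select_params(level: Optional[str] = None,
                        log_format: Optional[str] = None,
                        keyword: Optional[str] = None) -> Dict[str, str]:
    """Combine the search parameters that are present into one query string."""
    clauses: List[str] = []
    if level:
        clauses.append("level:(" + " OR ".join(expand_level(level)) + ")")
    if log_format:
        clauses.append(f'log_format:"{log_format}"')
    if keyword:
        clauses.append(f'message:"{keyword}"')
    # "*:*" matches every document when no search parameter is supplied.
    return {"q": " AND ".join(clauses) or "*:*", "wt": "json"}


params = build_select_params(level="Error", log_format="format2")
query_string = urlencode(params)  # ready to append to a .../select? URL
```

Expanding Error into "Error OR Fatal" reflects the example above, in which a query for level Error also returns Fatal logs.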
Step 506: Display each target log obtained.
In an embodiment of the present invention, after the target logs corresponding to the query instruction are obtained through Solr, each target log obtained is output in the form of a web page.
For example, each log whose level is Error or Fatal and that is a running log of the component HDFS is displayed in the form of a web page.
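Output in the form of a web page could be as simple as rendering the target logs into an HTML table, as in this sketch. The column names and the sample log are assumptions; the embodiment does not describe the page layout.

```python
from html import escape
from typing import Dict, List, Sequence


def render_logs(logs: List[Dict[str, str]],
                columns: Sequence[str] = ("time", "level", "message")) -> str:
    """Render target logs as an HTML table, escaping every cell value."""
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    rows = [
        "<tr>" + "".join(f"<td>{escape(str(log.get(c, '')))}</td>" for c in columns) + "</tr>"
        for log in logs
    ]
    return "<table>\n<tr>" + header + "</tr>\n" + "\n".join(rows) + "\n</table>"


page = render_logs([
    {"time": "2017-08-17 10:15:30", "level": "Error",
     "message": "Failed <write> to block"},
])
```

Escaping matters here because log messages are arbitrary text and may contain characters such as `<` that would otherwise be interpreted as markup.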
An embodiment of the present invention further provides a readable medium storing execution instructions. When a processor of a storage controller executes the execution instructions, the storage controller performs the big data cluster log management method provided by each of the above embodiments.
An embodiment of the present invention further provides a storage controller, comprising: a processor, a memory and a bus;
The memory is configured to store execution instructions, and the processor is connected to the memory through the bus. When the storage controller runs, the processor executes the execution instructions stored in the memory, so that the storage controller performs the big data cluster log management method provided by each of the above embodiments.
In summary, the big data cluster log management apparatus, method and system provided by the embodiments of the present invention have at least the following beneficial effects:
1. In the embodiments of the present invention, at least one log acquisition module is provided on each server included in the big data cluster. Each log acquisition module collects logs from the server on which it resides according to preset log collection rules, the log storage module stores the logs collected by each log acquisition module, and the log management module obtains the target logs from the log storage module according to the received query instruction and outputs them. Thus, after the log acquisition modules collect the logs on each server in the big data cluster, the log storage module stores them in a unified way. A user who needs to query the logs of the big data cluster only has to send a query instruction to the log management module, which obtains the target logs from the log storage module and sends them to the user. This reduces the time needed to obtain big data cluster logs and thereby improves the efficiency of managing them.
2. In the embodiments of the present invention, a log format is defined in the log collection rules, and the log format is defined in the form of a regular expression. The log format can therefore be defined according to the format of the logs produced by each component in the big data cluster, so that the logs of all types of components can be collected and managed, which improves the applicability of the big data log management method and apparatus.
3. In the embodiments of the present invention, a log level is defined in the log collection rules, so that during log collection only logs whose priority is equal to or higher than the defined log level are collected. By defining a log level, only higher-priority logs are collected, while routine lower-priority logs are not, which reduces the workload of the log acquisition modules and makes big data log management simpler and more effective.
4. In the embodiments of the present invention, logs can be collected from each server through the log filter plugin Grok. Grok collects from the server the logs that satisfy the parameters defined in the log collection rules, which ensures the accuracy of the collected logs.
5. In the embodiments of the present invention, before the collected logs are stored, Solr creates indexes for them. After a query instruction is received, the target logs corresponding to the query instruction can be obtained quickly through the indexes created by Solr, which speeds up the user's retrieval of target logs and ensures a good user experience.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device that comprises the element.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be implemented by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above are merely preferred embodiments of the present invention, intended only to illustrate its technical solutions and not to limit its scope of protection. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

  1. A big data cluster log management apparatus, characterized by comprising: a log storage module, a log management module and at least one log acquisition module;
    the at least one log acquisition module is located on at least one server included in a big data cluster, wherein at least one log acquisition module is provided on each server;
    each log acquisition module is configured to collect logs from the server on which it resides according to preset log collection rules, and to send the collected logs to the log storage module;
    the log storage module is configured to store the received logs;
    the log management module is configured to obtain the corresponding target log from the log storage module according to an externally input query instruction and output it.
  2. The apparatus according to claim 1, characterized in that
    the log acquisition module is configured to use the log filter plugin Grok to collect logs from the server on which it resides according to the storage path, log format and log level defined by the log collection rules; wherein the log format corresponds to the log format of the component running on the server where the log acquisition module is located, and the log level includes any one of the normal level Debug, the trace level Trace, the potential error level Warn, the error level Error and the severe error level Fatal.
  3. The apparatus according to claim 1, characterized in that
    the log storage module is configured to use the enterprise-level search application server Solr to create indexes for the received logs and then store them;
    the log management module is configured to query the indexes created by the log storage module according to at least one of a keyword, a log level, a log generation time and a log format carried by the query instruction, so as to obtain the target log corresponding to the query instruction and output it.
  4. The apparatus according to claim 1, characterized in that
    the log management module is further configured to update the log collection rules according to an externally input update instruction, and to send the updated log collection rules to each log acquisition module, so that each log acquisition module collects logs according to the updated log collection rules; wherein updating the log collection rules includes adding at least one of a storage path, a log format and a log level to the log collection rules, or modifying or deleting at least one of an existing storage path, log format and log level in the log collection rules.
  5. The apparatus according to any one of claims 1 to 4, characterized in that
    the log management module is configured to output the target log in the form of a web page.
  6. A big data cluster log management method, characterized in that log acquisition modules are set up on at least one server included in a big data cluster, wherein at least one log acquisition module is provided on each server, the method further comprising:
    using each log acquisition module to collect logs from the server where the log acquisition module is located, according to preset log collection rules;
    storing the collected logs;
    obtaining the corresponding target log from the stored logs according to an externally input query instruction, and outputting it.
  7. The method according to claim 6, characterized in that collecting logs from the server where the log acquisition module is located according to the preset log collection rules comprises:
    using the log filter plugin Grok to collect logs from the server according to the storage path, log format and log level defined by the log collection rules; wherein the log format corresponds to the log format of the component running on the server where the log acquisition module is located, and the log level includes any one of the normal level Debug, the trace level Trace, the potential error level Warn, the error level Error and the severe error level Fatal.
  8. The method according to claim 6, characterized in that
    before the collected logs are stored, the method further comprises:
    using the enterprise-level search application server Solr to create indexes for the collected logs;
    obtaining the corresponding target log from the stored logs according to the externally input query instruction and outputting it comprises:
    querying the indexes according to at least one of a keyword, a log level, a log generation time and a log format carried by the query instruction, so as to obtain the target log corresponding to the query instruction and output it.
  9. The method according to claim 6, characterized by further comprising:
    updating the log collection rules according to an externally input update instruction, and sending the updated log collection rules to each log acquisition module, so that each log acquisition module collects logs according to the updated log collection rules; wherein updating the log collection rules includes adding at least one of a storage path, a log format and a log level to the log collection rules, or modifying or deleting at least one of an existing storage path, log format and log level in the log collection rules.
  10. A big data cluster log management system, characterized by comprising: a big data cluster including at least one server, and the big data cluster log management apparatus according to any one of claims 1 to 5.
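The rule update recited in claims 4 and 9 (adding, modifying or deleting a storage path, log format or log level, then redistributing the rule) could be sketched as follows. The dictionary keys, the `Collector` class and the `broadcast` helper are hypothetical names introduced for illustration only.

```python
from typing import Dict, Iterable, List, Optional

Rule = Dict[str, str]  # hypothetical keys: store_path, log_format, level


def update_rule(rule: Rule,
                add: Optional[Rule] = None,
                modify: Optional[Rule] = None,
                delete: Iterable[str] = ()) -> Rule:
    """Return an updated copy of the rule; the original is left untouched."""
    updated = dict(rule)
    for key, value in (add or {}).items():
        if key in updated:
            raise KeyError(f"{key} is already defined; use modify instead")
        updated[key] = value
    for key, value in (modify or {}).items():
        if key not in updated:
            raise KeyError(f"{key} is not defined; use add instead")
        updated[key] = value
    for key in delete:
        del updated[key]
    return updated


class Collector:
    """Stands in for a log acquisition module that holds the current rule."""

    def __init__(self, rule: Rule) -> None:
        self.rule = dict(rule)

    def receive(self, rule: Rule) -> None:
        self.rule = dict(rule)


def broadcast(rule: Rule, collectors: List[Collector]) -> None:
    """Send the updated rule to every log acquisition module."""
    for collector in collectors:
        collector.receive(rule)


old_rule = {"store_path": "/path/to/hdfs/logs", "level": "Warn"}
collectors = [Collector(old_rule), Collector(old_rule)]
new_rule = update_rule(
    old_rule,
    add={"log_format": r"^(?P<level>\w+) (?P<message>.*)$"},
    modify={"level": "Error"},
)
broadcast(new_rule, collectors)
```

Returning a copy instead of editing the rule in place means a collector never observes a half-updated rule; it keeps the old rule until the complete new one arrives.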

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710706639.5A CN107451034A (en) 2017-08-17 2017-08-17 A kind of big data cluster log management apparatus, method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710706639.5A CN107451034A (en) 2017-08-17 2017-08-17 A kind of big data cluster log management apparatus, method and system

Publications (1)

Publication Number Publication Date
CN107451034A true CN107451034A (en) 2017-12-08

Family

ID=60492705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710706639.5A Pending CN107451034A (en) 2017-08-17 2017-08-17 A kind of big data cluster log management apparatus, method and system

Country Status (1)

Country Link
CN (1) CN107451034A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171208