CN107341258A - Log data collection method and system - Google Patents
Log data collection method and system
- Publication number
- CN107341258A (application CN201710564475.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- log
- flume
- branch
- read
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/389—Keeping log of transactions for guaranteeing non-repudiation of a transaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Computer Security & Cryptography (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention provides a log data collection method and system. The method includes: dividing the target area of the log data to be collected into at least two collection regions in advance, where each collection region includes a data center and at least one branch, and the log storage system is located at the data center of a first collection region among the at least two collection regions; collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs; and storing the log data of every collection region to the log storage system through the first-level Flume receiver at the data center of the first collection region. The log collection scheme of the present invention integrates high availability, high reliability, and high timeliness, and improves operating efficiency.
Description
Technical field
The present invention relates to data processing technology, and in particular to a log collection method and system.
Background art
At present, e-commerce and Internet finance are developing rapidly. While online transactions bring convenience to users, they also carry potential risks, such as stolen user accounts, financial fraud, and money laundering. Enterprises therefore have an increasingly strong need to monitor transaction risk. Traditionally, an enterprise sets up an internal risk-control department that analyzes user transactions offline and intervenes after suspicious data is found. With the development of big data technology, transaction risk control is gradually becoming digital and intelligent. By means of big data, transaction risk monitoring not only saves manpower and material resources and improves operating efficiency, but also effectively reduces losses caused by economic crime. Log data is an important source of the information needed when transaction risk monitoring applies big data mining and analysis; consequently, log collection systems of all kinds are widely used in enterprises.
Among the many log collection products in the prior art, Flume is a popular high-performance distributed open-source product. It provides an easily configured multi-tier collection framework and can efficiently collect log data from multiple data sources and save it into a central data warehouse. However, the prior-art log collection mode gathers data at file granularity; that is, a file is collected only after it has been completely generated. Because this limits collection speed, it cannot meet the real-time requirements of businesses with high timeliness demands, such as real-time marketing through online product recommendation in e-commerce, or transaction risk monitoring in financial systems that track customer transaction data.
To speed up log collection, the prior-art Flume offers a line-by-line collection method in the manner of tail -F, achieving continuous log collection. This approach, however, has a drawback: once an abnormal event occurs, such as an application service restart or log content being overwritten or deleted, data may be lost or erroneous half-line data may be collected, causing subsequent log analysis to go wrong.
Summary of the invention
To overcome the shortcomings of traditional log collection in handling abnormal events, such as data loss and log analysis errors, embodiments of the present invention provide a log data collection method. The method includes:
dividing the target area of the log data to be collected into at least two collection regions in advance, where each collection region includes a data center and at least one branch, and the log storage system is located at the data center of a first collection region among the at least two collection regions;
collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs;
storing the log data of every collection region to the log storage system through the first-level Flume receiver at the data center of the first collection region.
In an embodiment of the present invention, collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs, includes:
transmitting the collected log data of a branch's web servers, through the Flume receiver located at the branch, to the Flume receiver at the data center of the collection region to which that branch belongs.
In an embodiment of the present invention, storing the log data of every collection region to the log storage system through the first-level Flume receiver at the data center of the first collection region includes:
transmitting the collected log data of a data center's web servers, through that data center's second-level Flume receiver, to the first-level Flume receiver at the data center of the first collection region for storage in the log storage system;
transmitting the collected log data of web servers at branches of the first collection region, through the branches' Flume receivers, to the first-level Flume receiver for storage in the log storage system;
transmitting the collected log data of web servers at branches outside the first collection region, through the second-level Flume receiver of the corresponding data center, to the first-level Flume receiver for storage in the log storage system.
In an embodiment of the present invention, the second-level Flume receivers of the data centers outside the first collection region access the first-level Flume receiver through a dedicated high-speed network line.
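As an illustration of the tiered deployment described above, a second-level Flume agent at a data center outside the first collection region might be configured as follows. This is only a hedged sketch, not the patented configuration: the agent name, ports, and hostname (dc2-tier1.example.internal) are hypothetical; only the Avro source/sink pattern reflects standard Flume usage.

```properties
# Hypothetical second-level Flume agent at a non-first-region data center.
# It receives Avro events from the branch (third-level) receivers and
# forwards them, over the dedicated high-speed line, to the first-level
# Flume receiver at the data center of the first collection region.
agent2.sources  = branchIn
agent2.channels = mem
agent2.sinks    = toTier1

agent2.sources.branchIn.type = avro
agent2.sources.branchIn.bind = 0.0.0.0
agent2.sources.branchIn.port = 4141
agent2.sources.branchIn.channels = mem

agent2.channels.mem.type = memory
agent2.channels.mem.capacity = 10000

agent2.sinks.toTier1.type = avro
# Hypothetical address of the first-level receiver, reached via the
# dedicated network line between the data centers.
agent2.sinks.toTier1.hostname = dc2-tier1.example.internal
agent2.sinks.toTier1.port = 4545
agent2.sinks.toTier1.channel = mem
```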
In an embodiment of the present invention, collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs, includes:
reading the log data of a web server in units of data blocks and writing it into a transfer queue;
sending the log data in the transfer queue to a Flume receiver;
determining the downstream destination of the log data according to the location type of the Flume receiver.
In an embodiment of the present invention, the method includes:
presetting a splitting rule for the log data of the web server, where the splitting rule includes splitting the log data by size or by time;
generating the log files that store the log data according to the preset splitting rule of the web server's log data.
In an embodiment of the present invention, reading the log data of the web server in units of data blocks and writing it into the transfer queue includes:
step 1, pointing a file pointer at the log to be collected;
step 2, reading the log data in the current log file in units of data blocks, starting from a specified offset;
step 3, reading characters one by one from the data block and placing them into a cache;
step 4, extracting the characters from the cache line by line and writing them into the transfer queue.
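Steps 1 to 4 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the block size, file path, and function name are hypothetical, and decoding each block independently is a simplification.

```python
from collections import deque

BLOCK_SIZE = 4096  # bytes per read; an illustrative value, not from the patent

def read_log_blocks(path, offset, transfer_queue):
    """Sketch of steps 1-4: point at the log (step 1), read it block by
    block from the given offset (step 2), scan characters into a cache
    (step 3), and push each complete line onto the transfer queue
    (step 4). Returns the offset just past the last complete line, so a
    half-written trailing line is re-read on the next call."""
    cache = []           # step-3 character cache for the current line
    consumed = offset    # offset of the end of the last complete line
    with open(path, "rb") as f:
        f.seek(offset)                      # step 1: position the pointer
        while True:
            block = f.read(BLOCK_SIZE)      # step 2: read one data block
            if not block:                   # block tail, no new character
                break
            # Step 3: decode per block (a simplification; multi-byte
            # characters split across blocks are not handled here).
            for ch in block.decode("utf-8", errors="replace"):
                cache.append(ch)
                if ch == "\n":              # step 4: a full line is ready
                    line = "".join(cache)
                    transfer_queue.append(line.rstrip("\n"))
                    consumed += len(line.encode("utf-8"))
                    cache.clear()
    return consumed

q = deque()  # stand-in for the transfer queue shared with the sending module
with open("/tmp/demo.log", "wb") as f:
    f.write(b"alpha\nbeta\ngam")            # two full lines, one half line
new_offset = read_log_blocks("/tmp/demo.log", 0, q)
# q holds ["alpha", "beta"]; new_offset == 11, so the half line "gam"
# is not collected until the application finishes writing it
```

Returning the offset of the last complete line is what prevents the half-line errors described in the background section: an unfinished line is simply re-read later rather than shipped downstream.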
In an embodiment of the present invention, reading characters one by one from the data block and placing them into the cache includes:
judging whether a new character can be read, to determine whether the tail of the data block has been reached;
when it is determined that the tail of the data block has been reached, performing step 2.
In an embodiment of the present invention, extracting the characters from the cache line by line and writing them into the transfer queue includes:
judging whether the newly read character is a newline;
when the character read is a newline, extracting the characters in the cache and writing them into the transfer queue;
when the character read is not a newline, performing step 3.
In an embodiment of the present invention, before performing step 2 upon reaching the tail of the data block, log anomaly detection is further performed;
when a log anomaly is determined, step 2 is performed after the pointer offset is reset.
In an embodiment of the present invention, the method further comprises, before performing step 2: determining whether there is a newly added log file; where,
when there is no newly added log file, step 2 is performed;
when there is a newly added log file, the newly added log file is designated as the next log file to be read once the current log file has been read through.
In an embodiment of the present invention, before performing step 3 the method includes:
judging whether a data block has been read;
when a data block has been read, performing step 3;
when no data block has been read, waiting for a preset specified time.
In an embodiment of the present invention, when no data block has been read, log anomaly detection is performed after waiting the preset specified time; when a log anomaly is determined, the pointer offset is reset and the step of determining whether there is a newly added log file is performed.
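One plausible form of the log anomaly detection mentioned above is to compare the file's current size against the saved read offset; this is an assumption for illustration, since the text does not fix the exact checks. A missing or shrunken file suggests the log was deleted, truncated, or overwritten, and the pointer offset is reset.

```python
import os

def check_log_anomaly(path, offset):
    """Hypothetical anomaly check: if the log file has disappeared, or is
    now shorter than the saved read offset, it was deleted, truncated, or
    overwritten (e.g. across an application restart). The pointer offset
    is then reset to 0 so collection restarts from the file's beginning."""
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return 0          # log deleted: reset the pointer offset
    if size < offset:
        return 0          # log truncated or overwritten: reset as well
    return offset         # no anomaly: keep reading from the saved offset

# Example: the collector had read 500 bytes, then the log was rewritten
# and is now only 100 bytes long.
with open("/tmp/anomaly.log", "wb") as f:
    f.write(b"x" * 100)
resumed = check_log_anomaly("/tmp/anomaly.log", 500)
# resumed == 0: the truncation was detected and the offset reset
```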
Meanwhile the present invention also provides a kind of log data acquisition system, including:
Region division device, for the target area of daily record data to be collected to be divided into at least two pickup areas, respectively
Pickup area includes:Data center and at least one branch, daily record storage system are located at least two acquisition zone
The data center of first pickup area in one of domain;
Log data acquisition device, for gathering the daily record data of the web server in each branch and data center,
And the daily record data of each branch of collection is transferred to the data center of branch place pickup area;
Flume Primary Receives end, the data center of first pickup area is arranged at, for by the day of each pickup area
Will data storage is to daily record storage system.
In an embodiment of the present invention, the log data collection device includes:
collection clients, deployed at each branch and data center, which collect the log data of the web servers in each branch and data center;
Flume receivers, deployed at each branch and data center; where
the collected log data of a data center's web servers is transmitted, through that data center's second-level Flume receiver, to the first-level Flume receiver for storage in the log storage system;
the collected log data of web servers at branches of the first collection region is transmitted, through the branches' Flume receivers, to the first-level Flume receiver for storage in the log storage system;
the collected log data of web servers at branches outside the first collection region is transmitted, through the branches' Flume receivers, to the second-level Flume receiver of the corresponding data center, and then transmitted by that second-level Flume receiver to the first-level Flume receiver for storage in the log storage system.
In an embodiment of the present invention, the second-level Flume receivers of the data centers outside the first collection region access the first-level Flume receiver through a dedicated high-speed network line.
In an embodiment of the present invention, the collection device includes:
a reading module, which reads the log data of the web server in units of data blocks and writes it into the transfer queue;
a transfer module, configured to send the log data in the transfer queue to a Flume receiver;
where the Flume receiver determines the downstream destination of the log data according to its location type.
In an embodiment of the present invention, the reading module includes:
a rule presetting unit, for presetting the splitting rule of the web server's log data, where the splitting rule includes splitting the log data by size or by time;
a splitting unit, for generating the log files that store the log data according to the preset splitting rule.
In an embodiment of the present invention, the reading module reading the log data of the web server in units of data blocks and writing it into the transfer queue includes:
step 1, pointing a file pointer at the log to be collected;
step 2, reading the log data in the current log file in units of data blocks, starting from a specified offset;
step 3, reading characters one by one from the data block and placing them into a cache;
step 4, extracting the characters from the cache line by line and writing them into the transfer queue.
In an embodiment of the present invention, the reading module also includes a log anomaly detection module, for performing log anomaly detection.
In an embodiment of the present invention, the reading module also includes:
a block-tail judging unit, for judging whether a new character can be read to determine whether the tail of the data block has been reached;
when it is determined that the tail of the data block has been reached, step 2 is performed.
In an embodiment of the present invention, the reading module also includes:
a newline judging unit, for judging whether the newly read character is a newline;
when the character read is a newline, the characters in the cache are extracted and written into the transfer queue;
when the character read is not a newline, step 3 is performed.
In an embodiment of the present invention, before step 2 is performed upon reaching the tail of the data block, the log anomaly detection module performs log anomaly detection; when a log anomaly is determined, step 2 is performed after the pointer offset is reset.
In an embodiment of the present invention, the reading module also includes:
a new-log judging unit, for determining whether there is a newly added log file; where,
when there is no newly added log file, step 2 is performed;
when there is a newly added log file, the newly added log file is designated as the next log file to be read once the current log file has been read through.
In an embodiment of the present invention, the reading module also includes:
a data-block judging unit, for judging whether a data block has been read;
when a data block has been read, step 3 is performed;
when no data block has been read, the method waits for a preset specified time.
When no data block has been read, after waiting the preset specified time, the log anomaly detection module performs log anomaly detection; when a log anomaly is determined, the pointer offset is reset and the step of determining whether there is a newly added log file is performed.
The present invention proposes an improved log collection scheme that integrates high availability, high reliability, and high timeliness, and improves operating efficiency. Using this technique, log data collection efficiency and reliability can be improved, and normal operation of the production system can be guaranteed.
To make the above and other objects, features, and advantages of the present invention more apparent, preferred embodiments are set forth below and described in detail with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the log data collection method disclosed by the present invention;
Fig. 2 is a schematic diagram of the real-time log collection system of an embodiment of the present invention;
Fig. 3 is a flow chart of real-time log collection in an embodiment of the present invention;
Fig. 4 is a flow chart of the collection algorithm of the real-time log collection system in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The present invention provides a log data collection method. As shown in Fig. 1, the method includes:
step S101, dividing the target area of the log data to be collected into at least two collection regions in advance, where each collection region includes a data center and at least one branch, and the log storage system is located at the data center of a first collection region among the at least two collection regions;
step S102, collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs;
step S103, storing the log data of every collection region to the log storage system through the first-level Flume receiver at the data center of the first collection region.
Here, collecting the log data of the web servers in each branch and data center, and transferring the log data collected at each branch to the data center of the collection region to which that branch belongs, includes:
transmitting the collected log data of a branch's web servers, through the Flume receiver located at the branch, to the Flume receiver at the data center of the collection region to which that branch belongs;
transmitting the collected log data of a data center's web servers, through that data center's second-level Flume receiver, to the first-level Flume receiver at the data center of the first collection region for storage in the log storage system;
transmitting the collected log data of web servers at branches of the first collection region, through the branches' Flume receivers, to the first-level Flume receiver for storage in the log storage system;
transmitting the collected log data of web servers at branches outside the first collection region, through the second-level Flume receiver of the corresponding data center, to the first-level Flume receiver for storage in the log storage system.
In an embodiment of the present invention, the second-level Flume receivers of the data centers outside the first collection region access the first-level Flume receiver through a dedicated high-speed network line.
Fig. 2 is a schematic diagram of the real-time log collection system of an embodiment of the present invention. In this embodiment, multiple regions are marked off in advance according to collection geography: the target area of the logs to be collected is first divided into region 1 and region 2, and a data center is set up in each. Data center 1 is responsible for collecting the log data of region 1, while data center 2 is responsible for gathering and storing the log data of both region 1 and region 2.
In this embodiment, a dedicated network line is set up between data center 1 and data center 2, so the log data of data center 1 can be transferred to data center 2 at high speed through this channel. Further, when the collection operation is actually deployed, four classes of log collection areas are distinguished: data center 1, data center 2, branch 1, and branch 2. In this embodiment of the invention, the difference between the branches (branch 1 and branch 2) and the data centers (data center 1 and data center 2) is that a branch only collects the log data inside its own organization, while a data center gathers the log data of the whole data center and of all branches in its region (if multiple branches are deployed).
In this embodiment, data center 2 is where the log storage system is located; besides handling the log data of its own region, data center 2 further gathers the log data of the other data centers. It should be noted that the above division in this embodiment of the invention is not absolute; an optimal division can be made for each collection case based on its actual conditions, following the idea of "divide and rule".
Inside each collection area there is a small collection system, composed of multiple collection clients and multiple Flume receivers, running in a LAN intranet environment. A collection client runs on a web application server in a non-intrusive manner and is responsible for collecting the logs on its local server in real time; because it occupies few of the web application server's computing resources, it puts no pressure on the server's business applications.
The present invention makes innovative designs for high availability, high reliability, and high timeliness of client log collection. To pursue low-cost long-distance data transmission, the invention introduces a high-speed network channel through which the log data of distant collection regions can be transmitted to the location of the log storage system. To relieve the file-write pressure and system overhead on the HDFS log storage system, the invention designs a multi-tier Flume pattern in which data can only be written to HDFS stage by stage through the Flume receivers. Using this technique, log data collection efficiency and reliability can be improved, and normal operation of the production system can be guaranteed.
In this embodiment the Flume receivers run on Flume servers and are responsible for gathering the collected data in real time and sending it to downstream nodes, using mature Flume collection technology. Here, a downstream node is the terminal to which a Flume receiver sends its log data; a downstream node may itself be a Flume receiver, or it may be the log storage system. The data of every small collection system is finally summarized in the log storage system of one of the collection systems for subsequent mining; in the case of the present invention, this is the HDFS file system of data center 2. To use server resources rationally, the above deployment strategy follows a simple, flexible principle: whether in a data center or a branch, the numbers of collection clients and Flume receivers are adjusted flexibly according to the size of the collection task.
According to the position of a Flume receiver relative to the HDFS file system, the Flume receivers are arranged in multiple tiers. A first-level Flume receiver is responsible for receiving log data from the second-level Flume receivers, gathering it, and writing it into HDFS; a second-level Flume receiver is responsible for receiving the log data of third-level Flume receivers and forwarding it to a first-level Flume receiver; and so on. Here, the level only indicates how many Flume receivers (including itself) the log data of that level's receiver must pass through before finally reaching the HDFS file system; the receivers' functions do not differ. For example, the data of a third-level Flume receiver must pass through three Flume receivers (including itself) before it is finally stored in the HDFS file system. Direct data transfer between Flume receivers uses the Avro protocol. Taking data center 2 in Fig. 2 as an example, two tiers of Flume receivers are deployed at the center: the second-level Flume receivers forward the logs gathered by the collection clients to the first-level Flume receiver, and the first-level Flume receiver receives the data from inside the center, from data center 1, and from the second-level Flume receivers of branch 2, and writes it into HDFS. Inside data center 1 and branch 2, only a single tier of second-level Flume receivers is deployed, responsible for forwarding log data to the first-level receiver of data center 2, which then stores it into HDFS. The difference is that branch 2 is geographically close to data center 2, so its data travels over an ordinary network line, while the data of data center 1 obtains higher transmission performance through the dedicated network line between data centers 1 and 2. For branch 1, the data of its collection clients is first forwarded via third-level Flume receivers to the second-level Flume receivers of data center 1, and finally stored into HDFS via the high-speed network channel together with the data of data center 2.
It should be noted that the numbers and arrangement of collection clients and Flume receivers in the above embodiment are not drawn from actual deployments; in practice, the specific numbers of collection clients and Flume receivers may vary arbitrarily, and the arrangement may be more complex.
In the real-time log collection system of the present invention, the log data stream passes in turn through the collection clients and the third-, second-, and first-level Flume receivers, and finally flows into the HDFS log storage system. In this embodiment, the collection clients and the Flume receivers share the same internal structure: each is divided into a reading module and a sending module. A further feature is that the two modules jointly manage one log data transfer queue, which buffers log data; the reading module and the sending module perform read/write operations on the queue asynchronously, without interfering with each other's operation. Fig. 3 is a flow chart of real-time log collection in an embodiment of the present invention.
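The shared transfer queue and the asynchronous reading/sending modules can be sketched with Python's thread-safe queue. This is a hedged illustration only: the module names, the sentinel mechanism, and the list standing in for the Avro send to the downstream Flume receiver are all assumptions.

```python
import queue
import threading

transfer_q = queue.Queue(maxsize=1000)  # the shared log-line buffer
_DONE = object()                         # sentinel telling the sender to stop

def reading_module(lines):
    """Parses log lines (here: a given list) and writes them to the queue."""
    for line in lines:
        transfer_q.put(line)             # blocks only when the queue is full
    transfer_q.put(_DONE)

def sending_module(out):
    """Drains the queue independently of the reader; the append below is a
    stand-in for an Avro send to the downstream Flume receiver."""
    while True:
        item = transfer_q.get()
        if item is _DONE:
            break
        out.append(item)

sent = []
r = threading.Thread(target=reading_module, args=(["login ok", "tx 42 denied"],))
s = threading.Thread(target=sending_module, args=(sent,))
r.start(); s.start()
r.join(); s.join()
# sent == ["login ok", "tx 42 denied"]: reader and sender never block each other
```

Because `queue.Queue` is internally locked, neither module needs to coordinate with the other beyond the queue itself, which matches the non-interference property described above.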
Step 101: the reading module of the log collection client points a pointer at the file to be collected, and dynamically reads and parses the file in units of data blocks into the queue.
In this embodiment, the application programs on the Web servers manage the log data of the system in log files, stored under a log collection directory. To prevent any single log file from growing too large to inspect, the capacity of each log file can be controlled by a specified size or time; the log collection directory therefore contains multiple log files, one of which is the current log file while the rest are archived log files. The application presets a log splitting rule; splitting by size and splitting by time are currently supported. When splitting by size, each log file is guaranteed to be of the set size; when splitting by time, one log file is generated per minute. Once the amount of data written to the current log file reaches the splitting threshold, writing to the current file stops and a new log file is generated for subsequent writes. The application turns the current log file into a historical archive file by appending a numeric suffix to its name, and the newly generated file takes over the name of the current log file.
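The size-based splitting rule described above can be sketched as follows. The file names, byte threshold, and suffix bookkeeping are illustrative assumptions, not details taken from the embodiment:

```python
import os

def rotate_if_needed(log_path, max_bytes, archive_index):
    """Rotate the current log file once it reaches the size limit.

    Mirrors the splitting rule above: the full current file is renamed
    with a numeric suffix (becoming an archive file) and a fresh file
    takes over the current-log name. Names and thresholds are
    hypothetical, for illustration only.
    """
    if os.path.exists(log_path) and os.path.getsize(log_path) >= max_bytes:
        os.rename(log_path, "%s.%d" % (log_path, archive_index))
        open(log_path, "w").close()    # new empty current log file
        return archive_index + 1       # next numeric suffix to use
    return archive_index

# usage sketch: write 1 KiB, which triggers a rotation at the 1 KiB limit
idx = 1
with open("/tmp/app.log", "w") as f:
    f.write("x" * 1024)
idx = rotate_if_needed("/tmp/app.log", 1024, idx)   # renames to app.log.1
```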
When collecting log data, the read module reads the newly generated log files one by one, block by block. Specifically, the module points its read pointer at the current log file and reads one segment of log content at a time, with the data block as the basic unit, then parses logs line by line out of the content read. If a read finds that no new log data has been generated, the module records the position for the next log read in its pointer, pauses for a specified time, and waits for new log content to be generated. After the wait ends, the module first checks whether a new log file was generated during the waiting period. If so, the previously read log file will receive no further data; the module first reads through the remaining content of the earlier file from the last pointer position, then points the pointer at the newly generated file to start reading new data. If not, the log file being read has not yet reached the designated switching capacity, and the module continues reading and parsing content in units of data blocks. Further, a log anomaly detection step is added to the collection process, so that the collection strategy can be corrected in time when a log is deleted or its content is overwritten, guaranteeing the correctness and completeness of the collected data. The detailed steps of this collection algorithm are shown in Fig. 3.
Step 102: The send module of the log collection client is responsible for reading data rows from the client's transfer queue and sending them to the downstream Flume servers.
Each client is internally configured with an address list of the downstream Flume receiving ends to which data may be transmitted, and the send module connects to the Flume receiving ends using a load-balancing mechanism. After the collection client starts, the send module randomly selects an address from this list and establishes a long-lived connection to the Flume receiving end at that address. To improve transmission efficiency, the send module reads several data rows from the transfer queue at a time; the number of rows is a preset value, and if fewer rows remain in the transfer queue, the remaining rows are read. The module packs the rows read into a single data packet and sends it to the corresponding Flume receiving end over TCP.
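The batching behaviour of the send module can be sketched as follows. The JSON framing and the queue API are assumptions; the embodiment only specifies that up to a preset number of rows are packed into one packet and sent over TCP:

```python
import json
import queue

def next_batch(transfer_queue, batch_size):
    """Drain up to batch_size data rows from the transfer queue and
    pack them into one payload, as in step 102. If fewer rows remain,
    whatever is available is packed; if the queue is empty, there is
    nothing to send. JSON is a stand-in wire format for illustration."""
    rows = []
    while len(rows) < batch_size:
        try:
            rows.append(transfer_queue.get_nowait())
        except queue.Empty:
            break
    if not rows:
        return None
    return json.dumps(rows).encode("utf-8")   # body of one TCP packet

# usage sketch: three rows, batch size two
q = queue.Queue()
for line in ("log line 1", "log line 2", "log line 3"):
    q.put(line)
payload = next_batch(q, 2)   # packs the first two rows, one remains
```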
Step 103: Select the downstream destination according to the type of location of the Flume receiving-end server.
According to the characteristics of the region in which a Flume receiving-end server is located, the Flume receiving ends of different regions apply different processing in the data transmission phase.
As shown in Fig. 3, if the Flume server is located in branch 1, the data is sent to data center 1 and the flow turns to step 104; if the Flume server is located in data center 1 or branch 2, the data is sent to data center 2 and the flow turns to step 105; if the Flume server is located in data center 2, the data need not be sent to any external region, and the flow turns to step 106.
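The three-way routing decision of step 103 can be expressed directly as a lookup. The string region labels are illustrative stand-ins for however the embodiment encodes server locations:

```python
def next_hop(region):
    """Route data according to the receiving end's location, as in
    step 103. Region names mirror the embodiment (branch 1, branch 2,
    data center 1, data center 2); returning None means the data stays
    local and is uploaded to HDFS."""
    if region == "branch 1":
        return "data center 1"            # step 104: two-hop path
    if region in ("data center 1", "branch 2"):
        return "data center 2"            # step 105: to the first-level end
    if region == "data center 2":
        return None                       # step 106: upload to HDFS locally
    raise ValueError("unknown region: %r" % region)
```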
Step 104: The third-level Flume receiving end sends data to the second-level Flume receiving end located at data center 1.
In the system hierarchy, the Flume receiving ends located in branch 1 are called third-level receiving ends. The higher the level number of a Flume receiving end, the more Flume receiving ends the data transmission must pass through, and the more transmission time is consumed. If the data were sent directly to data center 2 over ordinary lines, the timeliness of the transmission could not be guaranteed. To avoid high latency, the Flume receiving ends of branch 1 forward the data to data center 1, and the Flume receiving ends of data center 1 then carry out the next step of transmission over the high-speed network channel. Experimental tests verified that the transmission time of this two-hop scheme is far smaller than that of a direct long-distance transmission.
Step 105: Data center 1 and branch 2 transmit data to the first-level Flume end of data center 2, via the high-speed channel and the ordinary network respectively.
In the system hierarchy, the Flume receiving ends located in data center 1 and branch 2 are called second-level receiving ends. The data received by a second-level Flume receiving end is transmitted directly to the first-level Flume receiving end of data center 2. In this embodiment, because a high-speed network path is established directly between data center 1 and data center 2, the Flume receiving ends of data center 1 can send data to data center 2 through that high-speed channel, while branch 2 transmits its data to data center 2 over the ordinary network.
Step 106: The first-level Flume receiving end of data center 2 receives the data from inside and outside this center and finally uploads it to HDFS.
The first-level Flume receiving end of data center 2 receives not only the log data from the local second-level receiving ends but also the log data from the second-level receiving ends of branch 2 and data center 1. The present system aggregates the log data of all regions through the first-level Flume receiving end, rather than having each region independently upload its log data to the HDFS log storage system. One reason for this design is that, while aggregating the data, the first-level Flume receiving end can merge the log data of related applications, reducing the final number of log files and thereby relieving the pressure that large numbers of small files place on the HDFS file system. A second reason is that the system allows only the first-level Flume receiving end to access HDFS, which not only simplifies HDFS access management but also, from a network security perspective, reduces the possibility of external server ends compromising the HDFS system.
Fig. 4 shows the flow chart of the collection algorithm of the real-time log acquisition system in this embodiment of the invention. In this embodiment, once the collection client has started, it carries out log collection without pause unless forcibly interrupted from outside. Its read module runs the same collection algorithm repeatedly; the algorithm flow is as follows:
Step 10101: The file pointer points at the log to be collected, and the initial pointer offset is set to 0.
Under normal circumstances, collection of log data starts from the head of the file; the read file pointer therefore initially points at the first character of the first line of the log file data.
Step 10102: Determine whether there are other newly added log files.
Each time before the read module starts reading the data of a new log file from the beginning, or before it continues reading after waiting because no data was read, it must perform this check. If there is currently only one log to be collected, the flow turns to step 10103; if a new log file has been generated besides the log the pointer currently points at, the flow turns to step 10112.
Step 10103: The file pointer reads the current log file data from the specified offset, in units of blocks.
The present invention reads data with the data block as the basic unit. A data block is a segment of log data with a fixed byte length; only the last block, read at the end of the file, may fall short of the specified byte length. A data block contains several data rows, and the head and tail of a block may be incomplete half-rows; the extraction of data rows is described in steps 10105 to 10108.
Step 10104: Determine whether a data block was read.
Because the writing of the log file by the client Web server application is a dynamic process, the read module cannot guarantee that every read returns a log data block. Therefore, after each read operation completes, a check is made to see whether data was actually read. If data was read, the flow turns to step 10105; if not, it turns to step 10110.
Step 10105: Read characters from the data block one by one and append them to the end of a dynamic character array.
After a data block is read, it is put into the collection client's cache, which is responsible for holding the data awaiting parsing. A dedicated pointer scans the characters in the read cache: the read pointer reads characters starting from the first character position of the cache, appends each character to the end of a dynamic character array, and then scans the next character. The dynamic character array holds the intermediate state of a data row; its length grows as more characters are stored.
If partial data remains in the dynamic character array after the previous data block was read, the array retains the leftover, not-yet-extracted data from last time while the current block is parsed: the newly scanned characters are appended to the end of the dynamic character array until a full line of data is parsed out, and then the extraction operation is performed again.
Step 10106: Determine whether the end of the data block has been read.
Because the pointer's scanning of characters from the read block cache is a cyclic read operation, after each read the module checks whether a new character was read. If a new character was read, the data block has not yet been exhausted, and the flow turns to step 10107. If no new character can be read, i.e. the end of the data block has been reached, the current data block has been fully read and a new data block must be fetched, so the flow turns to step 10109.
Step 10107: Determine whether the newly read character satisfies the data-row extraction requirement.
If the newest character read is the newline '\n', the dynamic character array satisfies the extraction requirement of a data row, and the flow turns to step 10108; if the character is an ordinary character other than the newline, reading continues with the next character, and the flow turns to step 10105.
Step 10108: Parse and extract one data row, and write the data row into the client's transfer queue.
The data stored in the dynamic character array now satisfies the extraction requirement of a data row; the data row is copied into the transfer queue, after which the content of the dynamic character array is emptied.
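Steps 10105 to 10108 amount to a character scan with a carry buffer for the half-rows at block boundaries. A minimal sketch, leaving out the transfer-queue write:

```python
def extract_rows(block, carry):
    """Scan one data block character by character (steps 10105-10108):
    append each character to a dynamic buffer, and whenever a newline
    is seen, emit the buffered characters as one complete data row.
    `carry` holds the half-row left over from the previous block; the
    leftover half-row of this block is returned for the next call."""
    rows = []
    buf = list(carry)
    for ch in block:
        buf.append(ch)
        if ch == "\n":
            rows.append("".join(buf[:-1]))   # full row, newline stripped
            buf = []
    return rows, "".join(buf)                # rows plus the new half-row

# usage sketch: a row split across two blocks is reassembled via carry
rows, carry = extract_rows("alpha\nbe", "")     # one row, half-row "be"
rows2, carry2 = extract_rows("ta\ngamma\n", carry)
```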
Step 10109: Perform log anomaly detection, and reset the pointer offset if an anomaly is found.
Before the next data block is read, log anomaly detection must be carried out. This is because the client Web server application may restart, shut down, or otherwise behave abnormally while running, in which case the contents of the corresponding log file change; for example, the original log file may be overwritten from the beginning by new log data. The flow of log anomaly detection is as follows. First, the read module updates the offset of the next log read; the new offset equals the offset of this read plus the byte length of the data block read. Second, the current log file size is compared with this offset. For clarity in what follows, let the new offset be the A value and the current log file size the B value. If A is greater than B, it is concluded that the log file has been overwritten abnormally; in that case the log data must be read again from the beginning, and the pointer offset for the next read of the log file is reset to 0. If A is less than B, the number of characters held in the dynamic character array (denoted the C value) is further obtained, and the read pointer of the log file is moved to offset A-C (the original pointer position, i.e. the start of the half-row held in the dynamic array); the character immediately before that position is then scanned and checked. If its content is not the newline '\n', an overwrite anomaly has occurred, and the pointer offset for the next log read is reset to 0. If its content is the newline '\n', no anomaly has occurred, and the pointer offset for the next file read is the A value.
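The overwrite check of step 10109 can be sketched as follows, with A, B, and C as defined above. The exact position of the newline probe is our reading of the text and may differ in detail from the actual implementation:

```python
import os

def detect_overwrite(path, next_offset, carried_chars):
    """Overwrite check from step 10109. A = next read offset, B =
    current file size, C = characters held in the dynamic array. If
    A > B the file was overwritten from the beginning, so reading
    restarts at offset 0. Otherwise the byte just before the carried
    half-row (position A-C-1) should still be a newline; if it is not,
    an overwrite is likewise assumed. Returns the offset to use next."""
    size = os.path.getsize(path)          # B value
    if next_offset > size:                # A > B: file shrank/overwritten
        return 0
    check_pos = next_offset - carried_chars - 1
    if check_pos >= 0:
        with open(path, "rb") as f:
            f.seek(check_pos)
            if f.read(1) != b"\n":        # row boundary no longer intact
                return 0
    return next_offset                    # no anomaly detected

# usage sketch: "row1\n" plus a carried half-row "row" (A=8, C=3)
with open("/tmp/ovw_check.log", "wb") as f:
    f.write(b"row1\nrow")
ok = detect_overwrite("/tmp/ovw_check.log", 8, 3)      # no anomaly
with open("/tmp/ovw_check.log", "wb") as f:
    f.write(b"xy")                                     # file overwritten
reset = detect_overwrite("/tmp/ovw_check.log", 8, 3)   # restart at 0
```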
Step 10110: No data was read; wait for the specified time.
If the Web server side writes no log data to the file for some period, the pointer sits at the end of the file and no data content can be read. Since the collection client cannot predict when new data will become readable, it waits for a period of time and then retries the read.
Step 10111: Perform log anomaly detection, and reset the pointer offset if an anomaly is found.
Step 10111 is similar to step 10109 and is not repeated here. The difference is the updating of the offset when no log anomaly is detected: in step 10109 the offset of the next log read is increased, relative to the offset of this read, by the byte length of the data block read (denoted the D value), i.e. it becomes A+D; in step 10111 the offset of the next log read remains the offset of this read, i.e. A.
Further, for clients deployed on the Linux operating system, step 10111 adds to step 10109 a check for whether the log has been deleted. Specifically, if the collection module still obtains no data block after reading a preset number of times, detection of file deletion is triggered. During detection, the module first checks whether a file with the specified file name exists. If no file with the specified file name exists, it is directly concluded that the current file has been deleted and no new file has yet been generated, and the module waits for a new file to be generated. If a file with the specified file name does exist, the module obtains, via the existing stat command, the "device id + inode number" string of the file with the specified file name and compares it with the string of the opened file; if they differ, it is concluded that the original file has been deleted and a new file has been generated, in which case the file read pointer is pointed at the new file and the offset of the next log read is set to 0.
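The Linux deletion check can be sketched with os.stat, which exposes the same "device id + inode number" identity the stat command prints. The three return states mirror the three outcomes described above:

```python
import os

def check_deletion(path, opened_dev_ino):
    """Deletion check from step 10111 (Linux clients): compare the
    (device id, inode number) of the file currently at `path` with the
    identity recorded when the file was opened. Returns "deleted" when
    no file of that name exists (wait for a new one), "replaced" when a
    new file has taken the old name (reopen at offset 0), otherwise
    "unchanged"."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "deleted"
    if (st.st_dev, st.st_ino) != opened_dev_ino:
        return "replaced"
    return "unchanged"

# usage sketch: keep the original file open so its inode stays allocated
path = "/tmp/inode_check.log"
fh = open(path, "w")
st = os.stat(path)
opened = (st.st_dev, st.st_ino)
state0 = check_deletion(path, opened)   # "unchanged"
os.remove(path)
state1 = check_deletion(path, opened)   # "deleted"
open(path, "w").close()                 # a new file takes the old name
state2 = check_deletion(path, opened)   # "replaced" (different inode)
fh.close()
```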
Step 10112: Starting from the specified offset, read through the remaining data of the current log cyclically in units of blocks, and designate the newly added log as the next file to process.
It should be noted that, in this embodiment, when entering step 10112, the system first finishes reading the data of the log the pointer currently points at, repeatedly reading by data block and parsing out data rows, in a process similar to the above steps 10103 to 10108. The difference is that, because the generation of a new log indicates that the current log has been fully written, step 10104 need not be performed, i.e. there is no longer any need to check whether each read obtains a data block. Afterwards, the pointer is pointed at the newly added log file for data reading.
To improve the timeliness of log collection, the present invention collects log data dynamically while the log file is being generated, reading by data block and parsing line by line each time, so that log data can be transmitted continuously. To guarantee the completeness of data collection, the present invention monitors for anomalies in real time during the collection process: once a log is found to have been deleted or its content overwritten, the collection strategy is adjusted in time, avoiding the reading of erroneous log data.
Meanwhile invention additionally discloses a kind of log data acquisition system, including:
Region division device, the target area of daily record data to be collected is divided at least two pickup areas in advance, respectively
Pickup area includes:Data center and at least one branch, daily record storage system are located at least two acquisition zone
The data center of first pickup area in one of domain;
Log data acquisition device, each branch, data center are arranged at, for gathering in each branch and data
The daily record data of web server in the heart, and the daily record data of each branch of collection is transferred to where the branch
The data center of pickup area;
Flume Primary Receives end, the data center of first pickup area is arranged at, for by each acquisition zone of collection
The daily record data in domain is stored to daily record storage system.
In an embodiment of the invention, the log data acquisition device includes:
collection clients, arranged at each branch and data center, which collect the log data of the web servers in each branch and data center; and
Flume receiving ends, arranged at each branch and data center; wherein
the collected log data of the web servers of a data center is transmitted through the second-level Flume receiving end of that data center to the first-level Flume receiving end for storage to the log storage system;
the collected log data of the web servers of a branch of the first collection region is transmitted through the Flume receiving end of that branch to the first-level Flume receiving end for storage to the log storage system; and
the collected log data of the web servers of a branch of a non-first collection region is transmitted through the Flume receiving end of that branch to the second-level receiving end of the corresponding data center, and then transmitted by the second-level Flume receiving end of that data center to the first-level Flume receiving end for storage to the log storage system.
In an embodiment of the invention, the second-level Flume receiving end of the data center of a non-first collection region accesses the first-level Flume receiving end through a high-speed dedicated network line.
In its global deployment strategy, the real-time log acquisition system of the present invention adopts a multi-level Flume server pattern: several Flume servers are set up inside each collection region to collect and relay logs, and several Flume servers in the collection region where HDFS is located finally gather the logs of all collection regions and write them to HDFS. This design pattern, first, avoids the system pressure that direct reading and writing of HDFS by a large number of Flume servers would cause, and at the same time avoids the degradation of HDFS namenode performance caused by storing large numbers of small files; second, by decoupling the two most expensive I/O operations, the remote transmission of data and the writing of data to HDFS, it avoids the huge communication delay that would arise if data were sent directly from data center 1 to the HDFS of data center 2; and third, it facilitates unified management of HDFS access while, from a network security perspective, reducing the number of addresses on the firewall whitelist.
As for the design of the log collection algorithm inside the collection client, the collection method of the present invention not only collects logs line by line, improving the timeliness of log collection, but also shields the acquisition system from anomalies such as restarts and shutdowns of the Web server applications. The acquisition system still runs normally after abnormal situations occur and collects the log content accurately; compared with the collection methods shipped with Flume, it effectively prevents the phenomenon of reading half-row data.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical memory) containing computer-usable program code.
The invention is described with reference to flow charts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in the flow charts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that realizes the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
Specific examples have been applied herein to expound the principles and embodiments of the present invention; the description of the above embodiments is intended only to help understand the method of the invention and its core concept. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific embodiments and the scope of application according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
Claims (26)
1. A log data acquisition method, characterized in that the method comprises: dividing a target area of log data to be collected into at least two collection regions in advance, each collection region comprising a data center and at least one branch, a log storage system being located at the data center of a first collection region among the at least two collection regions; collecting the log data of the web servers in each branch and data center, and transmitting the log data collected at each branch to the data center of the collection region where that branch is located; and storing the log data of each collection region to the log storage system through a first-level Flume receiving end of the data center of the first collection region.
2. The log data acquisition method of claim 1, characterized in that said collecting the log data of the web servers in each branch and data center and transmitting the log data collected at each branch to the data center of the collection region where the branch is located comprises: transmitting the collected log data of the web servers of a branch, through the Flume receiving end located at that branch, to the Flume receiving end of the data center of the collection region where that branch is located.
3. The log data acquisition method of claim 2, characterized in that said storing the log data of each collection region to the log storage system through the first-level Flume receiving end of the data center of the first collection region comprises: transmitting the collected log data of the web servers of a data center, through the second-level Flume receiving end of that data center, to the first-level Flume receiving end of the data center of the first collection region for storage to the log storage system; transmitting the collected log data of the web servers of a branch of the first collection region, through the Flume receiving end of that branch, to the first-level Flume receiving end for storage to the log storage system; and transmitting the collected log data of the web servers of a branch of a non-first collection region, through the second-level Flume receiving end of the corresponding data center, to the first-level Flume receiving end for storage to the log storage system.
4. The log data acquisition method of claim 3, characterized in that the second-level Flume receiving end of the data center of a non-first collection region accesses the first-level Flume receiving end through a high-speed dedicated network line.
5. The log data acquisition method of claim 1 or 4, characterized in that said collecting the log data of the web servers in each branch and data center and transmitting the log data collected at each branch to the data center of the collection region where the branch is located comprises: reading the log data of a web server in units of data blocks and writing it to a transfer queue; sending the log data in the transfer queue to a Flume receiving end; and determining the downstream destination of the log data according to the type of the location of the Flume receiving end.
6. The log data acquisition method of claim 5, characterized in that the method comprises: presetting a splitting rule for the log data of the web server, the splitting rule comprising splitting the log data by size or by time; and generating the log files storing the log data according to the preset splitting rule for the log data of the web server.
7. The log data acquisition method of claim 6, characterized in that said reading the log data of the web server in units of data blocks and writing it to the transfer queue comprises: step 1, pointing a file pointer at the log to be collected; step 2, reading the log data in the current log file in units of data blocks from a specified offset; step 3, reading characters one by one from the data block and putting them into a cache; and step 4, extracting the characters in the cache by row and writing them to the transfer queue.
8. The log data acquisition method of claim 7, characterized in that said reading characters one by one from the data block and putting them into the cache comprises: judging whether a new character is read, to determine whether the tail of the data block has been reached; and, upon determining that the tail of the data block has been read, performing step 2.
9. The log data acquisition method of claim 8, characterized in that said extracting the characters in the cache by row and writing them to the transfer queue comprises: judging whether the newly read character is a newline; upon determining that the character read is a newline, extracting the characters in the cache and writing them to the transfer queue; and, upon determining that the character read is not a newline, performing step 3.
10. The log data acquisition method of claim 8 or 9, characterized in that, before performing step 2 upon determining that the tail of the data block has been read, log anomaly detection is further performed; and, upon determining a log anomaly, step 2 is performed after the pointer offset is reset.
11. The log data acquisition method of claim 7, characterized by further comprising, before performing step 2: determining whether there is a newly added log file; wherein, upon determining that there is no newly added log file, step 2 is performed; and, upon determining that there is a newly added log file, the newly added log file is designated as the next log file to be read after the current log file has been read through.
12. The log data acquisition method of claim 11, characterized by comprising, before performing step 3: judging whether a data block was read; upon determining that a data block was read, performing step 3; and, upon determining that no data block was read, waiting for a preset specified time.
13. The log data acquisition method of claim 12, characterized in that, upon determining that no data block was read, after waiting for the preset specified time, log anomaly detection is performed; and, if the log is abnormal, the pointer offset is reset and the step of determining whether there is a newly added log file is then performed.
14. A log data acquisition system, characterized in that the system comprises: a region division device, for dividing a target area of log data to be collected into at least two collection regions, each collection region comprising a data center and at least one branch, a log storage system being located at the data center of a first collection region among the at least two collection regions; a log data acquisition device, for collecting the log data of the web servers in each branch and data center and transmitting the log data collected at each branch to the data center of the collection region where that branch is located; and a first-level Flume receiving end, arranged at the data center of the first collection region, for storing the log data of each collection region to the log storage system.
15. The log data acquisition system of claim 14, characterized in that the log data acquisition device comprises: collection clients, arranged at each branch and data center, for collecting the log data of the web servers in each branch and data center; and Flume receiving ends, arranged at each branch and data center; wherein the collected log data of the web servers of a data center is transmitted through the second-level Flume receiving end of that data center to the first-level Flume receiving end for storage to the log storage system; the collected log data of the web servers of a branch of the first collection region is transmitted through the Flume receiving end of that branch to the first-level Flume receiving end for storage to the log storage system; and the collected log data of the web servers of a branch of a non-first collection region is transmitted through the Flume receiving end of that branch to the second-level receiving end of the corresponding data center, and then transmitted by the second-level Flume receiving end of that data center to the first-level Flume receiving end for storage to the log storage system.
16. The log data acquisition system of claim 15, characterized in that the second-level Flume receiving end of the data center of a non-first collection region accesses the first-level Flume receiving end through a high-speed dedicated network line.
17. The log data acquisition system of claim 14 or 16, characterized in that the acquisition device comprises: a read module, for reading the log data of a web server in units of data blocks and writing it to a transfer queue; and a transit module, for sending the log data in the transfer queue to a Flume receiving end; wherein the Flume receiving end determines the downstream destination of the log data according to the type of its location.
18. The log data acquisition system as claimed in claim 17, characterised in that the read module comprises:
a principle presetting unit, for presetting the segmentation principle of the web server's log data, the segmentation principle comprising: splitting the log data by size or by time;
a splitting unit, for generating the log files that store the log data according to the preset segmentation principle of the web server's log data.
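The two segmentation principles of claim 18 (splitting log files by size or by time) can be sketched as below; the threshold, the daily naming scheme, and the function name are illustrative assumptions rather than anything the claim specifies:

```python
import os
import time

def target_log_file(base="access.log", max_bytes=64 * 1024 * 1024,
                    by="size"):
    """Return the file name new log data should be written to,
    splitting either by size or by time (one file per day)."""
    if by == "time":
        # Time-based split: one log file per calendar day.
        return "%s.%s" % (base, time.strftime("%Y-%m-%d"))
    # Size-based split: roll over to a numbered file once the
    # current one has reached max_bytes.
    n = 0
    name = base
    while os.path.exists(name) and os.path.getsize(name) >= max_bytes:
        n += 1
        name = "%s.%d" % (base, n)
    return name
```

Either way, the splitting unit's job is only to decide which file receives the next write; the read module then treats each generated file as an independent log file to collect.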
19. The log data acquisition system as claimed in claim 18, characterised in that the read module reading the log data of the web server in units of data blocks and writing it to the transfer queue comprises:
step 1: pointing a file pointer at the log file to be collected;
step 2: reading the log data in the current log file in units of data blocks, starting from the specified offset;
step 3: reading characters from the data block one by one and placing them in a cache;
step 4: extracting the characters in the cache line by line and writing them to the transfer queue.
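Steps 1 through 4 above amount to a block-buffered reader that assembles complete lines and pushes them onto a transfer queue. A minimal Python sketch of those four steps; the block size, the byte-wise scan, and the resume-offset bookkeeping are illustrative choices, not details fixed by the claim:

```python
from queue import Queue

BLOCK_SIZE = 4096  # illustrative data-block size

def read_log(path, queue, offset=0):
    """Read a log file from `offset` in units of data blocks, push
    complete lines onto `queue`, and return the offset just past the
    last complete line so collection can resume there later."""
    cache = bytearray()                  # step 3: per-line byte cache
    with open(path, "rb") as f:          # step 1: point at the log file
        f.seek(offset)                   # step 2: seek to the offset
        while True:
            block = f.read(BLOCK_SIZE)   # step 2: read one data block
            if not block:                # data-block tail, nothing new
                break
            for b in block:              # step 3: byte by byte
                cache.append(b)
                if b == ord("\n"):       # step 4: a full line is ready
                    queue.put(cache.decode("utf-8"))
                    offset += len(cache)
                    cache.clear()
    return offset  # a trailing partial line is re-read on the next call
```

Returning the offset of the last complete line means a half-written line at the file tail is simply picked up again on the next pass, which matches the claim's repeated return to step 2.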
20. The log data acquisition system as claimed in claim 19, characterised in that the read module further comprises a log anomaly detection module, for performing log anomaly detection.
21. The log data acquisition system as claimed in claim 20, characterised in that the read module further comprises:
a block-tail judging unit, for judging whether a new character can be read, so as to determine whether the end of the data block has been reached;
if it is determined that the end of the data block has been reached, step 2 is performed.
22. The log data acquisition system as claimed in claim 21, characterised in that the read module further comprises:
a newline judging unit, for judging whether the newly read character is a newline;
if it is determined that the character read is a newline, the characters in the cache are extracted and written to the transfer queue;
if it is determined that the character read is not a newline, step 3 is performed.
23. The log data acquisition system as claimed in claim 21 or claim 22, characterised in that before step 2 is performed upon determining that the end of the data block has been reached, the log anomaly detection module performs log anomaly detection; if a log anomaly is determined, step 2 is performed after the pointer offset is reset.
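The claims do not define what counts as a log anomaly. One common case for rotated logs, and a plausible reading of "reset the pointer offset", is truncation: the file becoming shorter than the saved offset because it was truncated or replaced. A hedged sketch under that assumption (the truncation criterion and the file-missing case are illustrative choices, not the patent's definition):

```python
import os

def check_anomaly_and_reset(path, offset):
    """Perform a log anomaly check and return the offset to read from.

    Treats a file that is shorter than the saved offset (e.g. truncated
    or replaced during rotation) as anomalous and restarts from 0;
    otherwise keeps the saved offset."""
    try:
        size = os.path.getsize(path)
    except OSError:
        return 0          # file vanished: anomaly, reset the offset
    if size < offset:
        return 0          # truncated: anomaly, reset the offset
    return offset         # normal: keep reading from where we were
```

With this check in front of step 2, a rotated-away file simply restarts collection from the beginning of whatever now lives at that path.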
24. The log data acquisition system as claimed in claim 20, characterised in that the read module further comprises:
a new-log judging unit, for determining whether a log file has been newly added; wherein
if it is determined that no log file has been newly added, step 2 is performed;
if it is determined that a log file has been newly added, the newly added log file is designated as the next log file to be read once the current log file has been fully read.
25. The log data acquisition system as claimed in claim 24, characterised in that the read module further comprises:
a data block judging unit, for judging whether a data block has been read;
if it is determined that a data block has been read, step 3 is performed;
if it is determined that no data block has been read, the module waits for a preset specified time.
26. The log data acquisition system as claimed in claim 25, characterised in that when it is determined that no data block has been read, after waiting the preset specified time the log anomaly detection module performs log anomaly detection; if a log anomaly is determined, the pointer offset is reset and the step of determining whether a log file has been newly added is then performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710564475.7A CN107341258B (en) | 2017-07-12 | 2017-07-12 | Log data acquisition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341258A true CN107341258A (en) | 2017-11-10 |
CN107341258B CN107341258B (en) | 2020-03-13 |
Family
ID=60218597
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710564475.7A Active CN107341258B (en) | 2017-07-12 | 2017-07-12 | Log data acquisition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341258B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8234256B2 (en) * | 2003-11-26 | 2012-07-31 | Loglogic, Inc. | System and method for parsing, summarizing and reporting log data |
CN105868075A (en) * | 2016-03-31 | 2016-08-17 | 浪潮通信信息系统有限公司 | System and method for monitoring and analyzing great deal of logs in real time |
CN106250496A (en) * | 2016-08-02 | 2016-12-21 | 北京集奥聚合科技有限公司 | A kind of method and system of the data collection in journal file |
CN106569936A (en) * | 2016-09-26 | 2017-04-19 | 深圳盒子支付信息技术有限公司 | Method and system for acquiring scrolling log in real time |
Non-Patent Citations (1)
Title |
---|
YZGYJYW: "Flume日志采集多级Agent" ("Flume log collection with multi-level agents"), 《CSDN Blog》 *
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107943647A (en) * | 2017-11-21 | 2018-04-20 | 北京小度互娱科技有限公司 | A kind of reliable distributed information log collection method and system |
CN109960622A (en) * | 2017-12-22 | 2019-07-02 | 南京欣网互联网络科技有限公司 | A kind of method of data capture based on big data visual control platform |
CN108241744A (en) * | 2018-01-04 | 2018-07-03 | 北京奇艺世纪科技有限公司 | A kind of log read method and apparatus |
WO2019134341A1 (en) * | 2018-01-05 | 2019-07-11 | 平安科技(深圳)有限公司 | Log text processing method and apparatus, and storage medium |
CN110162448A (en) * | 2018-02-13 | 2019-08-23 | 北京京东尚科信息技术有限公司 | The method and apparatus of log collection |
CN108563527A (en) * | 2018-03-21 | 2018-09-21 | 四川斐讯信息技术有限公司 | A kind of detection method and system of data processing situation |
CN111427903A (en) * | 2020-03-27 | 2020-07-17 | 四川虹美智能科技有限公司 | Log information acquisition method and device |
CN111427903B (en) * | 2020-03-27 | 2023-04-21 | 四川虹美智能科技有限公司 | Log information acquisition method and device |
CN111464558B (en) * | 2020-04-20 | 2022-03-01 | 公安部交通管理科学研究所 | Data acquisition and transmission method for traffic safety comprehensive service management platform |
CN111464558A (en) * | 2020-04-20 | 2020-07-28 | 公安部交通管理科学研究所 | Data acquisition and transmission method for traffic safety comprehensive service management platform |
CN111880440A (en) * | 2020-07-31 | 2020-11-03 | 仲刚 | Serial link data acquisition method and system |
CN111880440B (en) * | 2020-07-31 | 2021-08-03 | 仲刚 | Serial link data acquisition method and system |
CN112396429A (en) * | 2020-11-09 | 2021-02-23 | 中国南方电网有限责任公司 | Statistical analysis system for enterprise operation business |
CN115225471A (en) * | 2022-07-15 | 2022-10-21 | 中国工商银行股份有限公司 | Log analysis method and device |
CN116366308A (en) * | 2023-03-10 | 2023-06-30 | 广东堡塔安全技术有限公司 | Cloud computing-based server security monitoring system |
CN116366308B (en) * | 2023-03-10 | 2023-11-03 | 广东堡塔安全技术有限公司 | Cloud computing-based server security monitoring system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107341258A (en) | A kind of log data acquisition method and system | |
US11194552B1 (en) | Assisted visual programming for iterative message processing system | |
US11113353B1 (en) | Visual programming for iterative message processing system | |
US11474673B1 (en) | Handling modifications in programming of an iterative message processing system | |
US11036591B2 (en) | Restoring partitioned database tables from backup | |
CN111917864B (en) | Service verification method and device | |
CN105138615B (en) | A kind of method and system constructing big data distributed information log | |
US9639589B1 (en) | Chained replication techniques for large-scale data streams | |
CN104243425B (en) | A kind of method, apparatus and system carrying out Content Management in content distributing network | |
US8290994B2 (en) | Obtaining file system view in block-level data storage systems | |
CN107704196A (en) | Block chain data-storage system and method | |
CN103970788A (en) | Webpage-crawling-based crawler technology | |
CN109964216A (en) | Identify unknown data object | |
CN106953758A (en) | A kind of dynamic allocation management method and system based on Nginx servers | |
CN104011701A (en) | Content delivery network | |
US20220179991A1 (en) | Automated log/event-message masking in a distributed log-analytics system | |
CN111459986B (en) | Data computing system and method | |
CN107710215A (en) | The method and apparatus of mobile computing device safety in test facilities | |
CN110784498B (en) | Personalized data disaster tolerance method and device | |
CN103020257A (en) | Implementation method and device for data operation | |
US20240106893A1 (en) | Filecoin cluster data transmission method and system based on remote direct memory access | |
US10031948B1 (en) | Idempotence service | |
CN108664914A (en) | Face retrieval method, apparatus and server | |
CN113254320A (en) | Method and device for recording user webpage operation behaviors | |
CN108268497A (en) | The method of data synchronization and device of relevant database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |