CN106649336B - Log compression method, log processing equipment, and log processing system - Google Patents

Log compression method, log processing equipment, and log processing system

Info

Publication number
CN106649336B
Authority
CN
China
Prior art keywords
daily record
record data
processing equipment
compression algorithm
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510726130.8A
Other languages
Chinese (zh)
Other versions
CN106649336A (en
Inventor
徐峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Digital Technologies Suzhou Co Ltd
Original Assignee
Huawei Digital Technologies Suzhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files

Abstract

Embodiments of the invention disclose a log compression method, log processing equipment, and a log processing system for selecting a compression algorithm flexibly and in real time when compressing log data, thereby reducing storage space and storage cost. The method comprises: the log processing equipment extracts a target log flow, according to a preset data extraction rule, from the log flow that has been received and stored in time order; the log processing equipment obtains log data from the target log flow; the log processing equipment judges, according to the log data, whether the current compression algorithm needs to be replaced; if the current compression algorithm needs to be replaced, the log processing equipment selects a target compression algorithm from the preset compression algorithms according to the log data; the log processing equipment compresses the log data with the target compression algorithm to obtain a compressed data packet, and updates the current compression algorithm to the target compression algorithm.

Description

Log compression method, log processing equipment, and log processing system
Technical field
The present invention relates to the technical field of information processing, and in particular to a log compression method, log processing equipment, and a log processing system.
Background
With the arrival of the big data era, large numbers of network devices and security devices have been added in fields such as the Internet, finance, and communications. These devices generate massive amounts of log data every day, and the relevant departments need to store this log data in real time for business analysis such as auditing or tracing the source of security problems.
Log data is usually compressed before it is stored. This is done, first, to reduce the read/write input/output (I/O) pressure on the disk and improve the processing performance of the system, and second, to save on the cost of storage equipment. There are many compression algorithms for data compression, and each has its own advantages and disadvantages, such as differences in compression ratio and compression speed. The compression ratio refers to the ratio by which the data is compressed. The compression speed refers to the amount of data compressed per unit time. In the prior art, a compression algorithm is selected offline according to historical or experimental data, and when the equipment is running, the selected algorithm is then used to compress data in real time.
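The two metrics defined above can be measured directly. The following is a minimal Python sketch, not part of the patent: it assumes the compression ratio is the original size divided by the compressed size (so a larger value means less storage), and it uses two `zlib` levels as stand-ins for a "faster" and a "denser" algorithm. The sample log line is invented for illustration.

```python
import time
import zlib


def measure(compress, data: bytes) -> tuple[float, float]:
    """Return (compression_ratio, compression_speed_in_bytes_per_second)."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)      # assumed: original / compressed
    speed = len(data) / max(elapsed, 1e-9)   # data compressed per unit time
    return ratio, speed


# Invented sample data: one log line repeated many times.
sample = b"2015-10-30 12:00:00 INFO user=alice action=login result=ok\n" * 2000

fast_ratio, fast_speed = measure(lambda d: zlib.compress(d, 1), sample)
dense_ratio, dense_speed = measure(lambda d: zlib.compress(d, 9), sample)
```

Measured speeds depend on the machine and the data, which is one reason a choice made offline from historical data may not hold at run time.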
However, in practical applications, real-time log flow and log content change continually, so compressing log data with a single fixed compression algorithm cannot satisfy both the need to improve system performance and the need to save storage cost. For example, when the log flow is heavy, a compression algorithm with a faster compression speed may be needed to ensure that no data is lost; when the log flow is light, a larger compression ratio may be preferred in order to save storage resources.
Summary of the invention
Embodiments of the invention provide a log compression method, log processing equipment, and a log processing system, to solve the prior-art problem that compressed log data occupies considerable resources and incurs a high storage cost.
A first aspect of the invention provides a log compression method, which may include:
The log processing equipment extracts a target log flow, according to a preset data extraction rule, from the log flow that has been received and stored in time order. The preset data extraction rule includes extracting the log flow within a preset time period with the current time as the reference, or extracting a log flow of a preset size starting from the end position of the previous extraction. The log processing equipment obtains log data from the target log flow. The log processing equipment judges, according to the log data, whether the current compression algorithm needs to be replaced. If the current compression algorithm needs to be replaced, the log processing equipment selects a target compression algorithm from the preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that differs from the current compression algorithm. The log processing equipment compresses the log data with the target compression algorithm to obtain a compressed data packet, and updates the current compression algorithm to the target compression algorithm.
In the embodiments of the invention, the log flow is received and stored in time order, and the log processing equipment extracts the target log flow from the stored log flow according to the preset data extraction rule. Because the data extraction rule includes extracting the log flow within a preset time period with the current time as the reference, or extracting a log flow of a preset size starting from the end position of the previous extraction, all of the stored log flow is guaranteed to be extracted in turn. The log processing equipment then obtains the corresponding log data from the extracted target log flow and judges, according to that log data, whether the current compression algorithm needs to be replaced. When it does, the equipment selects a target compression algorithm from the preset compression algorithms and compresses the log data with it to obtain a compressed data packet. The embodiments can therefore replace the compression algorithm flexibly in light of the log data, so that a more suitable algorithm compresses each batch of log data, which improves the compression ratio of the log data and thereby reduces storage resources and storage cost. In addition, the target log flow is extracted in the time order in which the log flow was received and stored, using the preset time period or the preset size as the extraction window. This ensures that every part of the log flow goes through the extraction process, and because each target log flow that fills an extraction window is handled as one unit, the compression algorithm can be replaced flexibly in real time.
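The two extraction rules can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the patented implementation: the stored log flow is modelled as a time-ordered list of `(timestamp, raw_bytes)` entries, and the names `StoredFlow`, `extract_by_time`, `extract_by_size`, and `cursor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredFlow:
    """Log flow kept in arrival order as (timestamp_seconds, raw_bytes) entries."""
    entries: list[tuple[float, bytes]]
    cursor: int = 0  # index just past the end of the previous size-based extraction

    def extract_by_time(self, now: float, period: float) -> list[bytes]:
        """Rule 1: take the entries received in the window (now - period, now]."""
        return [raw for ts, raw in self.entries if now - period < ts <= now]

    def extract_by_size(self, max_bytes: int) -> list[bytes]:
        """Rule 2: start where the previous extraction ended and take entries
        until adding another one would exceed max_bytes (at least one entry
        is always taken so that an oversized entry cannot stall extraction)."""
        taken: list[bytes] = []
        total = 0
        while self.cursor < len(self.entries):
            raw = self.entries[self.cursor][1]
            if taken and total + len(raw) > max_bytes:
                break
            taken.append(raw)
            total += len(raw)
            self.cursor += 1
        return taken


flow = StoredFlow([(1.0, b"aaaa"), (2.0, b"bbbb"), (3.0, b"cccc"), (4.0, b"dddd")])
recent = flow.extract_by_time(now=4.0, period=2.0)
first_window = flow.extract_by_size(8)
second_window = flow.extract_by_size(8)
```

Because the size-based rule resumes from the end of the previous extraction, successive calls walk through the stored flow without skipping or repeating entries, which is the property the paragraph above relies on.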
Preferably, the log flow is received from network devices and/or security devices, and may be service data or network data.
In some embodiments of the invention, if the current compression algorithm does not need to be replaced, the log processing equipment compresses the log data with the current compression algorithm. Thus, when the current compression algorithm is no longer suitable for compressing the log data, a more suitable compression algorithm can be selected flexibly; conversely, when the current compression algorithm is still suitable, it continues to be used to compress the log data.
In some embodiments of the invention, the log processing equipment judging, according to the log data, whether the current compression algorithm needs to be replaced specifically includes: the log processing equipment calculates at least one feature change value of the log data according to the log data, and judges whether the at least one feature change value is greater than or equal to a preset threshold.
In some embodiments of the invention, a feature change value is provided, and the judgment of whether it is greater than or equal to a preset threshold is made as follows. The log processing equipment calculates, according to the size of the log data, the parsing speed at which it parses the log data. The log processing equipment calculates the difference between this parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, and the absolute value of the difference is used as the feature change value. The previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule. The log processing equipment judges whether the feature change value is greater than or equal to a first preset threshold, and updates the first register with the parsing speed at which it parsed the current log data. In this embodiment, the feature value of the log data is the parsing speed of the log processing equipment, and the feature change value is the absolute value of the difference between the current parsing speed and the parsing speed for the previous batch of log data. A change in parsing speed accurately reflects the processing capacity of the log processing equipment. When the change value is greater than or equal to the first preset threshold, the parsing speed has either decreased or increased: if it has decreased, parsing has become slower and a compression algorithm with a lower compression speed may be called for; if it has increased, a compression algorithm with a higher compression speed may be needed.
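The parsing-speed check described above can be sketched in a few lines of Python. The sketch is illustrative only: `FeatureRegister` is a hypothetical stand-in for the first register, and the speeds and the threshold value are invented numbers, not figures from the patent.

```python
class FeatureRegister:
    """Holds the feature value from the previous round (stand-in for a register)."""

    def __init__(self, initial: float = 0.0):
        self.value = initial

    def change_and_update(self, current: float) -> float:
        """Return |current - previous|, then store current as the new previous."""
        change = abs(current - self.value)
        self.value = current
        return change


def parse_speed(data_size_bytes: int, parse_seconds: float) -> float:
    """Parsing speed derived from the size of the log data (bytes per second)."""
    return data_size_bytes / parse_seconds


FIRST_THRESHOLD = 200_000.0  # hypothetical first preset threshold, in bytes/s

# Previous round: 1,000,000 bytes parsed in one second.
first_register = FeatureRegister(initial=parse_speed(1_000_000, 1.0))

# Current round: 1,500,000 bytes parsed in one second.
change = first_register.change_and_update(parse_speed(1_500_000, 1.0))
needs_replacement = change >= FIRST_THRESHOLD
```

Updating the register in the same step as the comparison keeps the "previous" value aligned with the most recent extraction window.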
In other embodiments of the invention, another feature change value is provided, and the judgment of whether it is greater than or equal to a preset threshold is made as follows. The log processing equipment extracts N consecutive log records from the log data, where N is a natural number greater than or equal to 1. The log processing equipment calculates the average size of the N log records. The log processing equipment calculates the difference between this average and the previous average indicated by a second register, and the absolute value of the difference is used as the feature change value. The previous average is the average size of the N consecutive log records extracted from the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule. The log processing equipment judges whether the feature change value is greater than or equal to a second preset threshold, and updates the second register with the average size of the N log records. In this embodiment, the feature value of the log data is the average log record size, and the feature change value is the absolute value of the difference between the current average and the previous average. A change in the average size accurately reflects the size of the log data. When the change value is greater than or equal to the second preset threshold, the current log data is either smaller or larger than before: if it is smaller, a compression algorithm with a faster compression speed can be substituted to improve compression speed; if it is larger, a compression algorithm with a larger compression ratio may be needed to improve the compression ratio.
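A minimal Python sketch of the average-size check follows. It assumes the "N consecutive log records" are simply the first N records of the batch; the function names, record contents, previous average, and threshold are all hypothetical.

```python
def average_size(records: list[bytes], n: int) -> float:
    """Average size in bytes of the first n consecutive records (assumed choice)."""
    head = records[:n]
    return sum(len(r) for r in head) / len(head)


def size_change_exceeds(records: list[bytes], n: int,
                        previous_average: float,
                        threshold: float) -> tuple[bool, float]:
    """Return (change >= threshold, current average to store in the register)."""
    current = average_size(records, n)
    return abs(current - previous_average) >= threshold, current


# Invented batch: three records of 100, 300 and 200 bytes, plus one that is not sampled.
records = [b"a" * 100, b"b" * 300, b"c" * 200, b"d" * 999]
exceeded, second_register_value = size_change_exceeds(
    records, n=3, previous_average=120.0, threshold=50.0
)
```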
In other embodiments of the invention, a further feature change value is provided, and the judgment of whether it is greater than or equal to a preset threshold is made as follows. The log processing equipment extracts N consecutive log records from the log data, where N is a natural number greater than or equal to 1. The log processing equipment calculates the repetition degree of the N log records. The log processing equipment calculates the difference between this repetition degree and the previous repetition degree indicated by a third register, and the absolute value of the difference is used as the feature change value. The previous repetition degree is the repetition degree of the N consecutive log records extracted from the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule. The log processing equipment judges whether the feature change value is greater than or equal to a third preset threshold, and updates the third register with the repetition degree of the N log records. In this embodiment, the feature value of the log data is the repetition degree of the log data, and the feature change value is the absolute value of the difference between the current repetition degree and the previous one. A change in the repetition degree accurately reflects how the log data is being transmitted. When the change value is greater than or equal to the third preset threshold, if the current repetition degree exceeds the previous one by more than the third preset threshold, a compression algorithm with a higher compression speed can be selected; otherwise, a compression algorithm with a larger compression ratio can be selected.
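The repetition-degree check can be sketched as below. Note the assumption: this passage does not say how the repetition degree is computed, so the sketch uses the share of sampled records that duplicate an earlier record, which is only one plausible definition. The function names and the preference labels are hypothetical.

```python
from typing import Optional


def repetition_degree(records: list[bytes], n: int) -> float:
    """ASSUMED metric: share of the first n records that repeat an earlier one."""
    head = records[:n]
    return 1.0 - len(set(head)) / len(head)


def repetition_check(records: list[bytes], n: int, previous: float,
                     threshold: float) -> tuple[bool, Optional[str], float]:
    """Return (replace?, preferred algorithm kind, value for the third register)."""
    current = repetition_degree(records, n)
    changed = abs(current - previous) >= threshold
    preference = None
    if changed:
        # Per the text: a rise beyond the threshold favours compression speed,
        # any other qualifying change favours compression ratio.
        preference = "faster" if current - previous > threshold else "higher_ratio"
    return changed, preference, current


records = [b"x", b"x", b"y", b"x"]  # two distinct values among four records
rising = repetition_check(records, 4, previous=0.1, threshold=0.2)
falling = repetition_check(records, 4, previous=0.9, threshold=0.2)
steady = repetition_check(records, 4, previous=0.45, threshold=0.2)
```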
In some embodiments of the invention, when the log processing equipment determines that the current compression algorithm needs to be replaced, it further selects the target compression algorithm according to the log data as follows: the log processing equipment calculates, according to the size of the log data, the parsing speed at which it parses the log data, and calculates the actual compression speed and actual compression ratio of each compression algorithm among the preset compression algorithms; the log processing equipment then selects the target compression algorithm from the preset compression algorithms according to the parsing speed and the actual compression speed and actual compression ratio of each compression algorithm.
Further, selecting the target compression algorithm from the preset compression algorithms according to the parsing speed and the actual compression speed and actual compression ratio of each compression algorithm includes: for each compression algorithm, the log processing equipment calculates the ratio of the parsing speed at which it parses the log data to the actual compression speed of that compression algorithm; the log processing equipment calculates the sum of this ratio and the reciprocal of the actual compression ratio of that compression algorithm; and from all compression algorithms whose sum is less than or equal to 1, the log processing equipment selects the one with the largest actual compression ratio as the target compression algorithm. Thus, in selecting the target compression algorithm, the embodiments first ensure that log data is not lost and, subject to that, select the compression algorithm with the largest compression ratio, which improves the compression ratio of the log data and reduces storage cost.
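The selection rule just described, keep every algorithm for which parsing speed / compression speed + 1 / compression ratio is at most 1 and then take the largest compression ratio, can be written as a short Python sketch. The `Candidate` type, the algorithm names, and the numbers are hypothetical; the compression ratio is assumed to be original size divided by compressed size, and speeds share one unit (for example MB/s).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    speed: float  # measured compression speed, same unit as the parsing speed
    ratio: float  # measured compression ratio (assumed original / compressed)


def select_target(parse_speed: float,
                  candidates: list[Candidate]) -> Optional[Candidate]:
    """Among candidates with parse_speed/speed + 1/ratio <= 1, pick the largest ratio."""
    feasible = [
        c for c in candidates
        if parse_speed / c.speed + 1.0 / c.ratio <= 1.0
    ]
    return max(feasible, key=lambda c: c.ratio, default=None)


candidates = [
    Candidate("fast", speed=400.0, ratio=2.0),
    Candidate("balanced", speed=100.0, ratio=4.0),
    Candidate("dense", speed=55.0, ratio=8.0),
]

heavy_flow_choice = select_target(50.0, candidates)  # "dense" cannot keep up
light_flow_choice = select_target(20.0, candidates)  # every candidate qualifies
overload_choice = select_target(500.0, candidates)   # no candidate qualifies
```

The sketch returns `None` when no algorithm satisfies the constraint; how that case is handled, and the exclusion of the current algorithm from the candidates, are left out here.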
A second aspect of the invention provides a log decompression method, which may include: a log decompression apparatus reads a target compressed data packet from a database; the log decompression apparatus reads a compression algorithm identifier from the target compressed data packet and selects a target decompression algorithm from the decompression algorithms according to that identifier; the log decompression apparatus decompresses the compressed log data in the compressed data packet with the target decompression algorithm, restoring the original log data. Decompression in the embodiments thus includes decompressing the compressed data packet, then finding the decompression algorithm according to the compression algorithm identifier in the packet, and using that algorithm to decompress the compressed log data in the packet. The embodiments mainly describe in detail how the compressed log data is restored.
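Identifier-driven decompression can be sketched with Python's standard `zlib`, `bz2`, and `lzma` modules. The packet layout here (a one-byte identifier followed by the compressed payload) and the identifier values are assumptions made for illustration; the patent text does not specify a format at this point.

```python
import bz2
import lzma
import zlib

# Hypothetical one-byte compression algorithm identifiers.
COMPRESSORS = {1: zlib.compress, 2: bz2.compress, 3: lzma.compress}
DECOMPRESSORS = {1: zlib.decompress, 2: bz2.decompress, 3: lzma.decompress}


def pack(algorithm_id: int, log_data: bytes) -> bytes:
    """Build a packet: identifier byte followed by the compressed log data."""
    return bytes([algorithm_id]) + COMPRESSORS[algorithm_id](log_data)


def unpack(packet: bytes) -> bytes:
    """Read the identifier, select the matching decompressor, restore the data."""
    algorithm_id = packet[0]
    return DECOMPRESSORS[algorithm_id](packet[1:])


original = b"2015-10-30 12:00:00 WARN link down port=3\n" * 50
restored = [unpack(pack(algorithm_id, original)) for algorithm_id in (1, 2, 3)]
```

Storing the identifier with each packet is what allows the compression side to switch algorithms between packets without coordinating with the decompression side.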
A third aspect of the invention provides log processing equipment, which may include:
An extraction module, configured to extract a target log flow, according to a preset data extraction rule, from the log flow received and stored in time order, where the preset data extraction rule includes extracting the log flow within a preset time period with the current time as the reference, or extracting a log flow of a preset size starting from the end position of the previous extraction;
A judgment module, configured to obtain log data from the target log flow and judge, according to the log data, whether the current compression algorithm needs to be replaced;
A selection module, configured to, if the current compression algorithm needs to be replaced, select a target compression algorithm from the preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that differs from the current compression algorithm;
A log compression module, configured to compress the log data with the target compression algorithm to obtain a compressed data packet;
An update module, configured to update the current compression algorithm to the target compression algorithm.
In some embodiments of the invention, the log compression module is further configured to compress the log data with the current compression algorithm if the current compression algorithm does not need to be replaced.
In some embodiments of the invention, the judgment module is specifically configured to calculate at least one feature change value of the log data according to the log data, and judge whether the at least one feature change value is greater than or equal to a preset threshold.
In some embodiments of the invention, the judgment module is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data; calculate the difference between this parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, the absolute value of the difference being used as the feature change value, where the previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule; and judge whether the feature change value is greater than or equal to a first preset threshold. The update module is further configured to update the first register with the parsing speed at which the log processing equipment parsed the log data.
In other embodiments of the invention, the judgment module is specifically configured to: extract N consecutive log records from the log data, where N is a natural number greater than or equal to 1; calculate the average size of the N log records; calculate the difference between this average and the previous average indicated by a second register, the absolute value of the difference being used as the feature change value, where the previous average is the average size of the N consecutive log records extracted from the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule; and judge whether the feature change value is greater than or equal to a second preset threshold. The update module is further configured to update the second register with the average size of the N log records.
In other embodiments of the invention, the judgment module is specifically configured to: extract N consecutive log records from the log data, where N is a natural number greater than or equal to 1; calculate the repetition degree of the N log records; calculate the difference between this repetition degree and the previous repetition degree indicated by a third register, the absolute value of the difference being used as the feature change value, where the previous repetition degree is the repetition degree of the N consecutive log records extracted from the previously obtained log data, that is, the log data obtained from the target log flow extracted in the previous round according to the data extraction rule; and judge whether the feature change value is greater than or equal to a third preset threshold. The update module is further configured to update the third register with the repetition degree of the N log records.
In some embodiments of the invention, the selection module is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data, and calculate the actual compression speed and actual compression ratio of each compression algorithm among the preset compression algorithms; and select the target compression algorithm from the preset compression algorithms according to the parsing speed and the actual compression speed and actual compression ratio of each compression algorithm.
Further, the selection module is specifically configured to: for each compression algorithm, calculate the ratio of the parsing speed at which the log processing equipment parses the log data to the actual compression speed of that compression algorithm; calculate the sum of this ratio and the reciprocal of the actual compression ratio of that compression algorithm; and from all compression algorithms whose sum is less than or equal to 1, select the one with the largest actual compression ratio as the target compression algorithm.
A fourth aspect of the invention provides a log decompression apparatus, which may include:
A reading module, configured to read a target compressed data packet from a database;
An algorithm selection module, configured to read a compression algorithm identifier from the target compressed data packet and select a target decompression algorithm from the decompression algorithms according to that identifier;
A decompression module, configured to decompress the compressed log data in the compressed data packet with the target decompression algorithm, restoring the original log data.
A fifth aspect of the invention provides a log processing system that includes the log processing equipment provided by the third aspect.
In some embodiments of the invention, the log processing system further includes a log decompression apparatus. The log decompression apparatus is connected to the log processing equipment and is configured to obtain a compressed data packet from the log processing equipment, obtain the corresponding decompression algorithm according to the compression algorithm identifier in the compressed data packet, and decompress the compressed log data in the compressed data packet with that decompression algorithm.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed for the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic structural diagram of the log processing system provided by some embodiments of the invention;
Fig. 1b is a schematic structural diagram of the log processing equipment provided by some embodiments of the invention;
Fig. 2 is a schematic flow diagram of the log compression method provided by an embodiment of the invention;
Fig. 3a is a schematic diagram of extracting log flow provided by some embodiments of the invention;
Fig. 3b is a schematic diagram of extracting log flow provided by other embodiments of the invention;
Fig. 4 is a schematic flow diagram of the log decompression method provided by some embodiments of the invention;
Fig. 5 is a schematic structural diagram of the log processing equipment provided by an embodiment of the invention;
Fig. 6 is a schematic structural diagram of the log decompression apparatus provided by some embodiments of the invention;
Fig. 7 is another schematic structural diagram of the log compression device provided by an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described below with reference to the drawings of the embodiments.
Embodiments of the invention provide a log compression method for selecting a compression algorithm flexibly and in real time to compress log data, and also provide a log decompression method. Also provided are the log processing equipment corresponding to the log compression method, the decompression apparatus corresponding to the log decompression method, and a log processing system.
Refer to Fig. 1a, which is a schematic structural diagram of the log processing system provided by some embodiments of the invention. In Fig. 1a, the log processing system includes log processing equipment 10 and log decompression apparatus 20. Log processing equipment 10 is mainly used to obtain log data, compress it into compressed data packets, and store the log data in the form of compressed data packets. Log decompression apparatus 20 is mainly used to decompress, when needed, the data packets compressed by log processing equipment 10, so that analysis and other processing of the log data can be carried out.
Further, referring to Fig. 1b, Fig. 1b is a schematic structural diagram of log processing equipment according to some embodiments of the present invention. As shown in Fig. 1b, the log processing equipment 10 includes a log receiving module 110, a log parsing module 120, a log buffer module 130, a log compression module 140, a log storage module 150, a compression algorithm selection module 160, and an update module 170. The main functions of the modules are as follows:
The log receiving module 110 is used to receive log traffic from network devices and security devices and then cache the received log traffic.
The log parsing module 120 is used to parse the log traffic received by the log receiving module 110 and extract the useful fields to obtain log data.
The log buffer module 130 is used to cache the log data parsed by the log parsing module 120.
The log compression module 140 is used to compress the log data cached by the log buffer module 130 to obtain compressed data packets.
The log storage module 150 is used to store the compressed data packets in a database.
The compression algorithm selection module 160 is used to select, in real time and according to the log data parsed by the log parsing module 120, an optimal compression algorithm and provide it to the log compression module 140 for compressing the cached log data.
The update module 170 is used to update the registers involved in log data processing.
The log compression method provided in the embodiments of the present invention is applied to the log processing equipment shown in Fig. 1b. By flexibly selecting the compression algorithm in real time according to the log data, the method improves the compression ratio. The present invention is described in detail below with reference to specific embodiments.
Referring to Fig. 2 in combination with Fig. 1b, Fig. 2 is a schematic flowchart of a log compression method according to an embodiment of the present invention. As shown in Fig. 2, a log compression method may include:
201, the log processing equipment receives log traffic from a network device and/or a security device, and stores the received log traffic in order of reception time;
Specifically, the log receiving module 110 in the log processing equipment receives the log traffic from the network device and/or security device. The network device may be a server, and the security device may be a firewall, a router, a switch, or the like. Taking a router as an example, the log traffic obtained from the router consists of data packets carrying an Internet Protocol (IP) header; besides the source IP, destination IP, gateway, and so on, each packet also contains data content. The log parsing module obtains these packets from the log receiving module and extracts the important information from them, such as the source IP, destination IP, gateway, and data content, to generate log data.
202, the log processing equipment extracts target log traffic from the stored log traffic according to a preset data extraction rule;
Specifically, the log parsing module 120 in Fig. 1b extracts the target log traffic from the stored log traffic according to the preset data extraction rule. The preset data extraction rule takes the chronological order in which the log traffic was received and stored as a time axis and moves an extraction window, defined by one of several conditions, along this axis. The extraction window may be a preset time period relative to the current time, or a preset size relative to the current time.
Referring to Fig. 3a and Fig. 3b, Fig. 3a is a schematic diagram of log traffic extraction according to some embodiments of the present invention. In Fig. 3a, log traffic is received and stored in chronological order. Times T1 and T2 are two successive times at which log traffic is received and stored, and T2 is also the time at which the log parsing module 120 currently extracts target log traffic from the stored log traffic. In other words, the target log traffic extracted by the log parsing module 120 is log traffic that was received and stored before the current time. The extraction window is a preset time period that moves along the time axis, and each extraction takes the log traffic falling within that period as the target log traffic.
Referring to Fig. 3b, the difference from Fig. 3a is that in Fig. 3b the extraction window is a preset size. Specifically, each extraction starts at the end point of the previous extraction and takes log traffic of the preset size as the target log traffic.
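The two extraction windows described above can be illustrated with a minimal sketch. It assumes that stored log traffic is held as a list of (receive time, raw bytes) pairs and that the preset size is counted in bytes; the function names and the data layout are hypothetical and are not taken from the embodiments.

```python
from typing import List, Tuple

# Each stored item is (receive_time_in_seconds, raw_bytes); this layout is an assumption.
Stored = List[Tuple[float, bytes]]

def extract_by_time(stored: Stored, now: float, period: float) -> Stored:
    """Time window: return traffic received in the interval (now - period, now]."""
    return [(t, raw) for t, raw in stored if now - period < t <= now]

def extract_by_size(stored: Stored, start: int, size: int) -> Tuple[Stored, int]:
    """Size window: start at index `start` (the end of the previous extraction)
    and take items until at least `size` bytes are collected or the data runs out.
    Returns the extracted items and the index where the next extraction starts."""
    taken: Stored = []
    total = 0
    i = start
    while i < len(stored) and total < size:
        taken.append(stored[i])
        total += len(stored[i][1])
        i += 1
    return taken, i
```

With the size window, the returned index is passed back as `start` on the next call, so successive extractions never overlap.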
203, the log processing equipment extracts log data from the target log traffic;
That is, the log parsing module 120 in the log processing equipment obtains from the target log traffic the log data corresponding to that traffic. Because this log data contains several log records, it can be treated as one group of log data.
The log data is the important information in the target log traffic that meets user requirements, and can be used for later log analysis, maintenance of the log processing equipment, and so on.
After step 203, steps 204 and 205 are performed simultaneously.
204, the log processing equipment saves the log data in a cache;
The log processing equipment may cache the log data obtained from the target log traffic by means of the log buffer module 130. When compressing the log data, the log compression module 140 reads it directly from the cache; see step 207.
The embodiments of the present invention change the compression algorithm flexibly according to the log data, which means that the log data corresponding to each extracted target log traffic may be compressed with a different compression algorithm. To ensure that the log processing equipment can accurately identify which compression algorithm corresponds to which group of log data, the log data that the log parsing module 120 obtains from the target log traffic carries a timestamp, and so does the log data cached by the log buffer module 130. The timestamp is the time at which the log parsing module read the target log traffic from the stored log traffic.
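As a rough illustration of how a timestamp can tie a cached group of log data to the algorithm chosen for it, the following sketch keys each group by its read timestamp. The class name, method names, and the use of a dictionary are assumptions made for illustration only.

```python
from typing import Dict, List

class TimestampedLogBuffer:
    """Hypothetical cache that keys each group of log records by its read timestamp."""

    def __init__(self) -> None:
        self._groups: Dict[float, List[dict]] = {}

    def put(self, timestamp: float, records: List[dict]) -> None:
        # Store one group of log data under the time its traffic was read.
        self._groups[timestamp] = records

    def take(self, timestamp: float) -> List[dict]:
        # Remove and return the group so that it is compressed only once.
        return self._groups.pop(timestamp)
```

A compression step that receives an algorithm together with a timestamp can then call `take(timestamp)` to fetch exactly the group that the algorithm was selected for.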
205, the log processing equipment judges, according to the log data, whether the current compression algorithm needs to be replaced;
If the current compression algorithm needs to be replaced, go to step 206; otherwise, go to step 208.
Specifically, the compression algorithm selection module 160 in the log processing equipment reads, directly and in real time from the log parsing module 120, the log data obtained from the target log traffic, and then judges from this log data whether the current compression algorithm needs to be updated. The log data that the compression algorithm selection module 160 obtains from the log parsing module 120 carries a timestamp, and the same log data is also cached by the log buffer module 130. The current compression algorithm is therefore the algorithm used to compress the log data preceding the one indicated by this timestamp.
The embodiments of the present invention provide three feature change values of the log data for judging whether the current compression algorithm needs to be replaced. Replacement of the compression algorithm is triggered as long as at least one of the three feature change values is greater than or equal to its preset threshold. The three cases are described in detail below:
First case: one of the three feature change values is the absolute value of the difference between the parsing speeds at which the log processing equipment parses log data at two successive times. The two successive times are two adjacent times at which target log traffic is read from the log traffic. The parsing speed is the speed at which the log parsing module 120 obtains log data from the target log traffic, and it is available directly once the log parsing module 120 has obtained the log data. For example, if the size of the log data is 2M and obtaining it takes 1s, the parsing speed is Vmax=2M/s. Alternatively, if the log data contains Dcur log records and obtaining it takes 1s, the parsing speed is Vmax=Dcur records/s. The embodiments of the present invention preferably use the number of records as the unit of log data.
In this case, the compression algorithm selection module 160 obtains the parsing speed directly from the log parsing module 120. In addition, the embodiments of the present invention use a first register to store the parsing speed at which the log parsing module 120 parsed the previous log data. After reading the previous parsing speed from the first register, the module calculates the difference between the current parsing speed and the previous one, takes the absolute value of the difference as a feature change value, and judges whether this feature change value is greater than or equal to a first preset threshold. The update module 170 then updates the first register with the newly obtained parsing speed.
Each feature change value represents the amount of change of one feature and has its own preset threshold. In the embodiments of the present invention, the difference in parsing speed corresponds to the first preset threshold, which can be determined through repeated experiments.
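The register-and-threshold check described above might be sketched as follows. The `FeatureRegister` class stands in for one register, and the function names and example numbers are hypothetical; real thresholds would come from experiments, as the text notes.

```python
from typing import List, Tuple

class FeatureRegister:
    """Hypothetical stand-in for one register holding the previous feature value."""

    def __init__(self, initial: float = 0.0) -> None:
        self.value = initial

def feature_changed(register: FeatureRegister, current: float, threshold: float) -> bool:
    """Compare |current - previous| with the threshold, then store the new value."""
    change = abs(current - register.value)
    register.value = current  # corresponds to the update performed after the check
    return change >= threshold

def needs_replacement(checks: List[Tuple[FeatureRegister, float, float]]) -> bool:
    """Replacement is triggered if at least one feature change reaches its threshold.
    A list (not a generator) is built so that every register is updated."""
    results = [feature_changed(reg, cur, thr) for reg, cur, thr in checks]
    return any(results)
```

The same helper applies unchanged to the average-size and repetition-degree features, each with its own register and threshold.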
The parsing speed in the embodiments of the present invention depends on the disk read/write speed; for example, data query operations affect the parsing speed Vmax. The parsing speed is therefore also the maximum processing speed of the log processing equipment at that moment. In addition, an upper-limit flag maxFlag can be set for the memory used to store compressed data packets: maxFlag=true when no more memory can be allocated for compressed data packets. Whether maxFlag=true or maxFlag=false, the maximum processing speed of the log processing equipment equals the above Vmax.
Second case: the compression algorithm selection module 160 calculates the average size of the log records in the log data and then reads the previous average value from a second register. The previous average value is the average record size that the compression algorithm selection module 160 calculated for the log data of the previous timestamp. The module calculates the difference between the two averages and takes the absolute value of the difference as one feature change value. The compression algorithm selection module 160 then judges whether this feature change value is greater than or equal to a second preset threshold. The update module 170 updates the second register with the newly calculated average.
To reduce the amount of computation, when calculating the average size, the compression algorithm selection module 160 may take N consecutive log records from the log data and calculate the average size of these N records as the average for the group of log data.
For example, let Si denote the size of each log record. The sizes of the N log records are summed to give SUM(Si), and the average is then Savg=SUM(Si)/N.
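The sampled average Savg=SUM(Si)/N can be computed as in the short sketch below, which assumes each log record is available as a byte string; the function name and the `start` parameter are illustrative.

```python
from typing import List

def average_record_size(records: List[bytes], n: int, start: int = 0) -> float:
    """Savg = SUM(Si) / N over N consecutive records beginning at index `start`."""
    sample = records[start:start + n]
    if not sample:
        raise ValueError("no records in the requested sample")
    return sum(len(r) for r in sample) / len(sample)
```

If fewer than N records remain after `start`, the sketch averages over the records that are available.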
Third case: the compression algorithm selection module 160 calculates the repetition degree of the log data and then reads from a third register the repetition degree of the log data of the previous timestamp. The absolute value of the difference between the two repetition degrees is one of the three feature change values. The compression algorithm selection module 160 then judges whether this value is greater than or equal to a third preset threshold. The update module 170 updates the third register with the newly calculated repetition degree.
Likewise, to reduce the amount of computation, N consecutive log records may be taken from the log data and the repetition degree calculated for these N records. The calculation consists of the following two steps:
1, calculate the repetition degree of each field across the N log records: REi=SUM(1/Ci)/N, where the sum runs over the distinct values of field i and Ci is the number of the N log records that share a given value in that field.
2, calculate the repetition degree of the N log records: REr=SUM(SRi*REi),
where SRi is the number of bytes of field i in the N log records.
For example, as shown in the table below, five consecutive log records are taken from the middle of the log data. Their positions in the log data are 21, 22, 23, 24 and 25, and each record contains the same fields: source IP, destination IP and gateway.
No.   Source IP (field 1)   Destination IP (field 2)   Gateway (field 3)
21    122.202.19.12         202.198.27.102             255.255.255.0
22    122.202.19.12         202.196.30.224             255.255.255.0
23    212.168.17.25         212.178.98.201             255.255.255.0
24    208.108.201.25        224.19.205.01              255.255.255.0
25    208.108.201.25        212.178.98.201             255.255.255.0
For the source IP field, 122.202.19.12 appears twice, 208.108.201.25 appears twice and 212.168.17.25 appears once, so the repetition degree of this field is RE(1)=(1/2+1/2+1)/5=0.4. For the destination IP field, 212.178.98.201 appears twice and the other three values appear once each, so the repetition degree of this field is RE(2)=(1/2+1+1+1)/5=0.7. For the gateway field, 255.255.255.0 appears five times, so the repetition degree of this field is RE(3)=(1/5)/5=0.04.
Each field in the five log records is a 4-byte integer, so the repetition degree of the five log records is REr=RE(1)*4+RE(2)*4+RE(3)*4=0.4*4+0.7*4+0.04*4=4.56.
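The worked example above can be reproduced with a short sketch. It assumes each log record is a dictionary of field name to value and that the byte width of each field is supplied separately; the function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

def field_repetition(values: List[str]) -> float:
    """REi = SUM(1/Ci) / N, summed over the distinct values of one field."""
    counts = Counter(values)
    return sum(1 / c for c in counts.values()) / len(values)

def records_repetition(records: List[Dict[str, str]], field_bytes: Dict[str, int]) -> float:
    """REr = SUM(SRi * REi), where SRi is the byte width of field i."""
    return sum(
        width * field_repetition([record[name] for record in records])
        for name, width in field_bytes.items()
    )
```

Applied to the five records in the table with every field 4 bytes wide, the per-field values are approximately 0.4, 0.7 and 0.04, and the overall value is approximately 4.56, up to floating-point rounding.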
206, if the current compression algorithm needs to be replaced, the log processing equipment selects a target compression algorithm from the preset compression algorithms according to the log data;
The target compression algorithm is a compression algorithm among the preset compression algorithms that differs from the current compression algorithm.
Multiple compression algorithms are preset in the log processing equipment, each with its own compression ratio and compression speed. Some compression algorithms favor compression ratio and others favor compression speed, but the actual compression ratio of one algorithm also varies with the data being compressed; for example, the actual compression ratio of an algorithm on string log data differs from its actual compression ratio on numeric log data. Likewise, the actual compression speed of one algorithm varies with the data being compressed; for example, its actual compression speed on string log data differs from its actual compression speed on numeric log data.
In the embodiments of the present invention, when the compression algorithm selection module 160 of the log processing equipment determines that the current compression algorithm needs to be replaced, it selects the target compression algorithm according to the log data as follows. First, N log records are taken from the log data and compressed with each of the preset compression algorithms to obtain the actual compression ratio Ri of each algorithm.
Second, the actual compression speed Vcom of each compression algorithm is obtained. The compression speed of a compression algorithm is usually expressed in bytes compressed per second, whereas in the embodiments of the present invention the actual compression speed Vcom of each algorithm is the number of log records compressed per second. The compression speed of the algorithm therefore has to be converted to Vcom so that the units agree: Vcom=C.speed/Savg, where Savg is the average size of the N log records and C.speed is the compression speed of the algorithm in bytes per second, so that Vcom is in log records per second.
Finally, the parsing speed Vmax at which the log processing equipment parses log data is used, with Vmax kept in records per second, the same unit as the actual compression speed Vcom. If the log data contains Dcur log records, the time the system takes to process Dcur log records is T0=Dcur/Vmax. With actual compression speed Vcom and actual compression ratio Ri, the time each compression algorithm takes to compress Dcur log records is T1=(Dcur/Ri)/Vmax+Dcur/Vcom. Log data is not lost as long as T1≤T0, so the selected compression algorithm must satisfy T1≤T0. Dividing both sides of T1≤T0 by T0 gives 1/Ri+Vmax/Vcom≤1. It is therefore enough to determine the actual compression ratio Ri and actual compression speed Vcom of each algorithm and the parsing speed Vmax of the log processing equipment to judge whether compression satisfies the no-loss condition. Among all the compression algorithms that satisfy this condition, the one with the largest actual compression ratio Ri is selected, which raises the compression ratio of the log data and reduces storage cost.
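The selection rule above (keep the algorithms with 1/Ri+Vmax/Vcom≤1, then take the largest Ri) can be sketched as follows. The sketch assumes the actual compression ratio and byte-level speed of each algorithm have already been measured on the N sample records; the `AlgoStats` type, the function name, and the algorithm names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AlgoStats:
    name: str
    ratio: float       # actual compression ratio Ri measured on the sample
    byte_speed: float  # C.speed, bytes compressed per second

def select_algorithm(
    stats: List[AlgoStats],
    savg: float,                   # average record size Savg in bytes
    vmax: float,                   # parsing speed Vmax in records per second
    current: Optional[str] = None  # name of the current algorithm, excluded from the choice
) -> Optional[AlgoStats]:
    """Return the admissible algorithm with the largest Ri, or None if none qualifies."""
    best: Optional[AlgoStats] = None
    for s in stats:
        if s.name == current:
            continue  # the target algorithm must differ from the current one
        vcom = s.byte_speed / savg  # Vcom = C.speed / Savg, records per second
        if 1 / s.ratio + vmax / vcom <= 1:  # no-loss condition derived from T1 <= T0
            if best is None or s.ratio > best.ratio:
                best = s
    return best
```

For instance, with Savg=100 bytes and Vmax=1000 records/s, an algorithm with Ri=4 at 200000 bytes/s gives 0.25+0.5=0.75 and qualifies, while one with Ri=8 at 100000 bytes/s gives 0.125+1=1.125 and is rejected despite its higher ratio.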
207, the log processing equipment reads the log data from the cache and compresses it with the target compression algorithm to obtain a compressed data packet;
After step 207 is performed, go to step 209.
Because the log data that the compression algorithm selection module 160 obtains from the log parsing module 120 carries a timestamp, the compression algorithm selection module 160, after selecting the target compression algorithm, can attach this timestamp to the target compression algorithm when sending it to the log compression module 140.
The log compression module 140 in the log processing equipment can then use the timestamp carried with the target compression algorithm to obtain the corresponding log data from the log buffer module 130, and compress that log data with the target compression algorithm to obtain a compressed data packet.
The compressed data packet includes at least the compression algorithm identifier, the size of the compressed log data, the number of log records contained in the log data, and the compressed log data.
The structure of the compressed data packet is as follows:
208, if the current compression algorithm does not need to be replaced, the log processing equipment reads the log data from the cache and compresses it with the current compression algorithm to obtain a compressed data packet;
After step 208 is performed, go to step 209.
When the compression algorithm selection module 160 in the log processing equipment determines that the current compression algorithm does not need to be replaced, it notifies the log compression module 140, and the notification can carry the timestamp of the log data. On receiving the notification, the log compression module 140 reads the log data corresponding to this timestamp from the log buffer module 130 and compresses it with the current compression algorithm to obtain a compressed data packet. The format of the compressed data packet is the same as described above.
209, the log processing equipment stores the compressed data packet in a database.
The storing is performed by the log storage module 150 in the log processing equipment. The database may be a database on a server, and it can be accessed by both the log processing equipment and the log decompression apparatus.
The description above mainly concerns the log processing equipment. The present invention is further described below with respect to the log decompression apparatus. Referring to Fig. 4, Fig. 4 is a schematic flowchart of a log decompression method according to some embodiments of the present invention. As shown in Fig. 4, a log decompression method may include:
401, the log decompression apparatus reads a target compressed data packet from the database;
The compressed data packets produced by the log processing equipment can be stored in the database by category or by time. A retrieval number can be set as needed at storage time, so that when a user needs to process a compressed data packet (including further analysis of its source log data), the packet can be retrieved from the stored compressed data packets by its retrieval number.
402, the log decompression apparatus reads the compression algorithm identifier from the target compressed data packet and selects a target decompression algorithm from the decompression algorithms according to that identifier;
The decompression algorithms are preset in the log decompression apparatus and correspond one-to-one to the compression algorithms in the log processing equipment. That is, for log data compressed with any of the compression algorithms, a corresponding decompression algorithm can be found in the log decompression apparatus.
403, the log decompression apparatus decompresses the compressed log data in the compressed data packet with the target decompression algorithm to restore the source log data.
Decompression in the embodiments of the present invention includes unpacking the compressed data packet, finding the decompression algorithm according to the compression algorithm identifier in the packet, and using that algorithm to decompress the compressed log data in the packet. The embodiments of the present invention mainly describe how the compressed log data is restored.
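Steps 401 to 403 can be illustrated with a round-trip sketch built on Python standard-library codecs. The byte layout (1-byte algorithm identifier, 4-byte compressed size, 4-byte record count, then the payload), the identifier values, the choice of zlib, bz2 and lzma, and the newline record separator are all assumptions made for this illustration and are not the packet structure of the embodiments.

```python
import bz2
import lzma
import struct
import zlib
from typing import List

# Hypothetical header: algorithm id (1 byte), compressed size (4 bytes), record count (4 bytes).
HEADER = struct.Struct(">BII")

# One-to-one mapping between compression and decompression algorithms (ids are assumptions).
COMPRESSORS = {1: zlib.compress, 2: bz2.compress, 3: lzma.compress}
DECOMPRESSORS = {1: zlib.decompress, 2: bz2.decompress, 3: lzma.decompress}

def pack(algo_id: int, records: List[bytes]) -> bytes:
    """Compress newline-joined records and prepend the header."""
    payload = COMPRESSORS[algo_id](b"\n".join(records))
    return HEADER.pack(algo_id, len(payload), len(records)) + payload

def unpack(packet: bytes) -> List[bytes]:
    """Read the algorithm id from the header and restore the source records."""
    algo_id, size, count = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + size]
    records = DECOMPRESSORS[algo_id](payload).split(b"\n")
    if len(records) != count:
        raise ValueError("record count does not match the packet header")
    return records
```

Because the identifier travels inside each packet, packets compressed with different algorithms can be stored side by side and still be restored correctly. The sketch assumes that records are non-empty and contain no newline bytes.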
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of log processing equipment according to an embodiment of the present invention. As shown in Fig. 5, log processing equipment may include:
an extraction module 510, configured to extract target log traffic, according to a preset data extraction rule, from log traffic received and stored in chronological order, where the preset data extraction rule includes extracting the log traffic within a preset time period relative to the current time, or extracting log traffic of a preset size starting from the end point of the previous extraction;
a judgment module 520, configured to obtain log data from the target log traffic and judge, according to the log data, whether the current compression algorithm needs to be replaced;
a selection module 530, configured to, if the current compression algorithm needs to be replaced, select a target compression algorithm from preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that differs from the current compression algorithm;
a log compression module 540, configured to compress the log data according to the target compression algorithm to obtain a compressed data packet;
an update module 550, configured to update the current compression algorithm to the target compression algorithm.
This log processing equipment is the log processing equipment 10 described with reference to Fig. 1a and Fig. 1b. The extraction module 510 can be implemented by the log parsing module 120 in Fig. 1b, the judgment module 520 and the selection module 530 can be implemented by the compression algorithm selection module 160 in Fig. 1b, the log compression module 540 is implemented by the log compression module 140 in Fig. 1b, and the update module 550 is implemented by the update module 170 in Fig. 1b.
The compressed data packet includes at least the compression algorithm identifier, the size of the compressed log data, the number of log records contained in the log data, and the compressed log data.
In some embodiments of the present invention, the log compression module 540 is further configured to compress the log data according to the current compression algorithm if the current compression algorithm does not need to be replaced.
In some embodiments of the present invention, the judgment module 520 is specifically configured to calculate at least one feature change value of the log data according to the log data, and judge whether the at least one feature change value is greater than or equal to a preset threshold.
In other embodiments of the present invention, the judgment module 520 is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data; calculate the difference between this parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, the absolute value of the difference being the feature change value, where the previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, and the previously obtained log data is the log data obtained from the target log traffic of the previous extraction performed according to the data extraction rule; and judge whether the feature change value is greater than or equal to a first preset threshold. The update module 550 is further configured to update the first register with the parsing speed at which the log processing equipment parses the log data.
In other embodiments of the present invention, the judgment module 520 is specifically configured to: extract N consecutive log records from the log data, where N is a natural number greater than or equal to 1; calculate the average size of the N log records; calculate the difference between this average and the previous average indicated by a second register, the absolute value of the difference being the feature change value, where the previous average is the average size of the N consecutive log records extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log traffic of the previous extraction performed according to the data extraction rule; and judge whether the feature change value is greater than or equal to a second preset threshold. The update module 550 is further configured to update the second register with the average size of the N log records.
In other embodiments of the present invention, the judgment module 520 is further specifically configured to: extract N consecutive log records from the log data, where N is a natural number greater than or equal to 1; calculate the repetition degree of the N log records; calculate the difference between this repetition degree and the previous repetition degree indicated by a third register, the absolute value of the difference being the feature change value, where the previous repetition degree is the repetition degree of the N consecutive log records extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log traffic of the previous extraction performed according to the data extraction rule; and judge whether the feature change value is greater than or equal to a third preset threshold. The update module 550 is further configured to update the third register with the repetition degree of the N log records.
In some embodiments of the present invention, the selection module 530 is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data, and calculate the actual compression speed and actual compression ratio of each of the preset compression algorithms; and select the target compression algorithm from the preset compression algorithms according to the parsing speed and the actual compression speed and actual compression ratio of each compression algorithm.
In some embodiments of the present invention, the selection module 530 is further specifically configured to: for each compression algorithm, calculate the ratio of the parsing speed at which the log processing equipment parses the log data to the actual compression speed of the compression algorithm; calculate the sum of this ratio and the reciprocal of the actual compression ratio of the compression algorithm; and select, from all the compression algorithms whose sum is less than or equal to 1, the compression algorithm with the largest actual compression ratio as the target compression algorithm.
Referring to Fig. 6, Fig. 6 is the structural schematic diagram for the log decompression apparatus that some embodiments of the invention provide;Such as Fig. 6 It is shown, a kind of log decompression apparatus can include:
Read module 610 is used to read destination packed data packet from database;
Algorithms selection module 620, for reading compression algorithm identification from destination packed data packet, according to the compression algorithm Mark selection target decompression algorithm from decompression algorithm;
Decompression module 630, for according to target decompression algorithm to the compressed log number in the compressed data packets According to unziping it, reduction obtains Source log data.
Wherein, which is log decompression apparatus 20 described in above-mentioned Fig. 1 a.
Due to carrying compression algorithm identification in compressed data packets, algorithms selection module 620 can be according to the compression algorithm Mark selects corresponding decompression algorithm from preset decompression algorithm, and then decompression module 630 is according to decompression algorithm Compressed daily record data in compressed data packets is unziped it, reduction obtains daily record data.
In addition, some embodiments of the invention provide a log processing system that includes the log processing equipment 10 shown in Fig. 5. The log processing system may further include the log decompression apparatus 20 shown in Fig. 6. For details, refer to the descriptions of the log processing equipment 10 and the log decompression apparatus 20 above, which are not repeated here.
Referring to Fig. 7, Fig. 7 is another schematic structural diagram of the log compression device provided by an embodiment of the invention. The device may include at least one processor 701 (for example a CPU, Central Processing Unit), at least one network interface or other communication interface, a memory 702, and at least one communication bus that provides the connections and communication between these components. The processor 701 executes executable modules stored in the memory, such as computer programs. The memory 702 may include high-speed random access memory (RAM, Random Access Memory) and may also include non-volatile memory, for example at least one magnetic disk storage. The communication connection between the system gateway and at least one other network element is realized through the at least one network interface (wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, and so on.
As shown in Fig. 7, in some embodiments, program instructions are stored in the memory 702 and can be executed by the processor 701. The processor 701 specifically executes the following steps: extracting target log flow, according to a preset data extraction rule, from log flow that is received and stored in chronological order, where the preset data extraction rule includes extracting the log flow within a preset time period based on the current time, or extracting log flow of a preset size starting from the end position of the previous extraction; obtaining log data from the target log flow; judging, according to the log data, whether the current compression algorithm needs to be replaced; if the current compression algorithm needs to be replaced, selecting a target compression algorithm from preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that is different from the current compression algorithm; and compressing the log data according to the target compression algorithm to obtain a compressed data packet, and updating the current compression algorithm to the target compression algorithm.
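The steps executed by the processor can be outlined as a small Python sketch. Everything named here (`extract_by_time`, `extract_by_size`, `LogCompressor`, and the two callbacks) is hypothetical; in particular, treating the preset time period as a window that ends at the current time is an assumption, and the switching test and the selection rule are passed in as callbacks because they are specified separately.

```python
import bz2
import zlib
from typing import Callable, Dict, List, Tuple

Compressor = Callable[[bytes], bytes]


def extract_by_time(stream: List[Tuple[float, str]], now: float,
                    window: float) -> List[str]:
    """Rule 1 (assumed reading): entries stamped within [now - window, now]."""
    return [entry for stamp, entry in stream if now - window <= stamp <= now]


def extract_by_size(stream: bytes, last_end: int, size: int) -> Tuple[bytes, int]:
    """Rule 2: up to `size` bytes starting where the previous extraction ended.
    Returns the chunk and the new end position."""
    chunk = stream[last_end:last_end + size]
    return chunk, last_end + len(chunk)


class LogCompressor:
    def __init__(self, algorithms: Dict[str, Compressor], current: str,
                 needs_switch: Callable[[bytes], bool],
                 select_target: Callable[[bytes, Dict[str, Compressor]], str]):
        self.algorithms = algorithms
        self.current = current
        self.needs_switch = needs_switch
        self.select_target = select_target

    def process(self, log_data: bytes) -> Tuple[str, bytes]:
        """Compress one batch, switching algorithm first if required."""
        name = self.current
        if self.needs_switch(log_data):
            # The target must differ from the current algorithm.
            others = {n: f for n, f in self.algorithms.items() if n != self.current}
            name = self.select_target(log_data, others)
        compressed = self.algorithms[name](log_data)
        self.current = name  # update the current algorithm after compressing
        return name, compressed
```

Keeping the decision and selection logic behind callbacks lets the same loop serve any of the feature-change tests without modification.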
In some embodiments, the processor 701 may also execute the following step: if the current compression algorithm does not need to be replaced, compressing the log data according to the current compression algorithm.
In some embodiments, the processor 701 may also execute the following steps: calculating at least one feature change value of the log data according to the log data, and judging whether the at least one feature change value is greater than or equal to a preset threshold.
In some embodiments, the processor 701 may also execute the following steps: calculating, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data; calculating the difference between that parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, the absolute value of the difference being the feature change value, where the previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judging whether the feature change value is greater than or equal to a first preset threshold, and updating the first register with the parsing speed at which the log processing equipment parses the log data.
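A small sketch of the parsing-speed check follows. The class name `ParseSpeedMonitor` is hypothetical, the attribute `previous` stands in for the first register, and parsing speed is assumed to be data size divided by the time spent parsing. The behavior on the very first batch, when no previous speed exists, is not described in the text; the sketch reports no change in that case.

```python
from typing import Optional


class ParseSpeedMonitor:
    def __init__(self, threshold: float):
        self.threshold = threshold             # first preset threshold
        self.previous: Optional[float] = None  # plays the role of the first register

    def update(self, size_bytes: int, elapsed_seconds: float) -> bool:
        """Return True when |speed - previous speed| >= threshold,
        then store the new speed for the next comparison."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        speed = size_bytes / elapsed_seconds   # assumed definition of parsing speed
        changed = (self.previous is not None
                   and abs(speed - self.previous) >= self.threshold)
        self.previous = speed                  # the register is always updated
        return changed
```

Note that the register is updated whether or not the threshold is reached, so each comparison is against the immediately preceding batch rather than against the batch that last triggered a switch.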
In some embodiments, the processor 701 may also execute the following steps: obtaining N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1; calculating the average size of the N log data entries; calculating the difference between that average and the previous average indicated by a second register, the absolute value of the difference being the feature change value, where the previous average is the average size of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judging whether the feature change value is greater than or equal to a second preset threshold, and updating the second register with the average size of the N log data entries.
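The average-size check can be sketched the same way. The name `AverageSizeMonitor` is hypothetical and `previous` stands in for the second register. The text does not say which N consecutive entries are sampled, so taking the first N of each batch is an assumption, as is reporting no change on the first batch.

```python
from typing import Optional, Sequence


class AverageSizeMonitor:
    def __init__(self, n: int, threshold: float):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.threshold = threshold             # second preset threshold
        self.previous: Optional[float] = None  # plays the role of the second register

    def update(self, entries: Sequence[bytes]) -> bool:
        """Compare the mean size of the first n entries with the stored mean."""
        sample = entries[:self.n]              # assumed choice of N consecutive entries
        if not sample:
            raise ValueError("no log entries to sample")
        average = sum(len(e) for e in sample) / len(sample)
        changed = (self.previous is not None
                   and abs(average - self.previous) >= self.threshold)
        self.previous = average
        return changed
```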
In some embodiments, the processor 701 may also execute the following steps: obtaining N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1; calculating the repetition degree of the N log data entries; calculating the difference between that repetition degree and the previous repetition degree indicated by a third register, the absolute value of the difference being the feature change value, where the previous repetition degree is the repetition degree of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judging whether the feature change value is greater than or equal to a third preset threshold, and updating the third register with the repetition degree of the N log data entries.
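This paragraph does not define how the repetition degree (multiplicity) of the N entries is computed. The sketch below assumes one simple possibility, the fraction of sampled entries that duplicate another entry in the sample; other measures would fit the text equally well. The names `repetition_degree` and `RepetitionMonitor` are hypothetical, and `previous` stands in for the third register.

```python
from typing import Optional, Sequence


def repetition_degree(entries: Sequence[bytes]) -> float:
    """Assumed measure: 1 - (distinct entries / total entries), in [0, 1)."""
    if not entries:
        raise ValueError("no log entries to measure")
    return 1.0 - len(set(entries)) / len(entries)


class RepetitionMonitor:
    def __init__(self, n: int, threshold: float):
        self.n = n
        self.threshold = threshold             # third preset threshold
        self.previous: Optional[float] = None  # plays the role of the third register

    def update(self, entries: Sequence[bytes]) -> bool:
        """Compare the repetition degree of the first n entries with the stored one."""
        degree = repetition_degree(entries[:self.n])
        changed = (self.previous is not None
                   and abs(degree - self.previous) >= self.threshold)
        self.previous = degree
        return changed
```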
In some embodiments, the processor 701 may also execute the following steps: calculating, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data, and calculating the actual compression speed and actual compression ratio of each of the preset compression algorithms; and selecting the target compression algorithm from the preset compression algorithms according to the parsing speed of the log data and the actual compression speed and actual compression ratio of each compression algorithm.
In some embodiments, the processor 701 may also execute the following steps: for each compression algorithm, calculating the ratio of the parsing speed at which the log processing equipment parses the log data to the actual compression speed of that compression algorithm; calculating the sum of that ratio and the reciprocal of the actual compression ratio of the compression algorithm; and, from all compression algorithms whose sum is less than or equal to 1, selecting the one with the largest actual compression ratio as the target compression algorithm.
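Written as a formula, the selection condition reads as follows. The symbols are introduced here for illustration only and do not appear in the original: $v_p$ is the parsing speed, $v_i$ the actual compression speed of algorithm $i$, and $r_i$ its actual compression ratio.

```latex
\frac{v_p}{v_i} + \frac{1}{r_i} \le 1,
\qquad
i^{*} = \arg\max_{\,i \,:\, v_p / v_i + 1 / r_i \,\le\, 1} \; r_i
```

As a worked example with made-up figures: for $v_p = 100$, an algorithm with $v_i = 200$ and $r_i = 4$ gives $0.5 + 0.25 = 0.75 \le 1$ and therefore qualifies, while an algorithm with $v_i = 50$ and $r_i = 8$ gives $2 + 0.125 = 2.125 > 1$ and is excluded despite its higher compression ratio.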
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The log compression method, log decompression method, log processing equipment, log decompression apparatus, and log processing system provided by the invention have been described in detail above. For those of ordinary skill in the art, the specific implementation and the scope of application may vary according to the ideas of the embodiments of the invention. In conclusion, the content of this specification should not be construed as limiting the invention.

Claims (20)

1. A log compression method, characterized by comprising:
Log processing equipment extracts target log flow, according to a preset data extraction rule, from log flow that is received and stored in chronological order, where the preset data extraction rule includes extracting the log flow within a preset time period based on the current time, or extracting log flow of a preset size starting from the end position of the previous extraction;
The log processing equipment obtains log data from the target log flow;
The log processing equipment judges, according to the log data, whether the current compression algorithm needs to be replaced;
If the current compression algorithm needs to be replaced, the log processing equipment selects a target compression algorithm from preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that is different from the current compression algorithm;
The log processing equipment compresses the log data according to the target compression algorithm to obtain a compressed data packet, and updates the current compression algorithm to the target compression algorithm.
2. The method according to claim 1, characterized in that the method further comprises:
If the current compression algorithm does not need to be replaced, the log processing equipment compresses the log data according to the current compression algorithm.
3. The method according to claim 1, characterized in that the log processing equipment judging, according to the log data, whether the current compression algorithm needs to be replaced comprises:
The log processing equipment calculates at least one feature change value of the log data according to the log data, and judges whether the at least one feature change value is greater than or equal to a preset threshold.
4. The method according to claim 3, characterized in that the log processing equipment calculating, according to the log data, at least one feature change value of the log data and judging whether the at least one feature change value is greater than or equal to a preset threshold comprises:
The log processing equipment calculates, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data;
The log processing equipment calculates the difference between the parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, the absolute value of the difference being the feature change value, where the previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule;
The log processing equipment judges whether the feature change value is greater than or equal to a first preset threshold, and updates the first register with the parsing speed at which the log processing equipment parses the log data.
5. The method according to claim 3, characterized in that the log processing equipment calculating, according to the log data, at least one feature change value of the log data and judging whether the at least one feature change value is greater than or equal to a preset threshold comprises:
The log processing equipment extracts N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1;
The log processing equipment calculates the average size of the N log data entries;
The log processing equipment calculates the difference between the average and the previous average indicated by a second register, the absolute value of the difference being the feature change value, where the previous average is the average size of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule;
The log processing equipment judges whether the feature change value is greater than or equal to a second preset threshold, and updates the second register with the average size of the N log data entries.
6. The method according to claim 3, characterized in that the log processing equipment calculating, according to the log data, at least one feature change value of the log data and judging whether the at least one feature change value reaches a preset threshold comprises:
The log processing equipment extracts N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1;
The log processing equipment calculates the repetition degree of the N log data entries;
The log processing equipment calculates the difference between the repetition degree and the previous repetition degree indicated by a third register, the absolute value of the difference being the feature change value, where the previous repetition degree is the repetition degree of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule;
The log processing equipment judges whether the feature change value is greater than or equal to a third preset threshold, and updates the third register with the repetition degree of the N log data entries.
7. The method according to any one of claims 1 to 6, characterized in that the log processing equipment selecting a target compression algorithm from the preset compression algorithms according to the log data comprises:
The log processing equipment calculates, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data, and calculates the actual compression speed and actual compression ratio of each of the preset compression algorithms;
The log processing equipment selects the target compression algorithm from the preset compression algorithms according to the parsing speed at which the log processing equipment parses the log data and the actual compression speed and actual compression ratio of each compression algorithm.
8. The method according to claim 7, characterized in that the log processing equipment selecting the target compression algorithm from the preset compression algorithms according to the parsing speed at which the log processing equipment parses the log data and the actual compression speed and actual compression ratio of each compression algorithm comprises:
For each compression algorithm, the log processing equipment calculates the ratio of the parsing speed at which the log processing equipment parses the log data to the actual compression speed of that compression algorithm; the log processing equipment calculates the sum of that ratio and the reciprocal of the actual compression ratio of the compression algorithm;
The log processing equipment selects, from all compression algorithms whose sum is less than or equal to 1, the compression algorithm with the largest actual compression ratio as the target compression algorithm.
9. The method according to claim 1, characterized in that the compressed data packet includes at least a compression algorithm identifier, the size of the compressed log data, the number of log data entries that the log data includes, and the compressed log data.
10. Log processing equipment, characterized by comprising:
An extraction module, configured to extract target log flow, according to a preset data extraction rule, from log flow that is received and stored in chronological order, where the preset data extraction rule includes extracting the log flow within a preset time period based on the current time, or extracting log flow of a preset size starting from the end position of the previous extraction;
A judgment module, configured to obtain log data from the target log flow and to judge, according to the log data, whether the current compression algorithm needs to be replaced;
A selecting module, configured to select, if the current compression algorithm needs to be replaced, a target compression algorithm from preset compression algorithms according to the log data, where the target compression algorithm is a compression algorithm among the preset compression algorithms that is different from the current compression algorithm;
A log compression module, configured to compress the log data according to the target compression algorithm to obtain a compressed data packet;
An update module, configured to update the current compression algorithm to the target compression algorithm.
11. The log processing equipment according to claim 10, characterized in that
The log compression module is further configured to compress the log data according to the current compression algorithm if the current compression algorithm does not need to be replaced.
12. The log processing equipment according to claim 10, characterized in that
The judgment module is specifically configured to calculate at least one feature change value of the log data according to the log data, and to judge whether the at least one feature change value is greater than or equal to a preset threshold.
13. The log processing equipment according to claim 12, characterized in that
The judgment module is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data; calculate the difference between the parsing speed and the previous parsing speed indicated by a first register in the log processing equipment, the absolute value of the difference being the feature change value, where the previous parsing speed is the parsing speed at which the log processing equipment processed the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judge whether the feature change value is greater than or equal to a first preset threshold;
The update module is further configured to update the first register with the parsing speed at which the log processing equipment parses the log data.
14. The log processing equipment according to claim 12, characterized in that
The judgment module is specifically configured to: extract N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1; calculate the average size of the N log data entries; calculate the difference between the average and the previous average indicated by a second register, the absolute value of the difference being the feature change value, where the previous average is the average size of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judge whether the feature change value is greater than or equal to a second preset threshold;
The update module is further configured to update the second register with the average size of the N log data entries.
15. The log processing equipment according to claim 12, characterized in that
The judgment module is specifically configured to: extract N consecutive log data entries from the log data, where N is a natural number greater than or equal to 1; calculate the repetition degree of the N log data entries; calculate the difference between the repetition degree and the previous repetition degree indicated by a third register, the absolute value of the difference being the feature change value, where the previous repetition degree is the repetition degree of the N consecutive log data entries extracted from the previously obtained log data, and the previously obtained log data is the log data obtained from the target log flow that was extracted the previous time according to the data extraction rule; and judge whether the feature change value is greater than or equal to a third preset threshold;
The update module is further configured to update the third register with the repetition degree of the N log data entries.
16. The log processing equipment according to any one of claims 10 to 15, characterized in that
The selecting module is specifically configured to: calculate, according to the size of the log data, the parsing speed at which the log processing equipment parses the log data, and calculate the actual compression speed and actual compression ratio of each of the preset compression algorithms; and select the target compression algorithm from the preset compression algorithms according to the parsing speed at which the log processing equipment parses the log data and the actual compression speed and actual compression ratio of each compression algorithm.
17. The log processing equipment according to claim 16, characterized in that
The selecting module is further configured to: for each compression algorithm, calculate the ratio of the parsing speed at which the log processing equipment parses the log data to the actual compression speed of that compression algorithm; calculate the sum of that ratio and the reciprocal of the actual compression ratio of the compression algorithm; and select, from all compression algorithms whose sum is less than or equal to 1, the compression algorithm with the largest actual compression ratio as the target compression algorithm.
18. The log processing equipment according to claim 10, characterized in that the compressed data packet includes at least a compression algorithm identifier, the size of the compressed log data, the number of log data entries that the log data includes, and the compressed log data.
19. A log processing system, characterized by comprising the log processing equipment according to any one of claims 10 to 18.
20. The log processing system according to claim 19, characterized in that the log processing system further includes a log decompression apparatus;
The log decompression apparatus is connected to the log processing equipment and is configured to obtain a compressed data packet from the log processing equipment, obtain the corresponding decompression algorithm according to the compression algorithm identifier in the compressed data packet, and decompress the compressed log data in the compressed data packet according to that decompression algorithm.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510726130.8A CN106649336B (en) 2015-10-30 2015-10-30 A kind of log compression method and log processing equipment, log processing system

Publications (2)

Publication Number Publication Date
CN106649336A CN106649336A (en) 2017-05-10
CN106649336B true CN106649336B (en) 2019-10-25

Family

ID=58830631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510726130.8A Active CN106649336B (en) 2015-10-30 2015-10-30 A kind of log compression method and log processing equipment, log processing system

Country Status (1)

Country Link
CN (1) CN106649336B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant