Embodiment
It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented. In addition, the terms "comprise" and "have", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
Embodiment 1
This embodiment of the present invention provides a method for processing database logs.
Fig. 1 is a flowchart of the method for processing database logs according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S11: obtain a log file.
Specifically, in step S11, the log file generated by the server is obtained.
Step S13: read the splitting rule for the log file.
Specifically, in step S13, the preset rule for splitting the log file is read.
Step S15: split the log file according to the splitting rule to obtain at least two preprocessed log files.
Specifically, in step S15, the whole log file obtained is split according to the splitting rule that was read, dividing it into two or more preprocessed log files for subsequent use.
Step S17: write the at least two preprocessed log files into a preprocessing database in sequence.
Specifically, in step S17, the preprocessed log files produced by the split are read one by one and written in sequence into the preprocessing database used to store preprocessed log files.
Through steps S11 to S17, a log file that was originally a single very large file is split into several smaller preprocessed log files, and the preprocessed log files are stored in the preprocessing database.
In practice, reading a single huge log file requires substantial hardware resources, such as memory and disk cache capacity, and is also limited by disk read/write speed, so reading efficiency is very low. The log file is therefore split into several small preprocessed log files. Because each of these files is small, the computer can read it quickly; the preprocessed log files are read one after another and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeated read and write operations on the whole log file make preprocessing of web logs time-consuming and log processing inefficient, and achieves the effect of improving the processing speed and processing efficiency of log files.
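As an illustration only, the flow of steps S11 to S17 might be sketched as follows. The specification does not name a database engine, a table layout, or a function interface; the use of SQLite, the table name `preprocess_log`, and the functions `process_log` and `halves` are all assumptions made for this sketch.

```python
import sqlite3
from typing import Callable, List

# A splitting rule is modeled as a function from log lines to chunks (assumption).
Splitter = Callable[[List[str]], List[List[str]]]


def process_log(lines: List[str], splitter: Splitter, conn: sqlite3.Connection) -> int:
    """Split the obtained log lines and store the chunks one after another."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preprocess_log (chunk_id INTEGER, line TEXT)"
    )
    chunks = splitter(lines)  # S13 and S15: apply the splitting rule
    for chunk_id, chunk in enumerate(chunks):  # S17: write the chunks in sequence
        conn.executemany(
            "INSERT INTO preprocess_log (chunk_id, line) VALUES (?, ?)",
            [(chunk_id, line) for line in chunk],
        )
        conn.commit()  # each small chunk is committed on its own
    return len(chunks)


def halves(lines: List[str]) -> List[List[str]]:
    """Hypothetical splitting rule: cut the log into two consecutive parts."""
    mid = len(lines) // 2
    return [lines[:mid], lines[mid:]]
```

Passing the splitting rule in as a function keeps the storage step independent of how the log is cut, which mirrors the separation between reading the rule and applying it.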
Preferably, in the embodiments provided by this application, step S17 of writing the at least two preprocessed log files into the preprocessing database in sequence comprises:
Step S171: read the file attributes of the log file, where the file attributes include at least a log record start time and a log record end time.
Step S173: modify the time configuration information of the preprocessing database according to the file attributes.
Step S175: read, according to the time configuration information, the preprocessed log files corresponding to the time configuration information.
Step S177: write the preprocessed log files corresponding to the time configuration information into the preprocessing database in sequence.
Specifically, in steps S171 to S177, the log record start time and log record end time of the log file are obtained. The time configuration information of the preprocessing database, which may include a preprocessing start time and a preprocessing end time, is modified according to these two times. The preprocessing database then determines, according to the time configuration information, the preprocessed log files that match it, and the files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has preprocessed log file attributes, which include at least a preprocessed log record start time and a preprocessed log record end time. The preprocessed log record start time may be determined from the time of the first log record in the preprocessed log file, and the preprocessed log record end time from the time of its last log record.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose preprocessed log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record start time is greater than or equal to the preprocessing start time and whose preprocessed log record end time is less than or equal to the preprocessing end time. Those skilled in the art may choose among these criteria according to actual needs, and details are not repeated here.
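The three alternative matching criteria can be written out as a small predicate. This is a minimal sketch: the `Window` type, the function name `matches`, and the mode names are hypothetical labels introduced here, not terms from the specification.

```python
from datetime import datetime
from typing import NamedTuple


class Window(NamedTuple):
    """A closed time interval (start and end inclusive)."""
    start: datetime
    end: datetime


def matches(chunk: Window, config: Window, mode: str = "contained") -> bool:
    """Check one preprocessed log file against the time configuration.

    chunk  - record start/end times of the preprocessed log file
    config - preprocessing start/end times from the time configuration
    """
    if mode == "start_in_window":  # first criterion: record start time lies in the window
        return config.start <= chunk.start <= config.end
    if mode == "end_in_window":    # second criterion: record end time lies in the window
        return config.start <= chunk.end <= config.end
    if mode == "contained":        # third criterion: whole file lies in the window
        return config.start <= chunk.start and chunk.end <= config.end
    raise ValueError(f"unknown matching mode: {mode}")
```

A file that straddles the end of the window is selected only under the first criterion, which is why the choice among the three affects which boundary files are loaded.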
In some scenarios, to reduce the database-side storage occupied by log files, the original log file may be deleted after it has been split into preprocessed log files. If the log record start time and log record end time of the log file are needed later, they may be read and recorded before the original log file is deleted, so that subsequent steps can retrieve them.
Preferably, as shown in Fig. 2, in the embodiments provided by this application, after step S17 writes the preprocessed log files into the preprocessing database in sequence, the method further comprises:
Step S19: calculate an analysis period according to the time configuration information.
Step S21: read the preprocessed log files in the preprocessing database according to the analysis period.
Step S23: analyze the preprocessed log files to obtain a standard log data table.
Step S25: write the standard log data table into a log database.
Specifically, in steps S19 to S25, the preprocessed log files in the preprocessing database are processed further. The analysis period to be processed is calculated from the preprocessing start time and preprocessing end time in the time configuration information. Because the preprocessing database holds preprocessed log files from many different times, the time interval defined by the analysis period is used to read only those preprocessed log files in the preprocessing database that fall within it. These files are analyzed to obtain a standard log data table that can be stored directly in a database and used for analysis, and the table is stored in the log database so that the log content can be read and analyzed at any time.
In practice, because preprocessed log files are processed according to the analysis period, the preprocessed log files in the preprocessing database can be processed according to the load on the server. The analysis period can also be modified in real time so that the preprocessed log files are processed in real time. Alternatively, a predetermined time interval can be used to distribute the processing of the preprocessed log files evenly after each log file is generated. In this way, the analysis of log files can be controlled to suit actual conditions simply by modifying the analysis period.
The analysis period is the time interval defined by a processing start time and a processing end time. The log record start time may be used directly as the processing start time of the analysis period, and the log record end time as its processing end time, so that each log file is analyzed promptly.
Preferably, in the embodiments provided by this application, step S23 of analyzing the preprocessed log files to obtain the standard log data table comprises:
Step S231: extract the log content of the preprocessed log files to obtain target data.
Step S233: convert the target data to a preset target data type to obtain a processing result.
Step S235: aggregate the processing results to generate the standard log data table.
Specifically, in steps S231 to S235, the preprocessed log files are read and their content is extracted as data that can be stored in a database, yielding the target data. The target data is then converted to a unified data type. The converted target data is aggregated to generate the standard log data table, which is stored in the log database so that it can be retrieved at any time.
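Steps S231 to S235 might look like the following sketch. The specification does not define a log line format or a set of target types, so the whitespace-separated layout `time status bytes`, the field names, and the `TARGET_TYPES` mapping are assumptions chosen only to make the example concrete.

```python
from datetime import datetime
from typing import Dict, List

# Preset target data types, one converter per field (assumed fields).
TARGET_TYPES = {
    "time": datetime.fromisoformat,
    "status": int,
    "bytes": int,
}


def extract(line: str) -> Dict[str, str]:
    """S231: pull the raw fields out of one log line (assumed 3-field format)."""
    time, status, size = line.split()
    return {"time": time, "status": status, "bytes": size}


def convert(record: Dict[str, str]) -> Dict[str, object]:
    """S233: convert every field to its preset target data type."""
    return {field: TARGET_TYPES[field](value) for field, value in record.items()}


def build_table(lines: List[str]) -> List[Dict[str, object]]:
    """S235: aggregate the converted records into one table (a list of rows)."""
    return [convert(extract(line)) for line in lines if line.strip()]
```

Each row of the resulting list has uniform, typed columns, so it could be inserted into a log database table without further parsing.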
Preferably, as shown in Fig. 3, in the embodiments provided by this application, before step S13 reads the splitting rule for the log file, the method further comprises:
Step S121: obtain the size of the log file.
Step S123: read a preset size threshold, where the size threshold is used to judge the size of the log file.
Step S125: determine, according to the size of the log file and the size threshold, whether to split the log file;
where, when the size of the log file is greater than the size threshold, it is determined that the log file is to be split;
and when the size of the log file is less than or equal to the size threshold, it is determined that the log file is not to be split.
Specifically, in steps S121 to S125, the size of the log file is evaluated. When the size of the log file is greater than the preset size threshold, the log file is split. When the size of the log file is less than or equal to the preset size threshold, the log file is not split.
In practice, at night or during other idle periods, website traffic is low, so the log files produced are relatively small, and processing such small log files does not place a heavy burden on the system. This determining step is therefore added to decide whether the log file needs to be split.
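The decision in steps S121 to S125 reduces to a single comparison. The sketch below assumes sizes are measured in bytes; the function names are hypothetical.

```python
import os


def needs_split(size_bytes: int, threshold_bytes: int) -> bool:
    """S125: split only when the log file is strictly larger than the threshold."""
    return size_bytes > threshold_bytes


def file_needs_split(path: str, threshold_bytes: int) -> bool:
    """S121 + S125: look up the file size on disk, then apply the threshold."""
    return needs_split(os.path.getsize(path), threshold_bytes)
```

A file whose size exactly equals the threshold is left unsplit, in line with the "less than or equal to" branch of step S125.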
Preferably, in the embodiments provided by this application, the splitting rule for the log file includes at least a fixed-count splitting method and a fixed-size splitting method.
Preferably, in the embodiments provided by this application, when the splitting rule is the fixed-count splitting method, step S15 of splitting the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Step S151a: read a preset split count.
Step S153a: split the log file evenly according to the split count to obtain a fixed number of preprocessed log files.
Specifically, in steps S151a and S153a, the log file is split by the fixed-count splitting method: the preset split count is read, and the whole log file is divided evenly into that number of preprocessed log files.
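A minimal sketch of the fixed-count splitting method follows. It assumes the log is split on line boundaries and that "evenly" means chunk lengths differ by at most one line; the specification does not say how a remainder is distributed, so that detail is an assumption, as is the function name.

```python
from typing import List


def split_fixed_count(lines: List[str], count: int) -> List[List[str]]:
    """Divide the log lines into `count` consecutive chunks of near-equal length."""
    if count < 1:
        raise ValueError("split count must be at least 1")
    base, extra = divmod(len(lines), count)
    chunks, start = [], 0
    for i in range(count):
        # The first `extra` chunks each take one additional line.
        end = start + base + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks
```

For example, seven lines split three ways yield chunks of three, two, and two lines, and the original order of the records is preserved.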
Preferably, in the embodiments provided by this application, when the splitting rule is the fixed-size splitting method, step S15 of splitting the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Step S151b: read a preset split size.
Step S153b: split the log file according to the split size to obtain several preprocessed log files of equal size.
Specifically, in steps S151b and S153b, the log file is split by the fixed-size splitting method: the preset size of a single preprocessed log file is read, and the log file is split sequentially into preprocessed log files of equal size. In practice, when most log files are split, the portion left at the end is smaller than the preset split size; in that case, the remaining log content may be written out as one more preprocessed log file.
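The fixed-size splitting method, including the handling of the final remainder, could be sketched as below. To avoid cutting a log record in half, this version closes a chunk at the last line boundary that fits within the split size; that line alignment is an assumption of the sketch, not something the specification states, and the function name is hypothetical.

```python
from typing import List


def split_fixed_size(lines: List[str], chunk_bytes: int) -> List[List[str]]:
    """Fill chunks sequentially up to `chunk_bytes`; the remainder forms a last chunk."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        # Close the current chunk when the next line would exceed the split size.
        if current and used + size > chunk_bytes:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += size
    if current:  # leftover content smaller than the split size
        chunks.append(current)
    return chunks
```

A single line longer than the split size is kept whole in its own chunk rather than being truncated.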
In a practical application, the process can be divided into the following steps:
Step 1: split the log file. Use a log splitting tool to split a large log file (for example, one larger than 20 GB) into several preprocessed log files M1, M2, ..., Mn.
Step 2: modify the first entry time. Stop the system's logreader task schedule and its schedulework task schedule, and modify the two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to the preprocessing start time and preprocessing end time in the logreader configuration file, read the corresponding small preprocessed log files into the preprocessing database (receiver database) one by one.
Step 3: modify the second entry time.
Configure the time at which the preprocessed log files are analyzed (postprocess time), setting it to the nearest whole hour (for example, 10:00, 11:00, or 12:00) later than the time at which all of the split preprocessed log files have been read in. Run the step that analyzes the preprocessed log files (postprocess). After these steps are completed, store the results using data warehouse technology (ETL).
Step 4: read log files in a loop.
Repeat steps 1 to 3 for the other log files until no large log file remains.
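The scheduling rule in step 3, choosing the nearest whole hour later than the moment all preprocessed log files have been read in, can be computed as follows. The function name is hypothetical, and the sketch reads "later than" strictly, so a load that finishes exactly on the hour is scheduled for the following hour.

```python
from datetime import datetime, timedelta


def next_whole_hour(finished: datetime) -> datetime:
    """Return the first whole hour strictly after `finished`."""
    truncated = finished.replace(minute=0, second=0, microsecond=0)
    return truncated + timedelta(hours=1)
```

Under this reading, a load that finishes at 09:42 would have its postprocess time set to 10:00.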
After splitting the log file, the present invention controls the relevant points in time at which the preprocessed log files are analyzed, which speeds up insertion into the log database and overcomes the memory and performance shortcomings of reading a log in all at once.
Embodiment 2
This embodiment of the present invention further provides an apparatus for processing database logs. As shown in Fig. 4, the apparatus may comprise a first acquisition module 31, a first reading module 33, a splitting module 35, and a first storage module 37.
The first acquisition module 31 is configured to obtain a log file.
Specifically, the first acquisition module 31 obtains the log file generated by the server.
The first reading module 33 is configured to read the splitting rule for the log file.
Specifically, the first reading module 33 reads the preset rule for splitting the log file.
The splitting module 35 is configured to split the log file according to the splitting rule to obtain at least two preprocessed log files.
Specifically, the splitting module 35 splits the whole log file obtained according to the splitting rule that was read, dividing it into two or more preprocessed log files for subsequent use.
The first storage module 37 is configured to write the at least two preprocessed log files into a preprocessing database in sequence.
Specifically, the first storage module 37 reads the preprocessed log files produced by the split one by one and writes them in sequence into the preprocessing database used to store preprocessed log files.
Through the first acquisition module 31, the first reading module 33, the splitting module 35, and the first storage module 37, a log file that was originally a single very large file is split into several smaller preprocessed log files, and the preprocessed log files are stored in the preprocessing database.
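One way to picture how the four modules cooperate is as pluggable components of a single object. This is only a structural sketch: the class name `LogProcessingDevice`, the callable signatures, and the choice to model the splitting rule as an integer are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LogProcessingDevice:
    acquire: Callable[[], List[str]]                    # first acquisition module 31
    read_rule: Callable[[], int]                        # first reading module 33
    split: Callable[[List[str], int], List[List[str]]]  # splitting module 35
    store: Callable[[List[str]], None]                  # first storage module 37

    def run(self) -> int:
        """Obtain the log, read the rule, split, then store each chunk in turn."""
        lines = self.acquire()
        rule = self.read_rule()
        chunks = self.split(lines, rule)
        for chunk in chunks:
            self.store(chunk)
        return len(chunks)
```

Because each module is an independent component, any one of them can be replaced, for instance swapping the splitting rule, without changing the others.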
In practice, reading a single huge log file requires substantial hardware resources, such as memory and disk cache capacity, and is also limited by disk read/write speed, so reading efficiency is very low. The log file is therefore split into several small preprocessed log files. Because each of these files is small, the computer can read it quickly; the preprocessed log files are read one after another and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeated read and write operations on the whole log file make preprocessing of web logs time-consuming and log processing inefficient, and achieves the effect of improving the processing speed and processing efficiency of log files.
Further, when writing the at least two preprocessed log files into the preprocessing database in sequence, the first storage module 37 also performs the following:
First, read the file attributes of the log file, where the file attributes include at least a log record start time and a log record end time.
Next, modify the time configuration information of the preprocessing database according to the file attributes.
Then, read, according to the time configuration information, the preprocessed log files corresponding to the time configuration information.
Finally, write the preprocessed log files corresponding to the time configuration information into the preprocessing database in sequence.
Specifically, the first storage module 37 obtains the log record start time and log record end time of the log file. The time configuration information of the preprocessing database, which may include a preprocessing start time and a preprocessing end time, is modified according to these two times. The preprocessing database then determines, according to the time configuration information, the preprocessed log files that match it, and the files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has preprocessed log file attributes, which include at least a preprocessed log record start time and a preprocessed log record end time. The preprocessed log record start time may be determined from the time of the first log record in the preprocessed log file, and the preprocessed log record end time from the time of its last log record.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose preprocessed log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record start time is greater than or equal to the preprocessing start time and whose preprocessed log record end time is less than or equal to the preprocessing end time. Those skilled in the art may choose among these criteria according to actual needs, and details are not repeated here.
In some scenarios, to reduce the database-side storage occupied by log files, the original log file may be deleted after it has been split into preprocessed log files. If the log record start time and log record end time of the log file are needed later, they may be read and recorded before the original log file is deleted, so that subsequent steps can retrieve them.
Preferably, as shown in Fig. 5, in the embodiments provided by this application, the apparatus further comprises a calculation module 39, a second reading module 41, a processing module 43, and a second storage module 45.
The calculation module 39 is configured to calculate an analysis period according to the time configuration information;
the second reading module 41 is configured to read the preprocessed log files in the preprocessing database according to the analysis period;
the processing module 43 is configured to analyze the preprocessed log files to obtain a standard log data table;
and the second storage module 45 is configured to write the standard log data table into a log database.
Specifically, through the calculation module 39, the second reading module 41, the processing module 43, and the second storage module 45, the preprocessed log files in the preprocessing database are processed further. The analysis period to be processed is calculated from the preprocessing start time and preprocessing end time in the time configuration information. Because the preprocessing database holds preprocessed log files from many different times, the time interval defined by the analysis period is used to read only those preprocessed log files in the preprocessing database that fall within it. These files are analyzed to obtain a standard log data table that can be stored directly in a database and used for analysis, and the table is stored in the log database so that the log content can be read and analyzed at any time.
In practice, because preprocessed log files are processed according to the analysis period, the preprocessed log files in the preprocessing database can be processed according to the load on the server. The analysis period can also be modified in real time so that the preprocessed log files are processed in real time. Alternatively, a predetermined time interval can be used to distribute the processing of the preprocessed log files evenly after each log file is generated. In this way, the analysis of log files can be controlled to suit actual conditions simply by modifying the analysis period.
The analysis period is the time interval defined by a processing start time and a processing end time. The log record start time may be used directly as the processing start time of the analysis period, and the log record end time as its processing end time, so that each log file is analyzed promptly.
Further, when analyzing the preprocessed log files to obtain the standard log data table, the processing module 43 also performs the following:
First, extract the log content of the preprocessed log files to obtain target data.
Then, convert the target data to a preset target data type to obtain a processing result.
Finally, aggregate the processing results to generate the standard log data table.
Specifically, the processing module 43 reads the preprocessed log files and extracts their content as data that can be stored in a database, yielding the target data. The target data is then converted to a unified data type. The converted target data is aggregated to generate the standard log data table, which is stored in the log database so that it can be retrieved at any time.
Preferably, as shown in Fig. 6, in the embodiments provided by this application, the apparatus further comprises a second acquisition module 321, a third reading module 323, and a determination module 325.
The second acquisition module 321 is configured to obtain the size of the log file.
The third reading module 323 is configured to read a preset size threshold, where the size threshold is used to judge the size of the log file.
The determination module 325 is configured to determine, according to the size of the log file and the size threshold, whether to split the log file;
where, when the size of the log file is greater than the size threshold, it is determined that the log file is to be split;
and when the size of the log file is less than or equal to the size threshold, it is determined that the log file is not to be split.
Specifically, through the second acquisition module 321, the third reading module 323, and the determination module 325, the size of the log file is evaluated. When the size of the log file is greater than the preset size threshold, the log file is split. When the size of the log file is less than or equal to the preset size threshold, the log file is not split.
In practice, at night or during other idle periods, website traffic is low, so the log files produced are relatively small, and processing such small log files does not place a heavy burden on the system. This determination is therefore added to decide whether the log file needs to be split.
Further, the splitting rule for the log file includes at least a fixed-count splitting method and a fixed-size splitting method.
Further, when the splitting rule is the fixed-count splitting method, the splitting of the log file by the splitting module 35 according to the splitting rule to obtain at least two preprocessed log files comprises:
Step 1: read a preset split count.
Step 2: split the log file evenly according to the split count to obtain a fixed number of preprocessed log files.
Specifically, in steps 1 and 2, the log file is split by the fixed-count splitting method: the preset split count is read, and the whole log file is divided evenly into that number of preprocessed log files.
Further, in the embodiments provided by this application, when the splitting rule is the fixed-size splitting method, the splitting of the log file by the splitting module 35 according to the splitting rule to obtain at least two preprocessed log files comprises:
Step 1: read a preset split size.
Step 2: split the log file according to the split size to obtain several preprocessed log files of equal size.
Specifically, in steps 1 and 2, the log file is split by the fixed-size splitting method: the preset size of a single preprocessed log file is read, and the log file is split sequentially into preprocessed log files of equal size. In practice, when most log files are split, the portion left at the end is smaller than the preset split size; in that case, the remaining log content may be written out as one more preprocessed log file.
In a practical application, the process can be divided into the following steps:
Step 1: split the log file. Use a log splitting tool to split a large log file (for example, one larger than 20 GB) into several preprocessed log files M1, M2, ..., Mn.
Step 2: modify the first entry time. Stop the system's logreader task schedule and its schedulework task schedule, and modify the two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to the preprocessing start time and preprocessing end time in the logreader configuration file, read the corresponding small preprocessed log files into the preprocessing database (receiver database) one by one.
Step 3: modify the second entry time.
Configure the time at which the preprocessed log files are analyzed (postprocess time), setting it to the nearest whole hour (for example, 10:00, 11:00, or 12:00) later than the time at which all of the split preprocessed log files have been read in. Run the step that analyzes the preprocessed log files (postprocess). After these steps are completed, store the results using data warehouse technology (ETL).
Step 4: read log files in a loop.
Repeat steps 1 to 3 for the other log files until no large log file remains.
After splitting the log file, the present invention controls the relevant points in time at which the preprocessed log files are analyzed, which speeds up insertion into the log database and overcomes the memory and performance shortcomings of reading a log in all at once.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the order of actions described, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant description of the other embodiments.
In the several embodiments provided by this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between apparatuses or units may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a portable hard disk, a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.