CN104391954A - Database log processing method and device - Google Patents

Database log processing method and device

Info

Publication number
CN104391954A
Authority
CN
China
Prior art keywords
journal file
service
capacity
journal
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410709417.5A
Other languages
Chinese (zh)
Other versions
CN104391954B (en)
Inventor
戴飞
张同欣
刘凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201410709417.5A priority Critical patent/CN104391954B/en
Publication of CN104391954A publication Critical patent/CN104391954A/en
Application granted granted Critical
Publication of CN104391954B publication Critical patent/CN104391954B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a database log processing method and device. The method comprises: acquiring a log file; reading a segmentation rule of the log file; segmenting the log file according to the segmentation rule to obtain at least two preprocessed log files; and writing the at least two preprocessed log files into a preprocessing database in sequence. The method solves the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient.

Description

Database log processing method and device
Technical field
The present invention relates to the field of computers, and in particular to a database log processing method and device.
Background art
With the development of the Internet, website traffic and data volume have both risen sharply, and a single server can no longer meet application needs. A common approach at present is therefore load balancing across a computer cluster: one or more front-end load-balancing servers distribute the load across a group of back-end servers, and the back-end servers receive the requests and record logs.
As website traffic grows, the log files in which servers record access requests keep growing as well. The requirements on log file processing time, however, have not been relaxed. How to improve the efficiency of log file processing has therefore become a problem that must be faced.
A typical log processing method reads the raw log file directly and then analyzes the data in it. Because the log file itself is very large and reading is limited by disk read/write speed, the log file is read very slowly. Moreover, every different analysis of the log file requires reading all of the raw logs again, which makes this approach very inefficient.
No effective solution has yet been proposed for the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient.
Summary of the invention
The main purpose of the present invention is to provide a database log processing method and device, so as to solve the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient.
To achieve this purpose, according to one aspect of the embodiments of the present invention, a database log processing method is provided. The method comprises: acquiring a log file; reading a segmentation rule of the log file; segmenting the log file according to the segmentation rule to obtain at least two preprocessed log files; and writing the at least two preprocessed log files into a preprocessing database in sequence.
To achieve this purpose, according to another aspect of the embodiments of the present invention, a database log processing device is provided. The device comprises: a first acquisition module for acquiring a log file; a first reading module for reading a segmentation rule of the log file; a segmentation module for segmenting the log file according to the segmentation rule to obtain at least two preprocessed log files; and a first storage module for writing the at least two preprocessed log files into a preprocessing database in sequence.
According to the embodiments of the invention, a log file is acquired; a segmentation rule of the log file is read; the log file is segmented according to the segmentation rule to obtain at least two preprocessed log files; and the at least two preprocessed log files are written into a preprocessing database in sequence. This solves the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient, and it improves the speed and efficiency of log file processing.
Brief description of the drawings
The accompanying drawings, which form part of this application, are provided for a further understanding of the present invention. The illustrative embodiments of the present invention and their description serve to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a database log processing method according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a preferred database log processing method according to Embodiment 1 of the present invention;
Fig. 3 is a flowchart of a preferred database log processing method according to Embodiment 1 of the present invention;
Fig. 4 is a structural diagram of a database log processing device according to Embodiment 2 of the present invention;
Fig. 5 is a structural diagram of a preferred database log processing device according to Embodiment 2 of the present invention; and
Fig. 6 is a structural diagram of a preferred database log processing device according to Embodiment 2 of the present invention.
Detailed description
It should be noted that, where no conflict arises, the embodiments in this application and the features in those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
Embodiment 1
An embodiment of the present invention provides a database log processing method.
Fig. 1 is a flowchart of the database log processing method according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S11: acquire a log file.
Specifically, in step S11, the log file generated by the server is acquired.
Step S13: read the segmentation rule of the log file.
Specifically, in step S13, the preset segmentation rule used to segment the log file is read.
Step S15: segment the log file according to the segmentation rule to obtain at least two preprocessed log files.
Specifically, in step S15, the whole acquired log file is segmented according to the segmentation rule that was read, dividing it into two or more preprocessed log files for subsequent use.
Step S17: write the at least two preprocessed log files into the preprocessing database in sequence.
Specifically, in step S17, the preprocessed log files produced by the segmentation are read one by one and written in sequence into the preprocessing database used to store preprocessed log files.
Through steps S11 to S17, a log file that was originally a single very large file is divided into several smaller preprocessed log files, and the preprocessed log files obtained by segmentation are stored in the preprocessing database.
In practice, reading a single huge log file requires substantial hardware resources, such as memory and disk cache capacity, and is also limited by disk read/write speed, so reading is very inefficient. The log file is therefore segmented into several small preprocessed log files. Because each of these files is small, a computer can read it quickly; each preprocessed log file is read in turn and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient, and it improves the speed and efficiency of log file processing.
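As a rough illustration of steps S11 to S17, the following Python sketch segments a list of log lines and writes the resulting pieces into a database one after another. The in-memory SQLite database, the table name `preprocessed`, the rule key `lines_per_chunk` and the sample log lines are all assumptions made for this example; the text above does not specify a storage engine or a rule format.

```python
import sqlite3


def split_by_rule(lines, rule):
    """Segment log lines according to a rule dict.

    'lines_per_chunk' is a hypothetical rule key used only in this sketch.
    """
    size = rule["lines_per_chunk"]
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def write_sequentially(conn, chunks):
    """Write each preprocessed chunk into the database, one chunk at a time."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preprocessed (chunk_id INTEGER, line TEXT)"
    )
    for chunk_id, chunk in enumerate(chunks):
        conn.executemany(
            "INSERT INTO preprocessed VALUES (?, ?)",
            [(chunk_id, line) for line in chunk],
        )
        conn.commit()  # one commit per chunk keeps each write small


# S11: acquire the log (made-up sample lines stand in for a server log).
lines = [f"2014-11-28T10:00:0{i} GET /page{i}" for i in range(5)]
# S13 + S15: read the rule and segment the log.
chunks = split_by_rule(lines, {"lines_per_chunk": 2})
# S17: write the chunks in sequence.
conn = sqlite3.connect(":memory:")
write_sequentially(conn, chunks)
```

Committing once per chunk is a design choice of this sketch: it bounds the amount of data held in a single transaction, which is the same motivation the text gives for segmenting the file in the first place.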
Preferably, in the embodiment provided by this application, step S17 of writing the at least two preprocessed log files into the preprocessing database in sequence comprises:
Step S171: read the file attributes of the log file, where the file attributes include at least a log record start time and a log record end time.
Step S173: modify the time configuration information of the preprocessing database according to the file attributes.
Step S175: read the preprocessed log files that correspond to the time configuration information.
Step S177: write the preprocessed log files that correspond to the time configuration information into the preprocessing database in sequence.
Specifically, in steps S171 to S177, the log record start time and log record end time of the log file are obtained. The time configuration information of the preprocessing database is modified according to the log record start time and log record end time; this time configuration information may include a preprocessing start time and a preprocessing end time. The preprocessing database determines, according to the time configuration information, which preprocessed log files match it. The preprocessed log files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has preprocessed log file attributes, which include at least a preprocessed log record start time and a preprocessed log record end time. The preprocessed log record start time can be determined from the time of the first log record in the preprocessed log file, and the preprocessed log record end time can be determined from the time of the last log record in the preprocessed log file.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose preprocessed log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record start time is greater than or equal to the preprocessing start time and whose preprocessed log record end time is less than or equal to the preprocessing end time. Those skilled in the art can choose among these according to actual needs, and the choice is not elaborated here.
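The three matching criteria described above can be sketched as a single predicate. This is a minimal sketch under assumptions: the `Chunk` class, the `mode` parameter and its values `"start"`, `"end"` and `"both"` are names introduced here for illustration and do not appear in the source.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Chunk:
    """A preprocessed log file with its record time attributes (hypothetical type)."""
    name: str
    start: datetime  # time of the first log record in the file
    end: datetime    # time of the last log record in the file


def matches(chunk, pre_start, pre_end, mode="both"):
    """Return True if the chunk matches the preprocessing time window.

    mode="start": the chunk's start time lies inside the window.
    mode="end":   the chunk's end time lies inside the window.
    mode="both":  the chunk starts no earlier than the window start and
                  ends no later than the window end.
    """
    if mode == "start":
        return pre_start <= chunk.start <= pre_end
    if mode == "end":
        return pre_start <= chunk.end <= pre_end
    if mode == "both":
        return chunk.start >= pre_start and chunk.end <= pre_end
    raise ValueError(f"unknown mode: {mode}")


window = (datetime(2014, 11, 28, 10, 0), datetime(2014, 11, 28, 11, 0))
straddles_start = Chunk("M1", datetime(2014, 11, 28, 9, 50), datetime(2014, 11, 28, 10, 10))
inside = Chunk("M2", datetime(2014, 11, 28, 10, 10), datetime(2014, 11, 28, 10, 50))
```

A chunk that straddles the window boundary, such as `straddles_start`, is accepted under only one of the three modes, which is why the choice of criterion matters when consecutive windows must not load the same file twice.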
In some scenarios, to reduce the storage that log files occupy on the database side, the original log file may be deleted after it has been segmented into preprocessed log files. If the log record start time and log record end time of that log file are needed later, they can be read out and recorded before the original log file is deleted, so that subsequent steps can retrieve them.
Preferably, as shown in Fig. 2, in the embodiment provided by this application, after step S17 of writing the preprocessed log files into the preprocessing database in sequence, the method further comprises:
Step S19: calculate the analysis period according to the time configuration information.
Step S21: read the preprocessed log files in the preprocessing database according to the analysis period.
Step S23: analyze the preprocessed log files to obtain a standard log data table.
Step S25: write the standard log data table into the log database.
Specifically, in steps S19 to S25, the preprocessed log files in the preprocessing database are processed further. The analysis period to be processed is calculated from the preprocessing start time and preprocessing end time in the time configuration information. Because the preprocessing database holds preprocessed log files from many different times, the time interval defined by the analysis period is used to read only those preprocessed log files in the preprocessing database that fall within it. These preprocessed log files are analyzed to obtain a standard log data table that can be stored directly in a database and used for analysis. The standard log data table is stored in the log database so that the log content can be read and analyzed at any time.
In practice, because preprocessed log files are processed according to the analysis period, the preprocessed log files in the preprocessing database can be processed according to the load on the server. The analysis period can also be modified in real time so that the preprocessed log files in the preprocessing database are processed in real time. Alternatively, a predetermined time interval can be used to schedule the processing of the preprocessed log files evenly after each log file is generated. In this way, simply modifying the analysis period is enough to control the processing of log files according to actual conditions.
The analysis period is the time interval defined by a processing start time and a processing end time. The log record start time can be used directly as the processing start time of the analysis period, and the log record end time as its processing end time, so that each log file is analyzed promptly.
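The simplest case described above, in which the log's own recording interval is used as the analysis period, might look like the following sketch. The names `Period`, `analysis_period` and `select_rows`, and the `(timestamp, payload)` row layout, are assumptions introduced for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class Period(NamedTuple):
    """An analysis period defined by a processing start and end time."""
    start: datetime
    end: datetime


def analysis_period(log_start, log_end):
    """Use the log record start/end times directly as the analysis period."""
    if log_end < log_start:
        raise ValueError("end time precedes start time")
    return Period(log_start, log_end)


def select_rows(rows, period):
    """Keep (timestamp, payload) rows whose timestamp lies inside the period."""
    return [row for row in rows if period.start <= row[0] <= period.end]


period = analysis_period(datetime(2014, 11, 28, 10), datetime(2014, 11, 28, 11))
rows = [
    (datetime(2014, 11, 28, 9, 59), "early"),
    (datetime(2014, 11, 28, 10, 30), "inside"),
    (datetime(2014, 11, 28, 11, 1), "late"),
]
selected = select_rows(rows, period)
```

Because the period is just a pair of timestamps, changing which data gets analyzed only requires constructing a different `Period`, which mirrors the point that modifying the analysis period is enough to control processing.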
Preferably, in the embodiment provided by this application, step S23 of analyzing the preprocessed log files to obtain the standard log data table comprises:
Step S231: extract the log content of the preprocessed log files to obtain target data.
Step S233: convert the target data to the preset target data type to obtain a processing result.
Step S235: aggregate the processing results to generate the standard log data table.
Specifically, in steps S231 to S235, the preprocessed log files are read, and their content is extracted and converted into data that can be stored in a database, yielding the target data. The target data is then converted to a unified data type. The converted target data is aggregated to generate the standard log data table, which is stored in the log database so that it can be retrieved at any time.
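Steps S231 to S235 can be sketched as an extract, convert and aggregate pipeline. The whitespace-separated log line format and the `SCHEMA` column list below are assumptions for illustration; the source does not define a log format or a target type list.

```python
from datetime import datetime

# Hypothetical target schema: column name and the type each field is converted to.
SCHEMA = (
    ("time", datetime.fromisoformat),
    ("status", int),
    ("path", str),
    ("bytes", int),
)


def extract(line):
    """S231: pull the raw fields out of one log line (whitespace-separated here)."""
    return line.split()


def convert(fields):
    """S233: convert each raw field to its preset target type."""
    return {name: cast(raw) for (name, cast), raw in zip(SCHEMA, fields)}


def build_table(lines):
    """S235: aggregate the converted records into one table (a list of rows)."""
    return [convert(extract(line)) for line in lines if line.strip()]


sample = [
    "2014-11-28T10:00:01 200 /index.html 512",
    "",
    "2014-11-28T10:00:02 404 /missing 0",
]
table = build_table(sample)
```

Keeping the schema in one place means every row ends up with the same column types, which is the "unified data type" property that makes the table directly storable.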
Preferably, as shown in Fig. 3, in the embodiment provided by this application, before step S13 of reading the segmentation rule of the log file, the method further comprises:
Step S121: obtain the size of the log file.
Step S123: read the preset capacity threshold, where the capacity threshold is used to judge the size of the log file.
Step S125: determine, according to the size of the log file and the capacity threshold, whether to segment the log file;
where, when the size of the log file is greater than the capacity threshold, it is determined that the log file is to be segmented;
and when the size of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be segmented.
Specifically, in steps S121 to S125, the size of the log file is checked. When the size of the log file is greater than the preset capacity threshold, the log file is segmented. When the size of the log file is less than or equal to the preset capacity threshold, the log file is not segmented.
In practice, website traffic is low at night or during other idle periods, so the log files produced are relatively small, and processing such small log files does not place a heavy burden on the system. A decision step is therefore added here to determine whether the log file needs to be segmented.
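The decision in step S125 reduces to a single comparison. The sketch below follows the rule as stated in the step itself (segment only when the size is strictly greater than the threshold); the function names and the 20 GiB default are assumptions, the latter loosely borrowed from the 20 GB example given later in the text.

```python
import os

GIB = 1024 ** 3


def should_split(size_bytes, threshold_bytes):
    """S125: segment only when the log is strictly larger than the threshold."""
    return size_bytes > threshold_bytes


def should_split_file(path, threshold_bytes=20 * GIB):
    """S121 + S123 + S125 for a file on disk; the default threshold is an assumption."""
    return should_split(os.path.getsize(path), threshold_bytes)
```

For example, `should_split(21 * GIB, 20 * GIB)` is true, while a log of exactly 20 GiB is left unsegmented.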
Preferably, in the embodiment provided by this application, the segmentation rules for the log file include at least a fixed-number segmentation method and a fixed-capacity segmentation method.
Preferably, in the embodiment provided by this application, when the segmentation rule is the fixed-number segmentation method, step S15 of segmenting the log file according to the segmentation rule to obtain at least two preprocessed log files comprises:
Step S151a: read the preset number of segments.
Step S153a: divide the log file evenly according to the number of segments to obtain a fixed number of preprocessed log files.
Specifically, in steps S151a and S153a, the log file is segmented using the fixed-number segmentation method. The preset number of segments is read, and the whole log file is divided evenly into that number of preprocessed log files.
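A minimal sketch of the fixed-number method, assuming the log is held as a list of lines: when the line count is not divisible by the number of segments, the leftover lines are spread over the first segments, which is one reasonable reading of "evenly" and not something the source specifies. The function name is hypothetical.

```python
def split_fixed_count(lines, count):
    """Divide lines into exactly `count` pieces whose sizes differ by at most one."""
    if count < 2:
        raise ValueError("the method requires at least two segments")
    base, extra = divmod(len(lines), count)
    chunks, pos = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more line
        chunks.append(lines[pos:pos + size])
        pos += size
    return chunks


parts = split_fixed_count(list(range(10)), 3)
```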
Preferably, in the embodiment provided by this application, when the segmentation rule is the fixed-capacity segmentation method, step S15 of segmenting the log file according to the segmentation rule to obtain at least two preprocessed log files comprises:
Step S151b: read the preset segment capacity.
Step S153b: segment the log file according to the segment capacity to obtain several preprocessed log files of equal capacity.
Specifically, in steps S151b and S153b, the log file is segmented using the fixed-capacity segmentation method. The preset size of a single preprocessed log file is read, and the log file is segmented sequentially into preprocessed log files of equal capacity. In practice, for most log files the capacity that remains at the end of the segmentation is smaller than the preset segment capacity; in that case, the remaining log content can be written out as one more preprocessed log file.
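The fixed-capacity method, including the undersized final piece, can be sketched on raw bytes as follows. The function name is hypothetical, and this sketch cuts at exact byte offsets; a practical tool would probably also align each cut to a record boundary, which is an assumption not covered by the text.

```python
def split_fixed_capacity(data, capacity):
    """Cut `data` into pieces of `capacity` bytes; the last piece holds the remainder."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return [data[i:i + capacity] for i in range(0, len(data), capacity)]


pieces = split_fixed_capacity(b"abcdefghij", 4)
```

Here the 10-byte input yields two full 4-byte pieces and a final 2-byte piece, matching the handling of leftover capacity described above.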
In a practical application, the process can be divided into the following steps:
Step one: segment the log file. Use a log segmentation tool to divide a large log file (for example, a log file larger than 20 GB) into several preprocessed log files M1, M2 ... Mn.
Step two: modify the first import time. Stop the system's logreader task schedule and schedulework task schedule, and modify two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to the preprocessing start time and preprocessing end time in the logreader configuration file, read the corresponding small segmented preprocessed log files into the preprocessing database (receiver database) one by one.
Step three: modify the second import time.
Configure the time at which the preprocessed log files are analyzed (postprocess time), setting it to the nearest whole hour (for example, 10:00, 11:00 or 12:00) later than the time at which all of the segmented preprocessed log files have been read in. Run the step that analyzes the preprocessed log files (postprocess). After the above steps are complete, store the data using data warehouse technology (ETL).
Step four: read log files in a loop.
Repeat steps one to three for the other log files until no large log file can be found.
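The postprocess time chosen in step three, the nearest whole hour later than the moment the last preprocessed log file was read in, can be computed as in the sketch below. The function name is hypothetical, and treating a finish time that falls exactly on the hour as needing the following hour is an assumption based on the "later than" wording.

```python
from datetime import datetime, timedelta


def next_whole_hour(finished_at):
    """Return the first whole hour strictly later than `finished_at`."""
    floor = finished_at.replace(minute=0, second=0, microsecond=0)
    return floor + timedelta(hours=1)


postprocess_time = next_whole_hour(datetime(2014, 11, 28, 9, 42, 10))
```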
After segmenting the log file, the present invention controls the points in time at which the preprocessed log files are analyzed. This speeds up insertion into the log database and addresses the memory and performance shortfalls of reading an entire log in a single pass.
Embodiment 2
An embodiment of the present invention further provides a database log processing device. As shown in Fig. 4, the device may comprise a first acquisition module 31, a first reading module 33, a segmentation module 35 and a first storage module 37.
The first acquisition module 31 is used to acquire a log file.
Specifically, the first acquisition module 31 acquires the log file generated by the server.
The first reading module 33 is used to read the segmentation rule of the log file.
Specifically, the first reading module 33 reads the preset segmentation rule used to segment the log file.
The segmentation module 35 is used to segment the log file according to the segmentation rule to obtain at least two preprocessed log files.
Specifically, the segmentation module 35 segments the whole acquired log file according to the segmentation rule that was read, dividing it into two or more preprocessed log files for subsequent use.
The first storage module 37 is used to write the at least two preprocessed log files into the preprocessing database in sequence.
Specifically, the first storage module 37 reads the preprocessed log files produced by the segmentation one by one and writes them in sequence into the preprocessing database used to store preprocessed log files.
Through the first acquisition module 31, the first reading module 33, the segmentation module 35 and the first storage module 37, a log file that was originally a single very large file is divided into several smaller preprocessed log files, and the preprocessed log files obtained by segmentation are stored in the preprocessing database.
In practice, reading a single huge log file requires substantial hardware resources, such as memory and disk cache capacity, and is also limited by disk read/write speed, so reading is very inefficient. The log file is therefore segmented into several small preprocessed log files. Because each of these files is small, a computer can read it quickly; each preprocessed log file is read in turn and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeated read and write operations on the entire log file make preprocessing of website logs time-consuming and log processing inefficient, and it improves the speed and efficiency of log file processing.
Further, when the first storage module 37 writes the at least two preprocessed log files into the preprocessing database in sequence, the process also includes the following:
First, the file attributes of the log file are read, where the file attributes include at least a log record start time and a log record end time.
Next, the time configuration information of the preprocessing database is modified according to the file attributes.
Then, the preprocessed log files that correspond to the time configuration information are read.
Finally, the preprocessed log files that correspond to the time configuration information are written into the preprocessing database in sequence.
Specifically, the first storage module 37 obtains the log record start time and log record end time of the log file. The time configuration information of the preprocessing database is modified according to the log record start time and log record end time; this time configuration information may include a preprocessing start time and a preprocessing end time. The preprocessing database determines, according to the time configuration information, which preprocessed log files match it. The preprocessed log files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has preprocessed log file attributes, which include at least a preprocessed log record start time and a preprocessed log record end time. The preprocessed log record start time can be determined from the time of the first log record in the preprocessed log file, and the preprocessed log record end time can be determined from the time of the last log record in the preprocessed log file.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose preprocessed log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose preprocessed log record start time is greater than or equal to the preprocessing start time and whose preprocessed log record end time is less than or equal to the preprocessing end time. Those skilled in the art can choose among these according to actual needs, and the choice is not elaborated here.
In some scenarios, to reduce the storage that log files occupy on the database side, the original log file may be deleted after it has been segmented into preprocessed log files. If the log record start time and log record end time of that log file are needed later, they can be read out and recorded before the original log file is deleted, so that subsequent steps can retrieve them.
Preferably, as shown in Figure 5, in the embodiment that the application provides, device also comprises: computing module 39, second read module 41, processing module 43 and the second memory module 45.
Wherein, computing module 39, for according to time configuration information, calculates the analyzing and processing period;
Second read module 41, for according to the analyzing and processing period, reads the pre-service journal file in preprocessed data storehouse;
Processing module 43, for analyzing pre-service journal file, obtains standard logs tables of data;
Second memory module 45, for writing log database by standard logs tables of data.
Concrete, by computing module 39, second read module 41, processing module 43 and the second memory module 45, the pre-service journal file in preprocessed data storehouse is further processed.According to the pre-service initial time in time configuration information, pre-service termination time, calculate the analyzing and processing period needing to carry out analyzing and processing.Because there is the pre-service journal file of each time in preprocessed data storehouse, so need the time interval determined according to the analyzing and processing period, read the pre-service journal file in the preprocessed data storehouse be in this time interval.These pre-service journal files are carried out analyzing and processing, obtain may be used for directly being stored in the middle of database, for the standard logs tables of data analyzed, by the standard logs tables of data that obtains stored in the middle of log database, to log content, it reads, analyzes at any time.
In practice, processing the preprocessed log files according to the analysis processing period allows the files in the preprocessing database to be processed according to the load on the server. The analysis processing period can also be modified in real time, so that the preprocessed log files in the preprocessing database are processed in real time. Alternatively, a predetermined time interval can be used to distribute the processing of the preprocessed log files evenly, so that it takes place after each log file is generated. In this way, modifying the analysis processing period alone is enough to control how the log files are analyzed according to actual conditions.
The analysis processing period is the time interval defined by a processing start time and a processing end time. The log record start time can be used directly as the processing start time of the analysis processing period, and the log record end time as its processing end time, so that each log file is analyzed promptly.
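The selection of preprocessed log files by analysis processing period can be illustrated with a minimal Python sketch. The names PreprocessedLog, analysis_period, and select_logs are hypothetical and do not appear in the application; the sketch also assumes that a file belongs to the period only when its whole record range lies inside the interval, which the description does not state explicitly.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class PreprocessedLog:
    """Hypothetical record for one preprocessed log file."""
    name: str
    start: datetime  # log record start time
    end: datetime    # log record end time


def analysis_period(start: datetime, end: datetime) -> Tuple[datetime, datetime]:
    """Build the analysis processing period from a start and an end time."""
    if end < start:
        raise ValueError("processing end time precedes processing start time")
    return (start, end)


def select_logs(logs: List[PreprocessedLog],
                period: Tuple[datetime, datetime]) -> List[PreprocessedLog]:
    """Keep the files whose record range lies entirely inside the period."""
    begin, finish = period
    return [log for log in logs if log.start >= begin and log.end <= finish]
```

Changing only the period passed to select_logs changes which files are handed to the analysis step, which mirrors the point that the processing can be controlled by modifying the analysis processing period.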
Further, the process in which the above processing module 43 analyzes the preprocessed log files to obtain the standard log data table also comprises:
First, extracting the log content of the preprocessed log files to obtain target data.
Then, converting the target data to a preset target data type to obtain processing results.
Finally, aggregating the processing results to generate the standard log data table.
Specifically, the above processing module 43 reads the preprocessed log files and extracts their content, converting it into data that can be stored in a database, to obtain the target data. The target data is then converted to a unified data type. The converted target data is aggregated to generate the standard log data table, which is stored in the log database so that it can be retrieved at any time.
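The three stages above (extract, convert, aggregate) can be sketched as follows. The application does not specify a log line format or a set of target data types, so the space-separated layout "<ISO timestamp> <status code> <URL>" and the TARGET_TYPES mapping below are assumptions made purely for illustration.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, List

# Assumed target data types, one converter per field (hypothetical).
TARGET_TYPES: Dict[str, Callable[[str], object]] = {
    "time": datetime.fromisoformat,
    "status": int,
    "url": str,
}


def extract(line: str) -> Dict[str, str]:
    """Extract raw target data from one log line in the assumed format."""
    time, status, url = line.strip().split(" ", 2)
    return {"time": time, "status": status, "url": url}


def convert(record: Dict[str, str]) -> Dict[str, object]:
    """Convert every field of a record to its preset target data type."""
    return {field: TARGET_TYPES[field](value) for field, value in record.items()}


def build_table(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Aggregate the converted records into rows of a standard log data table."""
    return [convert(extract(line)) for line in lines if line.strip()]
```

Each returned row has uniformly typed fields, so it could be passed to a database insert without further parsing.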
Preferably, as shown in Figure 6, in the embodiment provided by the present application, the above device further comprises: a second acquisition module 321, a third read module 323, and a judging module 325.
The second acquisition module 321 is configured to obtain the capacity of the log file.
The third read module 323 is configured to read a preset capacity threshold, wherein the capacity threshold is used to judge the size of the log file.
The judging module 325 is configured to judge, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
Specifically, the above second acquisition module 321, third read module 323, and judging module 325 judge the capacity of the log file. When the capacity of the log file is greater than the preset capacity threshold, the log file is split. When the capacity of the log file is less than or equal to the preset capacity threshold, the log file is not split.
In practice, website traffic is low at night and during other idle periods, so the log files produced are relatively small. Processing such small log files does not place a heavy burden on the system. A judging step is therefore added here to decide whether a log file needs to be split.
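The judging step amounts to a single comparison. In the sketch below, the function names are hypothetical, sizes are taken to be in bytes, and os.path.getsize stands in for however the second acquisition module obtains the capacity.

```python
import os


def should_split(log_size: int, capacity_threshold: int) -> bool:
    """Split only when the log file is strictly larger than the threshold."""
    return log_size > capacity_threshold


def should_split_file(path: str, capacity_threshold: int) -> bool:
    """Same decision, with the size read from the file system in bytes."""
    return should_split(os.path.getsize(path), capacity_threshold)
```

A file whose size exactly equals the threshold is left unsplit, matching the "less than or equal to" branch.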
Further, the splitting rule of the log file comprises at least: a fixed-count splitting method and a fixed-capacity splitting method.
Further, when the splitting rule is the fixed-count splitting method, the step in which the above splitting module 35 splits the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Step 1: read the preset split count.
Step 2: split the log file evenly according to the split count to obtain a fixed number of preprocessed log files.
Specifically, through steps 1 and 2 above, the log file is split according to the fixed-count splitting method: the preset split count is read, and every log file is divided evenly into that number of preprocessed log files.
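A minimal sketch of the fixed-count splitting method follows. The application does not say whether the even split is measured in bytes or in records; this sketch assumes the log has already been read as a list of lines and splits on line boundaries so that no record is cut in half. The function name split_fixed_count is hypothetical.

```python
from typing import List


def split_fixed_count(lines: List[str], count: int) -> List[List[str]]:
    """Divide the log lines into `count` chunks whose sizes differ by at most one."""
    if count < 2:
        raise ValueError("splitting must yield at least two preprocessed log files")
    base, extra = divmod(len(lines), count)
    chunks: List[List[str]] = []
    start = 0
    for index in range(count):
        # The first `extra` chunks take one additional line each.
        size = base + (1 if index < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks
```

For example, seven lines split with a count of three give chunks of three, two, and two lines, in the original order.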
Further, in the embodiment provided by the present application, when the splitting rule is the fixed-capacity splitting method, the step in which the above splitting module 35 splits the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Step 1: read the preset split capacity.
Step 2: split the log file according to the split capacity to obtain several preprocessed log files of identical capacity.
Specifically, through steps 1 and 2 above, the log file is split according to the fixed-capacity splitting method. The preset single-file size for preprocessed log files is read, and the log file is split sequentially into preprocessed log files of equal capacity. In practice, for most log files the capacity remaining at the end of splitting is smaller than the preset split capacity; in that case, the remaining log content is written out as one more preprocessed log file.
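The fixed-capacity splitting method, including the smaller final piece, can be sketched as follows. The function name split_fixed_capacity is hypothetical, and the sketch works on raw bytes for brevity; a real tool would presumably move each cut to the nearest record boundary so that no log entry is divided, which this sketch does not attempt.

```python
from typing import List


def split_fixed_capacity(data: bytes, capacity: int) -> List[bytes]:
    """Cut the log content into pieces of `capacity` bytes, in order.

    The last piece holds whatever remains and may be smaller than `capacity`.
    """
    if capacity <= 0:
        raise ValueError("split capacity must be positive")
    return [data[offset:offset + capacity] for offset in range(0, len(data), capacity)]
```

Ten bytes with a split capacity of four give two full pieces and one two-byte remainder, corresponding to the extra preprocessed log file described above.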
In combination with a practical application, the process can be divided into the following steps:
Step 1: split the log file. Use a log splitting tool to split a large log file (for example, a log file larger than 20 GB) into several preprocessed log files M1, M2 ... Mn.
Step 2: modify the first entry time. Stop the system's logreader task schedule and its schedulework task schedule, and modify two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to the preprocessing start time and preprocessing end time in the logreader configuration file, the corresponding small preprocessed log files produced by splitting are read one by one into the preprocessing database (receiver database).
Step 3: modify the second entry time.
Configure the time for analyzing the preprocessed log files (postprocess time), setting it to the nearest whole hour (for example, 10:00, 11:00, 12:00) later than the time at which all of the split preprocessed log files have been read in. Run the step that analyzes the preprocessed log files (postprocess). After the above steps are complete, store the results using data warehouse technology (ETL).
Step 4: read log files in a loop.
Repeat steps 1 to 3 for the other log files until no large log file can be found.
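The four steps can be summarized in a short sketch. Everything here is hypothetical: the function names, the callables split, load, and schedule_analysis (which stand in for the log splitting tool, the logreader, and the postprocess step), and the use of a dictionary of file sizes as input. Only the "nearest whole hour later than the time reading finished" rule and the loop over large files come from the steps above.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def next_whole_hour(read_finished: datetime) -> datetime:
    """Return the nearest whole hour strictly later than `read_finished`."""
    floor = read_finished.replace(minute=0, second=0, microsecond=0)
    return floor + timedelta(hours=1)


def process_large_logs(
    log_sizes: Dict[str, int],
    threshold: int,
    split: Callable[[str], List[str]],
    load: Callable[[str], None],
    schedule_analysis: Callable[[List[str], datetime], None],
    now: Callable[[], datetime] = datetime.now,
) -> List[str]:
    """Run steps 1 to 3 for every log file larger than `threshold`."""
    handled = []
    for name, size in log_sizes.items():
        if size <= threshold:
            continue  # not a large log file, so it is not split
        pieces = split(name)      # step 1: split into M1, M2 ... Mn
        for piece in pieces:      # step 2: read pieces in one by one
            load(piece)
        # step 3: analyze at the next whole hour after reading has finished
        schedule_analysis(pieces, next_whole_hour(now()))
        handled.append(name)
    return handled                # step 4: the loop ends when no large file is left
```

If reading finishes at 09:42, the analysis time becomes 10:00; if it finishes at exactly 10:00, the sketch picks 11:00, because the configured time must be later than the finishing time.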
After splitting the log file, the present invention controls the time points at which the preprocessed log files are analyzed. This speeds up insertion into the log database and addresses the memory and performance shortfalls of reading an entire log in a single pass.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be made through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, read-only memory (ROM), random access memory (RAM), a portable hard drive, a magnetic disk, an optical disc, and other media capable of storing program code.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (11)

1. A method for processing database logs, characterized by comprising:
Obtaining a log file;
Reading a splitting rule for the log file;
Splitting the log file according to the splitting rule to obtain at least two preprocessed log files;
Writing the at least two preprocessed log files into a preprocessing database in sequence.
2. The method according to claim 1, characterized in that the step of writing the at least two preprocessed log files into the preprocessing database in sequence comprises:
Reading file attributes of the log file, wherein the file attributes comprise at least: a log record start time and a log record end time;
Modifying time configuration information of the preprocessing database according to the file attributes;
Reading, according to the time configuration information, the preprocessed log files corresponding to the time configuration information;
Writing the preprocessed log files corresponding to the time configuration information into the preprocessing database in sequence.
3. The method according to claim 2, characterized in that, after the preprocessed log files are written into the database in sequence, the method further comprises:
Calculating an analysis processing period according to the time configuration information;
Reading the preprocessed log files in the preprocessing database according to the analysis processing period;
Analyzing the preprocessed log files to obtain a standard log data table;
Writing the standard log data table into a log database.
4. The method according to claim 3, characterized in that the step of analyzing the preprocessed log files to obtain the standard log data table comprises:
Extracting the log content of the preprocessed log files to obtain target data;
Converting the target data to a preset target data type to obtain processing results;
Aggregating the processing results to generate the standard log data table.
5. The method according to claim 3, characterized in that, before the splitting rule for the log file is read, the method comprises:
Obtaining the capacity of the log file;
Reading a preset capacity threshold;
Judging, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
6. The method according to any one of claims 1 to 5, characterized in that the splitting rule comprises at least: a fixed-count splitting method and a fixed-capacity splitting method.
7. The method according to claim 6, characterized in that, when the splitting rule is the fixed-count splitting method, the step of splitting the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Reading a preset split count;
Splitting the log file evenly according to the split count to obtain a fixed number of the preprocessed log files.
8. The method according to claim 6, characterized in that, when the splitting rule is the fixed-capacity splitting method, the step of splitting the log file according to the splitting rule to obtain at least two preprocessed log files comprises:
Reading a preset split capacity;
Splitting the log file according to the split capacity to obtain several of the preprocessed log files of identical capacity.
9. A device for processing database logs, characterized by comprising:
A first acquisition module, configured to obtain a log file;
A first read module, configured to read a splitting rule for the log file;
A splitting module, configured to split the log file according to the splitting rule to obtain at least two preprocessed log files;
A first storage module, configured to write the at least two preprocessed log files into a preprocessing database in sequence.
10. The device according to claim 9, characterized in that the device further comprises:
A computing module, configured to calculate an analysis processing period according to time configuration information;
A second read module, configured to read the preprocessed log files in the preprocessing database according to the analysis processing period;
A processing module, configured to analyze the preprocessed log files to obtain a standard log data table;
A second storage module, configured to write the standard log data table into a log database.
11. The device according to claim 10, characterized in that the device further comprises:
A second acquisition module, configured to obtain the capacity of the log file;
A third read module, configured to read a preset capacity threshold, wherein the capacity threshold is used to judge the size of the log file;
A judging module, configured to judge, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
CN201410709417.5A 2014-11-27 2014-11-27 The processing method and processing device of database journal Active CN104391954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410709417.5A CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410709417.5A CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Publications (2)

Publication Number Publication Date
CN104391954A true CN104391954A (en) 2015-03-04
CN104391954B CN104391954B (en) 2019-04-09

Family

ID=52609858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410709417.5A Active CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Country Status (1)

Country Link
CN (1) CN104391954B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751478A (en) * 2010-02-20 2010-06-23 浪潮(北京)电子信息产业有限公司 File backup method and system
US20110258242A1 (en) * 2010-04-16 2011-10-20 Salesforce.Com, Inc. Methods and systems for appending data to large data volumes in a multi-tenant store
CN103178982A (en) * 2011-12-23 2013-06-26 阿里巴巴集团控股有限公司 Method and device for analyzing log
CN102724063A (en) * 2012-05-11 2012-10-10 北京邮电大学 Log collection server, data packet delivering and log clustering methods and network
CN103914485A (en) * 2013-01-07 2014-07-09 上海宝信软件股份有限公司 System and method for remotely collecting, retrieving and displaying application system logs
CN103324696A (en) * 2013-06-06 2013-09-25 合一信息技术(北京)有限公司 Collecting and statistical analysis system and method for data logs
CN103593422A (en) * 2013-11-01 2014-02-19 国云科技股份有限公司 Virtual access management method of heterogeneous database
CN103595571A (en) * 2013-11-20 2014-02-19 北京国双科技有限公司 Preprocessing method, device and system for website access logs
CN104035729A (en) * 2014-05-22 2014-09-10 中国科学院计算技术研究所 Block device thin-provisioning method for log mapping
CN104050268A (en) * 2014-06-23 2014-09-17 西北工业大学 Continuous data protection and recovery method with log space adjustable online

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9917758B2 (en) * 2015-03-25 2018-03-13 International Business Machines Corporation Optimizing log analysis in SaaS environments
US10171329B2 (en) 2015-03-25 2019-01-01 International Business Machines Corporation Optimizing log analysis in SaaS environments
US20160285728A1 (en) * 2015-03-25 2016-09-29 International Business Machines Corporation Optimizing log analysis in saas environments
CN106844630A (en) * 2017-01-20 2017-06-13 山东中创软件商用中间件股份有限公司 A kind of application server sql log recording methods and its device
CN106815363A (en) * 2017-01-24 2017-06-09 郑州云海信息技术有限公司 One kind rotates management method and device based on linux daily records
CN106897431B (en) * 2017-02-27 2021-06-11 郑州云海信息技术有限公司 Log export method and system
CN106897431A (en) * 2017-02-27 2017-06-27 郑州云海信息技术有限公司 A kind of daily record deriving method and system
CN107818041A (en) * 2017-10-24 2018-03-20 南京航空航天大学 SECONDO system files read and write inspection software
CN108304305A (en) * 2018-01-11 2018-07-20 北京潘达互娱科技有限公司 The method and apparatus that journal file is read
CN109299052A (en) * 2018-09-03 2019-02-01 平安普惠企业管理有限公司 Log cutting method, device, computer equipment and storage medium
CN109299052B (en) * 2018-09-03 2024-03-15 珠海泰合科技有限公司 Log cutting method, device, computer equipment and storage medium
CN108989471A (en) * 2018-09-05 2018-12-11 郑州云海信息技术有限公司 The management method and device of log in network system
CN110209643A (en) * 2019-04-23 2019-09-06 深圳壹账通智能科技有限公司 A kind of data processing method and device
CN111045885A (en) * 2019-11-11 2020-04-21 网联清算有限公司 Database log file processing method and device and computer equipment
CN111061690A (en) * 2019-11-22 2020-04-24 武汉达梦数据库有限公司 RAC-based database log file reading method and device
CN111061690B (en) * 2019-11-22 2023-08-22 武汉达梦数据库股份有限公司 RAC-based database log file reading method and device
CN113656358A (en) * 2020-05-12 2021-11-16 网联清算有限公司 Database log file processing method and system
WO2021237704A1 (en) * 2020-05-29 2021-12-02 深圳市欢太科技有限公司 Data synchronization method and related device
CN112434949A (en) * 2020-11-25 2021-03-02 平安普惠企业管理有限公司 Service early warning processing method, device, equipment and medium based on artificial intelligence
CN113687974A (en) * 2021-10-22 2021-11-23 飞狐信息技术(天津)有限公司 Client log processing method and device and computer equipment
CN115225471A (en) * 2022-07-15 2022-10-21 中国工商银行股份有限公司 Log analysis method and device

Also Published As

Publication number Publication date
CN104391954B (en) 2019-04-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Database log processing method and device

Effective date of registration: 20190531

Granted publication date: 20190409

Pledgee: Shenzhen Black Horse World Investment Consulting Co.,Ltd.

Pledgor: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Registration number: 2019990000503

PE01 Entry into force of the registration of the contract for pledge of patent right
CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Patentee before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

CP02 Change in the address of a patent holder
PP01 Preservation of patent right

Effective date of registration: 20240604

Granted publication date: 20190409

PP01 Preservation of patent right