CN104391954B - Processing method and processing device for database logs - Google Patents

Info

Publication number
CN104391954B
Authority
CN
China
Prior art keywords
log file
preprocessing
log
file
capacity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410709417.5A
Other languages
Chinese (zh)
Other versions
CN104391954A (en
Inventor
戴飞
张同欣
刘凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201410709417.5A priority Critical patent/CN104391954B/en
Publication of CN104391954A publication Critical patent/CN104391954A/en
Application granted granted Critical
Publication of CN104391954B publication Critical patent/CN104391954B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses a method and a device for processing database logs. The method comprises: obtaining a log file; reading a splitting rule for the log file; splitting the log file according to the splitting rule to obtain at least two preprocessed log files; and writing the at least two preprocessed log files into a preprocessing database in sequence. The invention addresses the problem in the prior art that repeatedly reading and writing an entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient.

Description

Processing method and processing device for database logs
Technical field
The present invention relates to the computer field, and in particular to a method and a device for processing database logs.
Background art
With the development of the Internet, website traffic and data volume have both grown rapidly, and a single server can no longer meet application needs. A common approach at present is therefore to balance load across a computer cluster: one or more front-end load servers distribute the load to a group of back-end servers, and the back-end servers receive the requests and record logs.
As website traffic grows, the log files in which servers record the access requests they receive keep growing as well. The time allowed for processing those log files, however, has not been relaxed. How to improve the efficiency of log file processing has therefore become a problem that must be faced.
The usual log processing method is to read the raw log file directly and then analyze the data in it. Because the log file itself is very large and reading is limited by disk read/write speed, reading the log file is very slow. Moreover, all of the raw logs must be read again each time a different analysis is run on the log file, which makes processing very inefficient.
No effective solution has yet been proposed for the problem in the prior art that repeatedly reading and writing the entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient.
Summary of the invention
The main purpose of the present invention is to provide a method and a device for processing database logs, so as to solve the problem in the prior art that repeatedly reading and writing the entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient.
To achieve the above purpose, according to one aspect of the embodiments of the present invention, a method for processing database logs is provided. The method includes: obtaining a log file; reading a splitting rule for the log file; splitting the log file according to the splitting rule to obtain at least two preprocessed log files; and writing the at least two preprocessed log files into a preprocessing database in sequence.
To achieve the above purpose, according to another aspect of the embodiments of the present invention, a device for processing database logs is provided. The device includes: a first obtaining module, for obtaining a log file; a first reading module, for reading a splitting rule for the log file; a splitting module, for splitting the log file according to the splitting rule to obtain at least two preprocessed log files; and a first storage module, for writing the at least two preprocessed log files into a preprocessing database in sequence.
According to the embodiments of the invention, a log file is obtained; a splitting rule for the log file is read; the log file is split according to the splitting rule to obtain at least two preprocessed log files; and the at least two preprocessed log files are written into a preprocessing database in sequence. This solves the problem in the prior art that repeatedly reading and writing the entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient, and achieves the effect of improving the speed and efficiency of log file processing.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method for processing database logs according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a preferred method for processing database logs according to Embodiment 1 of the present invention;
Fig. 3 is a flowchart of a preferred method for processing database logs according to Embodiment 1 of the present invention;
Fig. 4 is a structural diagram of the device for processing database logs according to Embodiment 2 of the present invention;
Fig. 5 is a structural diagram of a preferred device for processing database logs according to Embodiment 2 of the present invention; and
Fig. 6 is a structural diagram of a preferred device for processing database logs according to Embodiment 2 of the present invention.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another. The present invention is described in detail below with reference to the drawings and in combination with the embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings of those embodiments. The described embodiments are obviously only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the invention described here can be implemented. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product or device.
Embodiment 1
An embodiment of the present invention provides a method for processing database logs.
Fig. 1 is a flowchart of the method for processing database logs according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S11: obtain a log file.
Specifically, step S11 obtains the log file generated by the server.
Step S13: read the splitting rule for the log file.
Specifically, step S13 reads the preset splitting rule that is used to split the log file.
Step S15: split the log file according to the splitting rule to obtain at least two preprocessed log files.
Specifically, step S15 splits the entire obtained log file according to the splitting rule that was read, dividing it into two or more preprocessed log files for later use.
Step S17: write the at least two preprocessed log files into the preprocessing database in sequence.
Specifically, step S17 reads each of the preprocessed log files produced by the split and writes them, in sequence, into the preprocessing database that stores preprocessed log files.
Through steps S11 to S17, a log file that was originally one very large file is split into several smaller preprocessed log files, and the preprocessed log files produced by the split are stored in the preprocessing database.
In practice, reading a single very large log file requires considerable hardware resources, such as memory capacity and hard disk cache capacity, and is also limited by disk read/write speed, so reading is very inefficient. The log file is therefore split into several small preprocessed log files. Because these files are small, the computer reads them quickly. Each preprocessed log file is read in turn and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeatedly reading and writing the entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient, and achieves the effect of improving the speed and efficiency of log file processing.
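Steps S15 and S17 can be sketched as follows. This is a minimal illustration, not the patented implementation: the in-memory SQLite database stands in for the preprocessing database, and the table name `preprocessed`, the function names and the halving rule `split_in_half` are all assumptions made for the example.

```python
import sqlite3


def split_in_half(lines):
    """Hypothetical splitting rule: cut the list of log lines into two halves."""
    middle = len(lines) // 2
    return [lines[:middle], lines[middle:]]


def preprocess(lines, rule, conn):
    """Split the log lines with `rule` (S15) and write the chunks in order (S17)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preprocessed (chunk_id INTEGER, line TEXT)"
    )
    chunks = rule(lines)
    for chunk_id, chunk in enumerate(chunks):
        # Each chunk is written as its own small batch, one after another.
        conn.executemany(
            "INSERT INTO preprocessed VALUES (?, ?)",
            [(chunk_id, line) for line in chunk],
        )
        conn.commit()
    return len(chunks)


# Assumed stand-in for the preprocessing database.
conn = sqlite3.connect(":memory:")
log_lines = [f"request {i}" for i in range(5)]
written = preprocess(log_lines, split_in_half, conn)
```

Passing the rule as a callable keeps the write loop independent of how the file is split, so a different splitting rule can be substituted without touching the storage step.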
Preferably, in an optional embodiment provided by the present application, writing the at least two preprocessed log files into the preprocessing database in sequence in step S17 includes:
Step S171: read the file attributes of the log file, where the file attributes include at least the log record start time and the log record end time.
Step S173: modify the time configuration information of the preprocessing database according to the file attributes.
Step S175: read the preprocessed log files that correspond to the time configuration information.
Step S177: write the preprocessed log files that correspond to the time configuration information into the preprocessing database in sequence.
Specifically, steps S171 to S177 obtain the log record start time and log record end time of the log file and, according to them, modify the time configuration information in the preprocessing database. The time configuration information may include a preprocessing start time and a preprocessing end time. Based on the time configuration information, the preprocessing database determines the preprocessed log files that match it, and the files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has attributes of its own, which include at least a preprocessed log record start time and a preprocessed log record end time. The start time can be determined from the time of the first log record in the preprocessed log file, and the end time from the time of the last log record in that file.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose log record start time is greater than or equal to the preprocessing start time and whose log record end time is less than or equal to the preprocessing end time. Those skilled in the art can choose among these according to actual needs, and the details are not repeated here.
In some scenarios, to reduce the storage that the log file occupies on the database side, the original log file is deleted after it has been split into preprocessed log files. If the log record start time and end time of that log file will be needed later, they can be read out and recorded before the original log file is deleted, so that later steps can read them.
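The three matching criteria described above can be written as one small predicate. This is a sketch under assumptions: the function name `matches`, the `mode` parameter and its values are invented for illustration, and timestamps are assumed to be `datetime` objects.

```python
from datetime import datetime


def matches(chunk_start, chunk_end, pre_start, pre_end, mode="both"):
    """Check a preprocessed log file's time range against the time configuration.

    mode="start": the file's first record falls inside the window.
    mode="end":   the file's last record falls inside the window.
    mode="both":  the file starts at or after the window start and ends at
                  or before the window end.
    """
    if mode == "start":
        return pre_start <= chunk_start <= pre_end
    if mode == "end":
        return pre_start <= chunk_end <= pre_end
    if mode == "both":
        return pre_start <= chunk_start and chunk_end <= pre_end
    raise ValueError(f"unknown mode: {mode}")


# Assumed example window and file range.
pre_start = datetime(2014, 11, 28, 10, 0)
pre_end = datetime(2014, 11, 28, 11, 0)
chunk_start = datetime(2014, 11, 28, 10, 50)
chunk_end = datetime(2014, 11, 28, 11, 5)
```

A file that straddles the window end, as in this example, is selected under the first criterion but not under the other two, which is why the choice of criterion matters for files at the window boundary.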
Preferably, as shown in Fig. 2, in an optional embodiment provided by the present application, after the preprocessed log files are written into the preprocessing database in sequence in step S17, the method further includes:
Step S19: calculate the analysis processing period from the time configuration information.
Step S21: read the preprocessed log files in the preprocessing database according to the analysis processing period.
Step S23: analyze the preprocessed log files to obtain a standard log data table.
Step S25: write the standard log data table into the log database.
Specifically, steps S19 to S25 process the preprocessed log files in the preprocessing database further. The analysis processing period to be analyzed is calculated from the preprocessing start time and preprocessing end time in the time configuration information. Because the preprocessing database holds preprocessed log files from many different times, the files that fall within the time interval determined by the analysis processing period are read from the preprocessing database. These files are analyzed to obtain a standard log data table that can be stored directly in a database and used for analysis. The resulting standard log data table is stored in the log database so that the log content can be read and analyzed at any time.
In practice, processing the preprocessed log files according to the analysis processing period makes it possible to process the files in the preprocessing database according to the server's load. The analysis processing period can also be modified in real time, so that the preprocessed log files in the preprocessing database are processed in real time. Alternatively, a predetermined time interval can be used to spread the processing of the preprocessed log files evenly, so that it follows the generation of each log file. In this way, only the analysis processing period needs to be modified to control the analysis of log files according to the actual situation.
The analysis processing period is the time interval determined by a processing start time and a processing end time. The log record start time can be used directly as the processing start time of the analysis processing period, and the log record end time as its processing end time, so that each log file is analyzed promptly.
Preferably, in an optional embodiment provided by the present application, analyzing the preprocessed log files to obtain the standard log data table in step S23 includes:
Step S231: extract the log content of the preprocessed log files to obtain target data.
Step S233: convert the target data to a preset target data type to obtain a processing result.
Step S235: aggregate the processing results to generate the standard log data table.
Specifically, steps S231 to S235 read the preprocessed log files, extract their content and convert it into data that can be stored in a database, which yields the target data. The target data are converted to a unified data type. The converted target data are then aggregated into a standard log data table, which is stored in the log database so that it can be retrieved at any time.
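The extract, convert and aggregate steps might look like the sketch below. The source does not specify a log line format, so the space-separated layout (timestamp, path, status code, bytes sent), the field names and the `TARGET_TYPES` mapping are assumptions chosen only to make the example concrete.

```python
from datetime import datetime

# Assumed target data types, one converter per field.
TARGET_TYPES = {
    "time": datetime.fromisoformat,
    "path": str,
    "status": int,
    "bytes": int,
}


def extract(line):
    """S231: pull the raw fields out of one log line (assumed 4-field format)."""
    time, path, status, size = line.split()
    return {"time": time, "path": path, "status": status, "bytes": size}


def convert(record):
    """S233: convert every field to its preset target data type."""
    return {name: TARGET_TYPES[name](value) for name, value in record.items()}


def build_table(lines):
    """S235: aggregate the converted records into one table (a list of rows)."""
    return [convert(extract(line)) for line in lines if line.strip()]


table = build_table([
    "2014-11-28T10:15:00 /index.html 200 512",
    "",
    "2014-11-28T10:16:30 /about 404 0",
])
```

Here the "standard log data table" is represented as a list of uniformly typed rows; writing those rows into a log database would be a separate step.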
Preferably, as shown in Fig. 3, in an optional embodiment provided by the present application, before the splitting rule for the log file is read in step S13, the method further includes:
Step S121: obtain the size of the log file.
Step S123: read a preset capacity threshold, where the capacity threshold is used to judge the size of the log file.
Step S125: decide, from the size of the log file and the capacity threshold, whether to split the log file;
where, when the size of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
and when the size of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
Specifically, steps S121 to S125 judge the size of the log file. When the size of the log file is greater than the preset capacity threshold, the log file is split. When the size of the log file is less than or equal to the preset capacity threshold, the log file is not split.
In practice, website traffic is low at night and in other idle periods, so the log files generated then are relatively small, and processing a small log file does not place a heavy burden on the system. A judgment step is therefore added here to decide whether the log file should be split.
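The size check in steps S121 to S125 reduces to a single comparison. In this sketch the function names are hypothetical, and the 20 GB figure is borrowed from the large-file example given later in this description purely as an illustrative threshold.

```python
import os


def needs_split(size_bytes, threshold_bytes):
    """S125: split only when the file is strictly larger than the threshold."""
    return size_bytes > threshold_bytes


def file_needs_split(path, threshold_bytes):
    """S121 + S125: read the file size from disk, then apply the same check."""
    return needs_split(os.path.getsize(path), threshold_bytes)


# Assumed threshold for illustration only.
THRESHOLD = 20 * 1024**3
```

A file whose size is exactly equal to the threshold is not split, matching the "less than or equal to" branch of step S125.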
Preferably, in an optional embodiment provided by the present application, the splitting rules for the log file include at least a fixed-number splitting method and a fixed-capacity splitting method.
Preferably, in an optional embodiment provided by the present application, when the splitting rule is the fixed-number splitting method, splitting the log file according to the splitting rule to obtain at least two preprocessed log files in step S15 includes:
Step S151a: read the preset number of parts.
Step S153a: divide the log file evenly according to the number of parts to obtain that fixed number of preprocessed log files.
Specifically, steps S151a and S153a split the log file by the fixed-number splitting method. The preset number of parts is read, and the whole log file is divided evenly into that number of preprocessed log files.
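A fixed-number split can be sketched as below. The source does not say whether the division is by bytes or by records; this example assumes the log is already held as a list of lines and divides by line count so that no record is cut in two. The function name is hypothetical.

```python
def split_fixed_number(lines, count):
    """Divide `lines` into `count` chunks whose lengths differ by at most one."""
    if count < 1:
        raise ValueError("count must be at least 1")
    base, extra = divmod(len(lines), count)
    chunks, start = [], 0
    for i in range(count):
        # The first `extra` chunks take one additional line each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks


parts = split_fixed_number(list(range(10)), 3)
```

If there are fewer lines than parts, some of the returned chunks are empty; a caller could drop them before writing to the preprocessing database.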
Preferably, in an optional embodiment provided by the present application, when the splitting rule is the fixed-capacity splitting method, splitting the log file according to the splitting rule to obtain at least two preprocessed log files in step S15 includes:
Step S151b: read the preset split capacity.
Step S153b: split the log file according to the split capacity to obtain several preprocessed log files of equal capacity.
Specifically, steps S151b and S153b split the log file by the fixed-capacity splitting method. The preset size of a single preprocessed log file is read, and the log file is split in order from its beginning into preprocessed log files of equal capacity. In practice, when most log files are split, the capacity remaining at the end does not reach the preset split capacity; in that case, the remaining log content can be written out as one more preprocessed log file.
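A fixed-capacity split, including the smaller final remainder, can be sketched as a generator over a binary stream. The function name is hypothetical, and `io.BytesIO` is used here only as a stand-in for an open log file. Note that this simple version cuts at exact byte offsets, so a log record may straddle two chunks; a real tool would probably move each cut to the nearest line boundary.

```python
import io


def split_fixed_capacity(stream, capacity):
    """Yield consecutive chunks of `capacity` bytes; the last may be shorter."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    while True:
        chunk = stream.read(capacity)
        if not chunk:
            break
        yield chunk


pieces = list(split_fixed_capacity(io.BytesIO(b"abcdefghij"), 4))
```

Reading chunk by chunk means only one chunk needs to be held in memory at a time, which is the point of splitting a very large file in the first place.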
In an actual application, the process can be divided into the following steps:
Step 1: Split the log file. Use a log splitting tool to divide a large log file (for example, one whose size exceeds 20 GB) into several preprocessed log files M1, M2, ..., Mn.
Step 2: Modify the time for the first import. Stop the system's logreader task schedule and the schedulework task schedule, and modify the two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to the preprocessing start time and preprocessing end time in the logreader configuration file, read the corresponding small preprocessed log files one by one into the preprocessing database (receiver database).
Step 3: Modify the time for the second import.
Configure the time at which the preprocessed log files are analyzed (postprocess time), setting it to the nearest time on the hour (for example, 10:00, 11:00 or 12:00) that is later than the time at which all of the split preprocessed log files have been read in. Run the step that analyzes the preprocessed log files (postprocess). After the above steps are complete, store the data using data warehouse technology (ETL).
Step 4: Read log files in a loop.
Repeat steps 1 to 3 for the other log files until no large log file can be found.
By splitting the log file and then controlling the time points related to the analysis of the preprocessed log files, the present invention can speed up insertion into the log database and overcomes the memory and performance shortfalls of reading the whole log in at once.
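The postprocess time chosen in Step 3, the nearest time on the hour that is later than the moment all preprocessed log files were read in, can be computed as follows. The function name `next_full_hour` is hypothetical; how the resulting value would be written into the actual configuration is not described in the source and is not shown.

```python
from datetime import datetime, timedelta


def next_full_hour(finished_at):
    """Return the first time on the hour that is strictly later than `finished_at`."""
    floor = finished_at.replace(minute=0, second=0, microsecond=0)
    return floor + timedelta(hours=1)


# Assumed moment at which the last preprocessed log file was read in.
postprocess_time = next_full_hour(datetime(2014, 11, 28, 9, 37, 12))
```

Because the result is strictly later than the input, a read that finishes exactly on the hour is scheduled for the following hour rather than the same instant.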
Embodiment 2
An embodiment of the present invention also provides a device for processing database logs. As shown in Fig. 4, the device may include a first obtaining module 31, a first reading module 33, a splitting module 35 and a first storage module 37.
The first obtaining module 31 is used to obtain a log file.
Specifically, the first obtaining module 31 obtains the log file generated by the server.
The first reading module 33 is used to read the splitting rule for the log file.
Specifically, the first reading module 33 reads the preset splitting rule that is used to split the log file.
The splitting module 35 is used to split the log file according to the splitting rule to obtain at least two preprocessed log files.
Specifically, the splitting module 35 splits the entire obtained log file according to the splitting rule that was read, dividing it into two or more preprocessed log files for later use.
The first storage module 37 is used to write the at least two preprocessed log files into the preprocessing database in sequence.
Specifically, the first storage module 37 reads each of the preprocessed log files produced by the split and writes them, in sequence, into the preprocessing database that stores preprocessed log files.
Through the first obtaining module 31, the first reading module 33, the splitting module 35 and the first storage module 37, a log file that was originally one very large file is split into several smaller preprocessed log files, and the preprocessed log files produced by the split are stored in the preprocessing database.
In practice, reading a single very large log file requires considerable hardware resources, such as memory capacity and hard disk cache capacity, and is also limited by disk read/write speed, so reading is very inefficient. The log file is therefore split into several small preprocessed log files. Because these files are small, the computer reads them quickly. Each preprocessed log file is read in turn and stored in the preprocessing database, which speeds up reading of the log file.
In summary, the present invention solves the problem in the prior art that repeatedly reading and writing the entire log file makes the preprocessing of website access log files time-consuming and log processing inefficient, and achieves the effect of improving the speed and efficiency of log file processing.
Further, when the first storage module 37 writes the at least two preprocessed log files into the preprocessing database in sequence, the process also includes the following:
First, read the file attributes of the log file, where the file attributes include at least the log record start time and the log record end time.
Next, modify the time configuration information of the preprocessing database according to the file attributes.
Then, read the preprocessed log files that correspond to the time configuration information.
Finally, write the preprocessed log files that correspond to the time configuration information into the preprocessing database in sequence.
Specifically, the first storage module 37 obtains the log record start time and log record end time of the log file and, according to them, modifies the time configuration information in the preprocessing database. The time configuration information may include a preprocessing start time and a preprocessing end time. Based on the time configuration information, the preprocessing database determines the preprocessed log files that match it, and the files so determined are written into the preprocessing database in sequence.
Each preprocessed log file has attributes of its own, which include at least a preprocessed log record start time and a preprocessed log record end time. The start time can be determined from the time of the first log record in the preprocessed log file, and the end time from the time of the last log record in that file.
When the preprocessing database determines the preprocessed log files that match the time configuration information, a matching file may be one whose log record start time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose log record end time is greater than or equal to the preprocessing start time and less than or equal to the preprocessing end time; or one whose log record start time is greater than or equal to the preprocessing start time and whose log record end time is less than or equal to the preprocessing end time. Those skilled in the art can choose among these according to actual needs, and the details are not repeated here.
In some scenarios, to reduce the storage that the log file occupies on the database side, the original log file is deleted after it has been split into preprocessed log files. If the log record start time and end time of that log file will be needed later, they can be read out and recorded before the original log file is deleted, so that later steps can read them.
Preferably, as shown in Fig. 5, in an optional embodiment provided by the present application, the device further includes: a computing module 39, a second reading module 41, a processing module 43, and a second storage module 45.
The computing module 39 is configured to calculate the analysis processing period according to the time configuration information;
The second reading module 41 is configured to read the preprocessed log files from the preprocessing database according to the analysis processing period;
The processing module 43 is configured to analyze the preprocessed log files to obtain a standard log data table;
The second storage module 45 is configured to write the standard log data table into the log database.
Specifically, the computing module 39, the second reading module 41, the processing module 43, and the second storage module 45 further process the preprocessed log files in the preprocessing database. The analysis processing period to be analyzed is calculated from the preprocessing start time and the preprocessing end time in the time configuration information. Because the preprocessing database holds preprocessed log files for every time range, the time interval is determined from the analysis processing period, and the preprocessed log files in the preprocessing database that fall within that interval are read. These files are analyzed to obtain a standard log data table that can be stored directly in a database and used for analysis. The table is then stored in the log database so that the log content can be read and analyzed at any time.
In practice, because preprocessed log files are handled according to the analysis processing period, processing can be adapted to the load of the server. The period can also be modified in real time so that the preprocessed log files in the preprocessing database are processed in real time. Alternatively, a predetermined time interval can be used to spread the processing evenly, so that it runs after each log file is generated. In this way, only the analysis processing period needs to be modified to control the analysis of log files according to actual conditions.
The analysis processing period is the time interval determined by a processing start time and a processing end time. The log record start time can be used directly as the processing start time, and the log record end time as the processing end time, so that each log file is analyzed promptly.
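A minimal sketch of deriving the analysis processing period and selecting the preprocessed log files that fall inside it is given below. The dictionary keys `lastsuccesstime` and `untiltime` reuse the configuration names mentioned in the practical example of this description; the `PreprocessedLog` record, the function names, and the rule that a file must lie entirely inside the period are assumptions made for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class PreprocessedLog(NamedTuple):
    """Hypothetical record describing one preprocessed log file."""
    name: str
    start: datetime
    end: datetime


def analysis_period(time_config):
    """Derive (start, end) of the analysis period from the time configuration."""
    start = time_config["lastsuccesstime"]
    end = time_config["untiltime"]
    if end <= start:
        raise ValueError("end time must be later than start time")
    return start, end


def select_logs(logs, period):
    """Keep the preprocessed logs whose records lie entirely inside the period."""
    start, end = period
    return [log for log in logs if log.start >= start and log.end <= end]
```

Changing only the two configured times changes which files are picked up, which matches the point that the period alone controls the processing.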
Further, the process by which the processing module 43 analyzes the preprocessed log files to obtain the standard log data table includes:
First, extract the log content of the preprocessed log file to obtain target data.
Then, convert the target data to the preset target data type to obtain a processing result.
Finally, aggregate the processing results to generate the standard log data table.
Specifically, the processing module 43 reads the preprocessed log file and extracts its content into data that can be stored in a database, which yields the target data. The target data is then converted to a unified data type. The converted target data is aggregated to generate the standard log data table, which is stored in the log database and can be retrieved at any time.
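The extract, convert, and aggregate steps can be illustrated with a short sketch. The line layout (`date time url status bytes`), the field names, and the `TARGET_TYPES` mapping are hypothetical choices; the source does not specify a log format or a set of target data types.

```python
from datetime import datetime


def _parse_time(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Assumed preset target data type for each field.
TARGET_TYPES = {"time": _parse_time, "url": str, "status": int, "bytes": int}


def extract(line):
    """Extract raw string fields (the target data) from one log line."""
    date, clock, url, status, size = line.split()
    return {"time": f"{date} {clock}", "url": url, "status": status, "bytes": size}


def convert(record):
    """Convert every field to its preset target data type."""
    return {field: TARGET_TYPES[field](value) for field, value in record.items()}


def build_table(lines):
    """Aggregate the converted records into a list of rows (the standard table)."""
    return [convert(extract(line)) for line in lines if line.strip()]
```

Each row in the returned list has uniform types, so it could be written to a database table without further per-row handling.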
Preferably, as shown in Fig. 6, in an optional embodiment provided by the present application, the above device further includes: a second obtaining module 321, a third reading module 323, and a judgment module 325.
The second obtaining module 321 is configured to obtain the capacity of the log file.
The third reading module 323 is configured to read a preset capacity threshold, wherein the capacity threshold is used to judge the size of the log file.
The judgment module 325 is configured to judge, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
Specifically, the second obtaining module 321, the third reading module 323, and the judgment module 325 judge the capacity of the log file. When the capacity of the log file is greater than the preset capacity threshold, the log file is split. When the capacity of the log file is less than or equal to the preset capacity threshold, the log file is not split.
In practice, at night or during other idle periods, website traffic is low, so the generated log files are relatively small, and processing a small log file does not place a heavy load on the system. A judgment step is therefore added here to decide whether the log file needs to be split.
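The judgment step reduces to a single comparison, sketched below. The function names are hypothetical; the strict "greater than" comparison follows the rule stated for the judgment module, and the 20 GB figure in the usage note is taken from the practical example in this description.

```python
import os

GB = 1024 ** 3


def should_split(size_bytes, threshold_bytes):
    """Split only when the log file is strictly larger than the threshold."""
    return size_bytes > threshold_bytes


def should_split_file(path, threshold_bytes):
    """Same decision, reading the capacity of a file on disk."""
    return should_split(os.path.getsize(path), threshold_bytes)
```

For example, with a threshold of `20 * GB`, a file of exactly 20 GB is processed whole, while anything larger is split first.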
Further, the splitting rules for the log file include at least: a fixed-number splitting method and a fixed-capacity splitting method.
Further, when the splitting rule is the fixed-number splitting method, the step in which the splitting module 35 splits the log file according to the splitting rule to obtain at least two preprocessed log files includes:
Step 1: Read the preset number of splits.
Step 2: Divide the log file evenly according to the number of splits to obtain a fixed number of preprocessed log files.
Specifically, through Step 1 and Step 2 above, the log file is split according to the fixed-number splitting method: the preset number of splits is read, and the log file is divided evenly into that number of preprocessed log files.
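A minimal sketch of the fixed-number splitting method follows. It assumes the log is held as a list of lines and that "evenly" means chunk sizes differ by at most one line; the function name and both assumptions are illustrative and not taken from the source.

```python
def split_by_count(lines, parts):
    """Split lines into `parts` consecutive chunks whose sizes differ by at most one."""
    if parts < 2:
        raise ValueError("at least two preprocessed files are expected")
    base, extra = divmod(len(lines), parts)
    chunks = []
    pos = 0
    for i in range(parts):
        # The first `extra` chunks take one additional line each.
        size = base + (1 if i < extra else 0)
        chunks.append(lines[pos:pos + size])
        pos += size
    return chunks
```

Splitting on line boundaries keeps every log record intact, at the cost of chunk sizes in bytes being only approximately equal.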
Further, in an optional embodiment provided by the present application, when the splitting rule is the fixed-capacity splitting method, the step in which the splitting module 35 splits the log file according to the splitting rule to obtain at least two preprocessed log files includes:
Step 1: Read the preset split capacity.
Step 2: Split the log file according to the split capacity to obtain several preprocessed log files of equal capacity.
Specifically, through Step 1 and Step 2 above, the log file is split according to the fixed-capacity splitting method. The preset size of a single preprocessed log file is read, and the log file is split sequentially into preprocessed log files of equal capacity. In practice, when most log files are split to the end, the remaining content is smaller than the preset split capacity; in that case, the remainder can be written out as one more preprocessed log file.
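The fixed-capacity splitting method, including the handling of the leftover tail, can be sketched as below. The sketch assumes splitting on line boundaries so that no record is cut in half, which means each chunk is at most the configured capacity rather than exactly equal to it; the function name and this boundary rule are assumptions.

```python
def split_by_capacity(lines, capacity_bytes):
    """Group consecutive lines into chunks of at most `capacity_bytes` each.

    A single line larger than the capacity still forms its own chunk.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    chunks, current, used = [], [], 0
    for line in lines:
        size = len(line.encode("utf-8"))
        # Close the current chunk when adding this line would exceed the capacity.
        if current and used + size > capacity_bytes:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += size
    if current:
        # The remainder smaller than the capacity becomes one last chunk.
        chunks.append(current)
    return chunks
```

With five 5-byte lines and a capacity of 10 bytes, the result is two full chunks followed by one smaller remainder chunk.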
In a practical application, the process can be divided into the following steps:
Step 1: Split the log file. A log splitting tool splits a large log file (for example, a log file larger than 20 GB) into several preprocessed log files M1, M2 ... Mn.
Step 2: Modify the time configuration for the first import. Stop the system's logreader task schedule and the schedulework task schedule, then modify two time settings in the logreader configuration file: the preprocessing start time (lastsuccesstime) and the preprocessing end time (untiltime). According to these two settings, the corresponding small preprocessed log files are read one by one into the preprocessing database (receiver database).
Step 3: Modify the time configuration for the second import.
Configure the time for analyzing the preprocessed log files (postprocess time), setting it to the whole hour (for example, 10:00, 11:00, or 12:00) that is later than and closest to the time at which all the split preprocessed log files have been read. Then run the step that analyzes the preprocessed log files (postprocess). After the above steps are complete, the data is stored using data warehouse technology (ETL).
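Choosing the postprocess time as the nearest whole hour after reading finishes can be sketched as follows. The function name is hypothetical, and the sketch reads "later than and closest to" as strictly later, so a finish time that falls exactly on the hour moves to the next hour.

```python
from datetime import datetime, timedelta


def next_whole_hour(finished_at):
    """Return the first whole hour strictly after `finished_at`."""
    floor = finished_at.replace(minute=0, second=0, microsecond=0)
    return floor + timedelta(hours=1)
```

For instance, if the last preprocessed file has been read at 09:37:12, the postprocess time would be set to 10:00.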
Step 4: Read log files in a loop.
Repeat Step 1 to Step 3 for the other log files until no large log file remains.
By splitting the log file and controlling the time points at which the preprocessed log files are analyzed, the present invention speeds up insertion into the log database and overcomes the memory and performance limitations of reading a log in all at once.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For a part not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A method for processing a database log, characterized by comprising:
Obtaining a log file;
Reading a splitting rule of the log file;
Splitting the log file according to the splitting rule to obtain at least two preprocessed log files;
Writing the at least two preprocessed log files into a preprocessing database in sequence;
Wherein the step of writing the at least two preprocessed log files into the preprocessing database in sequence includes:
Reading the file attributes of the log file, wherein the file attributes include at least: a log record start time and a log record end time;
Modifying the time configuration information of the preprocessing database according to the file attributes;
Reading, according to the time configuration information, the preprocessed log file corresponding to the time configuration information;
Writing the preprocessed log file corresponding to the time configuration information into the preprocessing database in sequence.
2. The method according to claim 1, characterized in that, after the at least two preprocessed log files are written into the preprocessing database in sequence, the method further includes:
Calculating an analysis processing period according to the time configuration information;
Reading the preprocessed log files in the preprocessing database according to the analysis processing period;
Analyzing the preprocessed log files to obtain a standard log data table;
Writing the standard log data table into a log database.
3. The method according to claim 2, characterized in that the step of analyzing the preprocessed log files to obtain the standard log data table includes:
Extracting the log content of the preprocessed log file to obtain target data;
Converting the target data to a preset target data type to obtain a processing result;
Aggregating the processing results to generate the standard log data table.
4. The method according to claim 2, characterized in that, before the reading of the splitting rule of the log file, the method includes:
Obtaining the capacity of the log file;
Reading a preset capacity threshold;
Judging, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
5. The method according to any one of claims 1 to 4, characterized in that the splitting rule includes at least: a fixed-number splitting method and a fixed-capacity splitting method.
6. The method according to claim 5, characterized in that, when the splitting rule is the fixed-number splitting method, the step of splitting the log file according to the splitting rule to obtain at least two preprocessed log files includes:
Reading a preset number of splits;
Dividing the log file evenly according to the number of splits to obtain a fixed number of preprocessed log files.
7. The method according to claim 5, characterized in that, when the splitting rule is the fixed-capacity splitting method, the step of splitting the log file according to the splitting rule to obtain at least two preprocessed log files includes:
Reading a preset split capacity;
Splitting the log file according to the split capacity to obtain several preprocessed log files of equal capacity.
8. A device for processing a database log, characterized by comprising:
A first obtaining module, configured to obtain a log file;
A first reading module, configured to read a splitting rule of the log file;
A splitting module, configured to split the log file according to the splitting rule to obtain at least two preprocessed log files;
A first storage module, configured to write the at least two preprocessed log files into a preprocessing database in sequence;
Wherein the device further includes:
A computing module, configured to calculate an analysis processing period according to time configuration information;
A second reading module, configured to read the preprocessed log files in the preprocessing database according to the analysis processing period;
A processing module, configured to analyze the preprocessed log files to obtain a standard log data table;
A second storage module, configured to write the standard log data table into a log database.
9. The device according to claim 8, characterized in that the device further includes:
A second obtaining module, configured to obtain the capacity of the log file;
A third reading module, configured to read a preset capacity threshold, wherein the capacity threshold is used to judge the size of the log file;
A judgment module, configured to judge, according to the capacity of the log file and the capacity threshold, whether to split the log file;
Wherein, when the capacity of the log file is greater than the capacity threshold, it is determined that the log file is to be split;
When the capacity of the log file is less than or equal to the capacity threshold, it is determined that the log file is not to be split.
CN201410709417.5A 2014-11-27 2014-11-27 The processing method and processing device of database journal Active CN104391954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410709417.5A CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410709417.5A CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Publications (2)

Publication Number Publication Date
CN104391954A CN104391954A (en) 2015-03-04
CN104391954B true CN104391954B (en) 2019-04-09

Family

ID=52609858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410709417.5A Active CN104391954B (en) 2014-11-27 2014-11-27 The processing method and processing device of database journal

Country Status (1)

Country Link
CN (1) CN104391954B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9917758B2 (en) 2015-03-25 2018-03-13 International Business Machines Corporation Optimizing log analysis in SaaS environments
CN106844630A (en) * 2017-01-20 2017-06-13 山东中创软件商用中间件股份有限公司 A kind of application server sql log recording methods and its device
CN106815363A (en) * 2017-01-24 2017-06-09 郑州云海信息技术有限公司 One kind rotates management method and device based on linux daily records
CN106897431B (en) * 2017-02-27 2021-06-11 郑州云海信息技术有限公司 Log export method and system
CN107818041A (en) * 2017-10-24 2018-03-20 南京航空航天大学 SECONDO system files read and write inspection software
CN108304305A (en) * 2018-01-11 2018-07-20 北京潘达互娱科技有限公司 The method and apparatus that journal file is read
CN109299052B (en) * 2018-09-03 2024-03-15 珠海泰合科技有限公司 Log cutting method, device, computer equipment and storage medium
CN108989471A (en) * 2018-09-05 2018-12-11 郑州云海信息技术有限公司 The management method and device of log in network system
CN110209643A (en) * 2019-04-23 2019-09-06 深圳壹账通智能科技有限公司 A kind of data processing method and device
CN111045885A (en) * 2019-11-11 2020-04-21 网联清算有限公司 Database log file processing method and device and computer equipment
CN111061690B (en) * 2019-11-22 2023-08-22 武汉达梦数据库股份有限公司 RAC-based database log file reading method and device
CN113656358A (en) * 2020-05-12 2021-11-16 网联清算有限公司 Database log file processing method and system
WO2021237704A1 (en) * 2020-05-29 2021-12-02 深圳市欢太科技有限公司 Data synchronization method and related device
CN112434949A (en) * 2020-11-25 2021-03-02 平安普惠企业管理有限公司 Service early warning processing method, device, equipment and medium based on artificial intelligence
CN113687974B (en) * 2021-10-22 2022-03-01 飞狐信息技术(天津)有限公司 Client log processing method and device and computer equipment
CN115225471A (en) * 2022-07-15 2022-10-21 中国工商银行股份有限公司 Log analysis method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102724063A (en) * 2012-05-11 2012-10-10 北京邮电大学 Log collection server, data packet delivering and log clustering methods and network
CN103178982A (en) * 2011-12-23 2013-06-26 阿里巴巴集团控股有限公司 Method and device for analyzing log
CN103324696A (en) * 2013-06-06 2013-09-25 合一信息技术(北京)有限公司 Collecting and statistical analysis system and method for data logs
CN103595571A (en) * 2013-11-20 2014-02-19 北京国双科技有限公司 Preprocessing method, device and system for website access logs
CN103593422A (en) * 2013-11-01 2014-02-19 国云科技股份有限公司 Virtual access management method of heterogeneous database
CN103914485A (en) * 2013-01-07 2014-07-09 上海宝信软件股份有限公司 System and method for remotely collecting, retrieving and displaying application system logs
CN104035729A (en) * 2014-05-22 2014-09-10 中国科学院计算技术研究所 Block device thin-provisioning method for log mapping
CN104050268A (en) * 2014-06-23 2014-09-17 西北工业大学 Continuous data protection and recovery method with log space adjustable online

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751478A (en) * 2010-02-20 2010-06-23 浪潮(北京)电子信息产业有限公司 File backup method and system
US10198463B2 (en) * 2010-04-16 2019-02-05 Salesforce.Com, Inc. Methods and systems for appending data to large data volumes in a multi-tenant store

Also Published As

Publication number Publication date
CN104391954A (en) 2015-03-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Database log processing method and device

Effective date of registration: 20190531

Granted publication date: 20190409

Pledgee: Shenzhen Black Horse World Investment Consulting Co., Ltd.

Pledgor: Beijing Guoshuang Technology Co.,Ltd.

Registration number: 2019990000503

CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Patentee before: Beijing Guoshuang Technology Co.,Ltd.