CN106776617A - Method and device for saving log files - Google Patents

Method and device for saving log files

Info

Publication number
CN106776617A
Authority
CN
China
Prior art keywords
file
server
database
distributed data
log file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510812631.8A
Other languages
Chinese (zh)
Other versions
CN106776617B (en)
Inventor
汤卫群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201510812631.8A priority Critical patent/CN106776617B/en
Publication of CN106776617A publication Critical patent/CN106776617A/en
Application granted granted Critical
Publication of CN106776617B publication Critical patent/CN106776617B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a method and device for saving log files. The method includes: obtaining a log file recorded by a server; creating a target file in a distributed database, where the target file is used to store the log file; obtaining a data transmission resource between the server and the distributed database; and using the data transmission resource to save the log file into the target file of the distributed database. The application addresses the technical problem that a server's log files cannot be saved quickly and efficiently.

Description

Method and device for saving log files
Technical field
This application relates to the field of data processing, and in particular to a method and device for saving log files.
Background
Tomcat is a well-known application server for Java web applications: a Java web application can run in Tomcat and provide services externally. The logs of each user's access to Tomcat are valuable, so these logs need to be recorded and saved. Tomcat already provides two ways to save access logs. One is the file-based AccessLogValve; the other is the JDBC AccessLogValve provided in Tomcat 7.0.
If logs are saved with the file-based AccessLogValve, the raw log files must be converted, and because these log files are all written to local disk, they are limited by disk capacity.
To avoid the disk capacity limit, logs can be saved with the JDBC-based JDBC AccessLogValve, but this approach has the following drawbacks: (1) JDBC AccessLogValve does not take its JDBC connection from a connection pool, so resources are consumed each time a connection is established; (2) JDBC AccessLogValve does not use batch inserts; (3) JDBC AccessLogValve uses synchronous inserts; (4) when the data volume is large, sharding the database into multiple databases and tables is cumbersome. As a result, the server's log files cannot be saved quickly and efficiently.
No effective solution to the above problems has yet been proposed.
Summary of the invention
The embodiments of this application provide a method and device for saving log files, to at least solve the technical problem that a server's log files cannot be saved quickly and efficiently.
According to one aspect of the embodiments of this application, a method for saving log files is provided, including: obtaining a log file recorded by a server; creating a target file in a distributed database, where the target file is used to store the log file; obtaining a data transmission resource between the server and the distributed database; and using the data transmission resource to save the log file into the target file of the distributed database.
Further, before using the data transmission resource to save the log file into the target file of the distributed database, the method also includes: storing the log file in a buffer of the server; and judging whether the amount of data buffered in the buffer has reached a preset value. Using the data transmission resource to save the log file into the target file of the distributed database then includes: when the buffered amount is judged to have reached the preset value, saving the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
Further, before obtaining the data transmission resource between the server and the distributed database, the method also includes: establishing a connection pool between the server and the distributed database, where the connection pool contains multiple connections and the data transmission resource is a connection. Saving the log file stored in the buffer into the target file through the data transmission resource, when the buffered amount is judged to have reached the preset value, includes: obtaining multiple target connections from the connection pool, where a target connection is an idle connection in the connection pool; and using the multiple target connections to save the log file into the target file of the distributed database.
Further, there are multiple servers. Obtaining the log file recorded by the server includes: obtaining the log file recorded by server Sj, where j takes the values 1 to n in turn and n is the number of servers. Creating the target file in the distributed database includes: creating sub-target files D1 to Dn in the distributed database, where sub-target files D1 to Dn make up the target file. Establishing the connection pool between the server and the distributed database includes: establishing a connection pool Pj between server Sj and the distributed database. Using the multiple target connections to save the log file into the target file of the distributed database includes: using the target connections obtained from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
Further, using the data transmission resource to save the log file into the target file of the distributed database also includes: using the data transmission resource to save multiple records of the log file, in batches and in asynchronous transfer mode, into the target file of the distributed database.
According to another aspect of the embodiments of this application, a device for saving log files is also provided, including: a first obtaining unit, configured to obtain a log file recorded by a server; a creating unit, configured to create a target file in a distributed database, where the target file is used to store the log file; a second obtaining unit, configured to obtain a data transmission resource between the server and the distributed database; and a saving unit, configured to use the data transmission resource to save the log file into the target file of the distributed database.
Further, the device also includes: a storage unit, configured to store the log file in a buffer of the server before the saving unit uses the data transmission resource to save the log file into the target file of the distributed database; and a judging unit, configured to judge whether the buffered amount of the buffer has reached a preset value. The saving unit includes: a first saving subunit, configured to, when the judging unit judges that the buffered amount has reached the preset value, save the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
Further, the device also includes: an establishing unit, configured to establish a connection pool between the server and the distributed database before the second obtaining unit obtains the data transmission resource between the server and the distributed database, where the connection pool contains multiple connections and the data transmission resource is a connection. The first saving subunit includes: an obtaining module, configured to obtain multiple target connections from the connection pool when the judging unit judges that the buffered amount has reached the preset value, where a target connection is an idle connection in the connection pool; and a saving module, configured to use the multiple target connections obtained by the obtaining module to save the log file into the target file of the distributed database.
Further, there are multiple servers. The first obtaining unit includes: a first obtaining subunit, configured to obtain the log file recorded by server Sj, where j takes the values 1 to n in turn and n is the number of servers. The creating unit includes: a creating subunit, configured to create sub-target files D1 to Dn in the distributed database, where sub-target files D1 to Dn make up the target file. The establishing unit includes: an establishing subunit, configured to establish a connection pool Pj between server Sj and the distributed database. The saving module includes: a saving submodule, configured to use the target connections obtained by the obtaining module from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
Further, the saving unit also includes: a second saving subunit, configured to use the data transmission resource to save multiple records of the log file, in batches and in asynchronous transfer mode, into the target file of the distributed database.
In the embodiments of this application, a log file recorded by a server is obtained; a target file for storing the log file is created in a distributed database; a data transmission resource between the server and the distributed database is obtained; and the data transmission resource is used to save the log file into the target file of the distributed database. Because this method avoids sharding the database into multiple databases and tables, the log file recorded by the server can be saved into the distributed database quickly and efficiently, which solves the prior-art technical problem that a server's log files cannot be saved quickly and efficiently.
Brief description of the drawings
The drawings described here provide a further understanding of this application and form part of it. The illustrative embodiments of this application and their descriptions are used to explain the application and do not improperly limit it. In the drawings:
Fig. 1 is a flowchart of the method for saving log files according to an embodiment of this application; and
Fig. 2 is a schematic diagram of the device for saving log files according to an embodiment of this application.
Detailed description
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application, without creative effort, shall fall within the scope of protection of this application.
Note that the terms "first", "second", and so on in the description, claims, and drawings of this application are used to distinguish similar objects, not to describe a specific order or sequence. Data used in this way can be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "comprise" and "have", and any variants of them, are intended to cover non-exclusive inclusion: for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
The technical terms used in the embodiments of this application are first explained as follows:
Tomcat: Tomcat is a free, open-source web application server. It is a lightweight server, generally used in small and medium-sized systems.
HBase: HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system.
According to the embodiments of this application, an embodiment of a method for saving log files is provided. Note that the steps shown in the flowchart of the drawings can be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described can be executed in a different order.
Fig. 1 is a flowchart of the method for saving log files according to an embodiment of this application. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain the log file recorded by the server. The log file recorded by the server can be a user access log or a server running log.
Step S104: create a target file in the distributed database, where the target file is used to store the log file. The distributed database can be HBase, Cassandra, Hypertable, or similar. When the volume of file data to be stored is very large, sharding a database into multiple databases and tables is cumbersome; using the HBase distributed database avoids that sharding and so improves the efficiency of saving data. The target file can be a table.
Step S106: obtain the data transmission resource between the server and the distributed database.
Step S108: use the data transmission resource to save the log file into the target file of the distributed database.
HBase is a distributed, column-oriented database. Its biggest differences from a general relational database are that HBase is well suited to storing unstructured data and that it is column-based rather than row-based.
A Rowkey (row key) is a binary byte stream with a maximum length of 64 KB, whose content can be defined by the user. When data is loaded, it is generally also ordered by the binary value of the Rowkey, from smallest to largest.
HBase retrieves data by Rowkey: the system finds the Region in which a given Rowkey (or Rowkey range) is located, then routes the query request to that Region to obtain the data. HBase retrieval supports three modes:
(1) access by a single Rowkey, that is, a get operation on a given Rowkey value, which returns a single record;
(2) scan by Rowkey range, that is, setting startRowKey (the start row key) and endRowKey (the end row key) and scanning within that range, which returns a batch of records matching the specified condition;
(3) full table scan, that is, directly scanning all row records in the entire table.
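The three retrieval modes can be illustrated with a small in-memory sketch. It is not the HBase client API: the class RowKeyLookupSketch, its method names, and the sample keys are hypothetical, a TreeMap of strings stands in for a table ordered by Rowkey, and treating the end key as exclusive is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** In-memory stand-in for a table sorted by row key; this is not the HBase client API. */
class RowKeyLookupSketch {
    // A TreeMap keeps keys in ascending order, loosely mirroring row-key ordering.
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String record) {
        rows.put(rowKey, record);
    }

    /** Mode 1: access by a single row key; returns one record, or null if absent. */
    String get(String rowKey) {
        return rows.get(rowKey);
    }

    /** Mode 2: range scan from startRowKey (inclusive) to endRowKey (exclusive, by assumption). */
    List<String> scan(String startRowKey, String endRowKey) {
        return new ArrayList<>(rows.subMap(startRowKey, true, endRowKey, false).values());
    }

    /** Mode 3: full table scan over every row. */
    List<String> scanAll() {
        return new ArrayList<>(rows.values());
    }

    public static void main(String[] args) {
        RowKeyLookupSketch table = new RowKeyLookupSketch();
        table.put("S1#0001", "GET /index");
        table.put("S1#0002", "GET /login");
        table.put("S2#0001", "POST /order");
        System.out.println(table.get("S1#0002"));      // single-key access
        System.out.println(table.scan("S1#", "S2#"));  // range scan over one key prefix
        System.out.println(table.scanAll().size());    // full scan
    }
}
```

The range scan only touches the contiguous slice of keys between the two bounds, which is why a key range is usually much cheaper than a full scan.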
HBase retrieval by a single Rowkey is very efficient: it takes less than 1 millisecond, and 1000 to 2000 records can be obtained per second.
With a carefully designed Rowkey, the data of the obtained file is stored together (it should be in the same Region), which gives good performance when traversing results.
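One possible way to design such a Rowkey is sketched below. The patent does not specify a key layout, so the layout here (server identifier, then zero-padded timestamp, then sequence number) and the names RowKeyDesignSketch and rowKey are assumptions made purely for illustration.

```java
import java.util.Locale;

/** Hypothetical row-key layout: serverId#timestamp#sequence, zero-padded so keys sort by time. */
class RowKeyDesignSketch {
    static String rowKey(String serverId, long epochMillis, int sequence) {
        // Fixed-width numbers make lexicographic order agree with numeric order.
        return String.format(Locale.ROOT, "%s#%013d#%04d", serverId, epochMillis, sequence);
    }

    public static void main(String[] args) {
        // Keys from the same server share a prefix, so they sort next to each other.
        System.out.println(rowKey("S1", 1447900000000L, 1));
        System.out.println(rowKey("S1", 1447900000500L, 2));
        System.out.println(rowKey("S2", 1447900000000L, 1));
    }
}
```

Because all keys for one server share a prefix and sort by time within it, a range scan over that prefix reads adjacent rows instead of rows scattered across the table.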
A log file recorded by the server is obtained, a target file for storing the log file is created in the distributed database, a data transmission resource between the server and the distributed database is obtained, and the data transmission resource is used to save the log file into the target file of the distributed database. Because this method avoids sharding the database into multiple databases and tables, the log file recorded by the server can be saved into the distributed database quickly and efficiently, which solves the prior-art technical problem that a server's log files cannot be saved quickly and efficiently.
Optionally, before the data transmission resource is used to save the log file into the target file of the distributed database, the method for saving log files provided by the embodiments of this application also includes: storing the log file in a buffer of the server; and judging whether the buffered amount of the buffer has reached a preset value. Using the data transmission resource to save the log file into the target file of the distributed database then includes: when the buffered amount is judged to have reached the preset value, saving the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
That is, the log file is first stored in the server's buffer, and only after the buffered amount reaches the preset value is the data transmission resource used to save the log file into the target file of the distributed database. The preset value can be set in advance: when the maximum capacity of the buffer is M units, the preset value can be set to 0.5M, 0.6M, 0.7M, 0.8M, and so on. The preset value also acts as a threshold: once the buffered amount reaches it, the log file in the buffer is stored into the distributed database.
For example, with the preset value set to 60% of the maximum capacity, the log file is stored in the buffer, and once the buffered amount reaches 60% of the maximum capacity, the data transmission resource is used to save the log file into the target file of the distributed database.
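The threshold behaviour in the 60% example can be sketched as follows. This is a minimal illustration, not the patented implementation: the class ThresholdBufferSketch, its parameters, and the sink callback that stands in for the write to the distributed database are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers log records and hands them to a sink once a preset share of capacity is reached. */
class ThresholdBufferSketch {
    private final int threshold;                 // preset value, in records
    private final List<String> buffer = new ArrayList<>();
    private final Consumer<List<String>> sink;   // stand-in for the database write

    ThresholdBufferSketch(int maxCapacity, int thresholdPercent, Consumer<List<String>> sink) {
        // Integer arithmetic avoids rounding surprises: 10 * 60 / 100 == 6.
        this.threshold = Math.max(1, maxCapacity * thresholdPercent / 100);
        this.sink = sink;
    }

    synchronized void append(String logRecord) {
        buffer.add(logRecord);
        if (buffer.size() >= threshold) {
            // Hand over a copy, then empty the buffer for new records.
            sink.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    synchronized int buffered() {
        return buffer.size();
    }

    public static void main(String[] args) {
        ThresholdBufferSketch b = new ThresholdBufferSketch(10, 60,
                batch -> System.out.println("flushing " + batch.size() + " records"));
        for (int i = 1; i <= 7; i++) {
            b.append("record-" + i);
        }
        System.out.println("still buffered: " + b.buffered());
    }
}
```

Flushing below full capacity leaves headroom for records that arrive while a flush is in progress.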
Because the CPU is fast and I/O (input/output) devices are slow, a bottleneck easily arises from insufficient channels. Using a buffer mitigates the speed mismatch between the CPU and I/O devices and reduces the number of times I/O devices interrupt the CPU. This improves CPU efficiency and therefore server performance, so users can access the server more efficiently and the user experience improves.
Setting a threshold on the buffered amount means that, when the buffered amount reaches the threshold, the log file stored in the buffer can be transferred to the distributed database in time. This avoids losing log data that could not be saved once the buffer reached its maximum.
Optionally, before the data transmission resource between the server and the distributed database is obtained, the method for saving log files provided by the embodiments of this application also includes: establishing a connection pool between the server and the distributed database, where the connection pool contains multiple connections and the data transmission resource is a connection. Saving the log file stored in the buffer into the target file through the data transmission resource, when the buffered amount is judged to have reached the preset value, includes: obtaining multiple target connections from the connection pool, where a target connection is an idle connection in the connection pool; and using the multiple target connections to save the log file into the target file of the distributed database.
That is, a connection pool containing multiple connections is established between the server and the distributed database, and the data transmission resource mentioned above is a connection. When the buffered amount reaches the preset value, idle connections are obtained from the connection pool; the idle connections obtained are the target connections, and the multiple target connections are used to save the log file into the target file of the distributed database.
A connection is a critical, limited, and expensive resource. How connections are managed significantly affects the scalability and robustness of the whole application and its performance indicators. The connection pool is responsible for allocating, managing, and releasing connections. It lets the application reuse an existing database connection instead of establishing a new one, which avoids wasting resources and improves application performance.
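The allocate, reuse, and release cycle described above can be sketched with a fixed-size pool. This is only an illustration under stated assumptions: ConnectionPoolSketch and its nested Connection type are hypothetical placeholders, and no real database connection is opened.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Fixed-size pool that hands out idle connections and takes them back for reuse. */
class ConnectionPoolSketch {
    /** Placeholder for a real database connection (hypothetical type). */
    static final class Connection {
        final String id;
        Connection(String id) { this.id = id; }
    }

    // Connections currently idle; all connections are created once, up front.
    private final BlockingQueue<Connection> idle;

    ConnectionPoolSketch(String poolName, int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int k = 1; k <= size; k++) {
            idle.add(new Connection(poolName + "-" + k));
        }
    }

    /** Returns an idle (target) connection, or null when none is idle. */
    Connection tryAcquire() {
        return idle.poll();
    }

    /** Returns a connection to the pool so that it can be reused. */
    void release(Connection connection) {
        idle.offer(connection);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch("C1", 3);
        Connection c = pool.tryAcquire();
        System.out.println("using " + c.id + ", idle left: " + pool.idleCount());
        pool.release(c);
        System.out.println("idle after release: " + pool.idleCount());
    }
}
```

Returning null when nothing is idle keeps the sketch simple; a real pool might instead block the caller or grow up to a maximum size.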
Optionally, there are multiple servers. Obtaining the log file recorded by the server includes: obtaining the log file recorded by server Sj, where j takes the values 1 to n in turn and n is the number of servers. Creating the target file in the distributed database includes: creating sub-target files D1 to Dn in the distributed database, where sub-target files D1 to Dn make up the target file. Establishing the connection pool between the server and the distributed database includes: establishing a connection pool Pj between server Sj and the distributed database. Using the multiple target connections to save the log file into the target file of the distributed database includes: using the target connections obtained from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
The method for saving log files provided by the embodiments of this application can save the log files recorded by different servers at the same time, as follows. Servers S1 to Sn are n different servers, and the log files recorded by these n servers need to be saved into the distributed database. Under the target file in the distributed database, n sub-target files are created, namely sub-target files D1 to Dn. Sub-target file D1 stores the log file of server S1, sub-target file D2 stores the log file of server S2, sub-target file D3 stores the log file of server S3, and so on, up to sub-target file Dn, which stores the log file of server Sn.
A connection pool is established between each server and the distributed database, giving n connection pools in total: connection pool P1, connection pool P2, ..., connection pool Pn. Connection pool P1 is the pool between server S1 and the distributed database, connection pool P2 is the pool between server S2 and the distributed database, connection pool P3 is the pool between server S3 and the distributed database, and so on, up to connection pool Pn, which is the pool between server Sn and the distributed database.
Each of these n connection pools contains at least one connection. Connection pool Pi contains m(i) connections: connection Ci-1, connection Ci-2, connection Ci-3, ..., connection Ci-m(i).
For example, suppose there are 3 different servers, so n is 3. Connection pool P1 is the pool between server S1 and the distributed database and contains 6 connections, that is, m(1)=6: connection C1-1, connection C1-2, connection C1-3, connection C1-4, connection C1-5, and connection C1-6.
Connection pool P2 is the pool between server S2 and the distributed database and contains 3 connections, that is, m(2)=3: connection C2-1, connection C2-2, and connection C2-3.
Connection pool P3 is the pool between server S3 and the distributed database and contains 4 connections, that is, m(3)=4: connection C3-1, connection C3-2, connection C3-3, and connection C3-4.
The 6 connections in connection pool P1 carry the identification information of server S1, the 3 connections in connection pool P2 carry the identification information of server S2, and the 4 connections in connection pool P3 carry the identification information of server S3. The identification information carried by a connection shows which server the connection links to the distributed database, so the log file recorded by that server can be saved into the corresponding sub-target file in the distributed database: the log file recorded by server S1 is saved to sub-target file D1, the log file recorded by server S2 to sub-target file D2, and the log file recorded by server S3 to sub-target file D3.
Saving the log files recorded by different servers into their corresponding sub-target files keeps the saved log files clear and orderly and makes later queries convenient. For example, to query the log file of a particular server, only the sub-target file corresponding to that server needs to be queried rather than the whole target file, which speeds up the query and improves query efficiency.
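The mapping from server Sj to sub-target file Dj can be sketched as below. This is a minimal illustration under assumptions: SubTargetRoutingSketch is a hypothetical class, in-memory lists stand in for the sub-target files, and the server identifier string plays the role of the identification information carried by a connection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Routes each server's log records to its own sub-target file (in-memory stand-in). */
class SubTargetRoutingSketch {
    private final Map<String, List<String>> subTargets = new HashMap<>();

    SubTargetRoutingSketch(int serverCount) {
        // Create sub-target files D1..Dn, one per server.
        for (int j = 1; j <= serverCount; j++) {
            subTargets.put("D" + j, new ArrayList<>());
        }
    }

    /** Maps a server identifier such as "S2" to its sub-target name "D2". */
    static String subTargetFor(String serverId) {
        return "D" + serverId.substring(1);
    }

    private List<String> lookup(String serverId) {
        List<String> target = subTargets.get(subTargetFor(serverId));
        if (target == null) {
            throw new IllegalArgumentException("unknown server: " + serverId);
        }
        return target;
    }

    void save(String serverId, String logRecord) {
        lookup(serverId).add(logRecord);
    }

    /** Queries only the sub-target of one server, not the whole target file. */
    List<String> query(String serverId) {
        return new ArrayList<>(lookup(serverId));
    }

    public static void main(String[] args) {
        SubTargetRoutingSketch store = new SubTargetRoutingSketch(3);
        store.save("S1", "GET /index");
        store.save("S2", "POST /order");
        store.save("S1", "GET /login");
        System.out.println(store.query("S1"));
    }
}
```

A query for one server reads a single list here, which mirrors the point that only the corresponding sub-target file needs to be searched.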
Optionally, using the data transmission resource to save the log file into the target file of the distributed database includes: using the data transmission resource to save multiple records of the log file, in batches and in asynchronous transfer mode, into the target file of the distributed database.
Processing multiple records of the log file asynchronously and in batches greatly speeds up data processing, so the log file is saved to the distributed database more efficiently. For example, the batch size BatchSize can be set to 2000, so that each data interaction processes 2000 records.
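Asynchronous batch saving can be sketched with the standard Java concurrency utilities. This is a sketch under assumptions rather than the patent's implementation: AsyncBatchWriterSketch, partition, and writeAsync are hypothetical names, and the sink callback stands in for a batched write to the distributed database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Splits records into batches and submits each batch to an executor without blocking the caller. */
class AsyncBatchWriterSketch {
    /** Cuts the record list into consecutive batches of at most batchSize records. */
    static List<List<String>> partition(List<String> records, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < records.size(); from += batchSize) {
            int to = Math.min(from + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(from, to)));
        }
        return batches;
    }

    /** Submits every batch asynchronously; the returned future completes when all are written. */
    static CompletableFuture<Void> writeAsync(List<String> records, int batchSize,
                                              ExecutorService executor,
                                              Consumer<List<String>> sink) {
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (List<String> batch : partition(records, batchSize)) {
            pending.add(CompletableFuture.runAsync(() -> sink.accept(batch), executor));
        }
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            records.add("record-" + i);
        }
        ExecutorService executor = Executors.newFixedThreadPool(2);
        writeAsync(records, 2000, executor,
                batch -> System.out.println("wrote batch of " + batch.size()))
                .join();   // wait here only so the demo can exit cleanly
        executor.shutdown();
    }
}
```

The caller gets a future back immediately, so request handling does not wait on the database, and each interaction carries a whole batch rather than a single record.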
Establishing the connection between the server and the distributed database involves implementing two methods:
(1) public void invoke(Request request, Response response), which is responsible for forwarding the request onward;
(2) public void log(Request request, Response response, long time).
The implemented HbaseAccessLogValve is packaged as a jar and placed in the lib directory under the Tomcat directory. The HBase client jar and the jars it depends on are also placed in the lib directory under the Tomcat directory.
Add the following configuration inside the <Host></Host> element of server.xml in Tomcat's conf folder:
    <Valve className="com.gridsum.tomcat.valves.HbaseAccessLogValve"
           zkQuorum="192.168.1.100" zkClientPort="2181" maxConnections="10"
           batchSize="1000" tableName="access_log" columeFamily="logInfo"/>
Where:
maxConnections="10" means that the maximum number of connections is 10.
batchSize="1000" means that the batch size is 1000, so that each data interaction processes 1000 records.
tableName="access_log" means that the table created in HBase is named "access_log".
columeFamily="logInfo" means that the column family of the table created in HBase is named "logInfo".
According to the embodiments of the present invention, a device for saving log files is also provided. The device can perform the method for saving log files described above, and that method can also be implemented by this device.
Fig. 2 is a schematic diagram of the device for saving log files according to an embodiment of this application. As shown in Fig. 2, the device includes a first obtaining unit 22, a creating unit 24, a second obtaining unit 26, and a saving unit 28.
The first obtaining unit 22 is configured to obtain the log file recorded by the server. The log file recorded by the server can be a user access log or a server running log.
The creating unit 24 is configured to create a target file in the distributed database, where the target file is used to store the log file. The distributed database can be HBase, Cassandra, Hypertable, or similar. When the volume of file data to be stored is very large, sharding a database into multiple databases and tables is cumbersome; using the HBase distributed database avoids that sharding and so improves the efficiency of saving data. The target file can be a table.
The second obtaining unit 26 is configured to obtain the data transmission resource between the server and the distributed database.
The saving unit 28 is configured to use the data transmission resource to save the log file into the target file of the distributed database.
HBase is a distributed, column-oriented database. Its biggest difference from a general relational database is that HBase is well suited to storing unstructured data, and that it is column-based rather than row-based.
A Rowkey (row key) is a binary byte stream with a maximum length of 64 KB, and its content can be defined by the user. When data is loaded, it is generally also sorted in ascending binary order of the Rowkey.
HBase retrieves data by Rowkey: the system finds the Region in which a given Rowkey (or Rowkey range) is located and then routes the query request to that Region to obtain the data. HBase retrieval supports three modes:
(1) Access by a single Rowkey, that is, a get operation is performed with a given Rowkey value, which obtains a unique record;
(2) Scan by Rowkey range, that is, startRowKey (start row key) and endRowKey (end row key) are set and the scan is performed within this range, which obtains a batch of records under the specified condition;
(3) Full table scan, that is, all row records in the entire table are scanned directly.
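The three retrieval modes can be illustrated with a minimal in-memory sketch. It does not use the HBase client API: the class SortedRowStore and its methods are hypothetical stand-ins, a TreeMap plays the role of the table sorted by row key, and the half-open range [startRowKey, endRowKey) is an assumption of this sketch. String keys are compared by character rather than by raw bytes, which is enough for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory stand-in for a table whose rows are kept sorted by row key (hypothetical, not the HBase API). */
class SortedRowStore {
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    /** Mode (1): access by a single row key; returns null if the key is absent. */
    String get(String rowKey) {
        return rows.get(rowKey);
    }

    /** Mode (2): range scan over [startRowKey, endRowKey), in ascending key order. */
    List<String> scan(String startRowKey, String endRowKey) {
        SortedMap<String, String> range = rows.subMap(startRowKey, endRowKey);
        return new ArrayList<>(range.values());
    }

    /** Mode (3): full table scan, visiting every row in ascending key order. */
    List<String> scanAll() {
        return new ArrayList<>(rows.values());
    }
}
```

Because the rows are kept sorted, modes (1) and (2) only touch the keys they need, while mode (3) visits every row.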
HBase retrieval by a single Rowkey is very efficient: it takes less than 1 millisecond, and 1000 to 2000 records can be obtained per second.
By designing the Rowkey carefully so that the data of an obtained file is gathered together (it should fall under the same Region), good performance can be obtained when traversing results.
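One possible way to keep the records of one server adjacent in row-key order is sketched below. The key layout (server identifier, then zero-padded timestamp, then zero-padded sequence number) and the class name RowKeys are assumptions made for illustration; the source does not specify a Rowkey layout.

```java
import java.util.Locale;

/** Builds row keys so that all records of one server sort next to each other (hypothetical layout). */
class RowKeys {
    /** Server identifier, then a zero-padded timestamp, then a zero-padded sequence number. */
    static String of(String serverId, long epochMillis, int sequence) {
        return String.format(Locale.ROOT, "%s|%013d|%06d", serverId, epochMillis, sequence);
    }

    /** Smallest possible key prefix for a server: usable as the inclusive start of a range scan. */
    static String startOf(String serverId) {
        return serverId + "|";
    }

    /** First key after every key of the server: usable as the exclusive end of a range scan. */
    static String endOf(String serverId) {
        return serverId + "}";   // '}' is the character immediately after '|'
    }
}
```

With such a layout, a range scan from startOf("S1") to endOf("S1") covers the records of server S1 in time order and excludes servers such as S10 whose identifier merely starts with the same characters.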
By obtaining the log file recorded by the server, creating in the distributed database a target file for storing the log file, obtaining a data transmission resource between the server and the distributed database, and using the data transmission resource to save the log file into the target file of the distributed database, sharding of the database is avoided. The log file recorded by the server can therefore be saved into the distributed database quickly and efficiently, which solves the technical problem in the prior art that the log file of a server cannot be saved quickly and efficiently.
Optionally, the log file saving device provided by the embodiment of the present application further includes a buffering unit and a judging unit. The buffering unit is configured to store the log file in a buffer of the server before the storage unit uses the data transmission resource to save the log file into the target file of the distributed database. The judging unit is configured to judge whether the amount of data buffered in the buffer reaches a preset value. The storage unit includes a first saving subunit, which is configured to, when the judging unit judges that the buffered amount reaches the preset value, save the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
Before the data transmission resource is used to save the log file into the target file of the distributed database, the log file is first stored in a buffer of the server; after the buffered amount reaches the preset value, the data transmission resource is used to save the log file into the target file of the distributed database. The preset value can be set in advance: when the maximum capacity of the buffer is M units, the preset value may be set to 0.5M, 0.6M, 0.7M, 0.8M, or the like. The preset value thus acts as a threshold: once the buffered amount reaches it, the log file in the buffer is stored into the distributed database.
For example, the preset value is set to 60% of the maximum buffer capacity. The log file is stored in the buffer, and after the buffered amount reaches 60% of the maximum capacity, the data transmission resource is used to save the log file into the target file of the distributed database.
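The threshold behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name ThresholdBuffer, measuring the buffer in records, and the Consumer that stands in for saving to the distributed database are all hypothetical and do not come from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers log records and hands them to a sink once a preset fill level is reached (hypothetical names). */
class ThresholdBuffer {
    private final int threshold;                 // preset value, in records
    private final Consumer<List<String>> sink;   // stands in for "save to the distributed database"
    private final List<String> buffer = new ArrayList<>();

    /** maxRecords is the maximum capacity M; percent is the preset value as a percentage of M. */
    ThresholdBuffer(int maxRecords, int percent, Consumer<List<String>> sink) {
        this.threshold = Math.max(1, maxRecords * percent / 100);
        this.sink = sink;
    }

    /** Stores one record and flushes when the buffered amount reaches the preset value. */
    synchronized void add(String record) {
        buffer.add(record);
        if (buffer.size() >= threshold) {
            flush();
        }
    }

    /** Passes a copy of the buffered records to the sink and empties the buffer. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a maximum capacity of 10 records and a preset value of 60%, the sixth record added triggers the flush, so the buffer never fills completely.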
Because the CPU is fast and I/O (input/output) devices are slow, a bottleneck easily arises because of insufficient channels. Using a buffer mitigates the speed mismatch between the CPU and I/O devices and reduces the number of times I/O devices interrupt the CPU, which improves the working efficiency of the CPU. The performance of the server is thereby improved, user access to the server becomes more efficient, and the user experience is improved.
Setting a threshold for the buffered amount allows the log file stored in the buffer to be transmitted to the distributed database in time once the threshold is reached, which prevents data in the log file from being lost because it cannot be saved after the buffer is full.
Optionally, the log file saving device provided by the embodiment of the present application further includes an establishing unit. The establishing unit is configured to establish a connection pool between the server and the distributed database before the second acquisition unit obtains the data transmission resource between the server and the distributed database, where the connection pool contains multiple connections and the data transmission resource is a connection. The first saving subunit includes an acquisition module and a saving module. The acquisition module is configured to obtain multiple target connections from the connection pool when the judging unit judges that the buffered amount reaches the preset value, where a target connection is an idle connection in the connection pool. The saving module is configured to use the multiple target connections obtained by the acquisition module to save the log file into the target file of the distributed database.
A connection pool is established between the server and the distributed database. The connection pool contains multiple connections, and the data transmission resource described above is a connection. When the buffered amount reaches the preset value, idle connections are obtained from the connection pool; the idle connections obtained are the target connections, and the multiple target connections are used to save the log file into the target file of the distributed database.
A connection is a critical, limited and expensive resource. How connections are managed significantly affects the scalability and robustness of the whole application, as well as its performance. The connection pool is responsible for allocating, managing and releasing connections. It allows the application to reuse an existing database connection rather than establish a new one, which avoids wasting resources and improves application performance.
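The allocate, reuse and release cycle of a connection pool can be sketched as below. The class SimplePool is a hypothetical, generic illustration built on the standard java.util.concurrent queue classes; it is not the pool used by the HBase client, and creating all connections up front is an assumption of this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Fixed-size pool that reuses connections instead of creating one per request (hypothetical sketch). */
class SimplePool<C> {
    private final BlockingQueue<C> idle;

    SimplePool(int maxConnections, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(maxConnections);
        for (int i = 0; i < maxConnections; i++) {
            idle.add(factory.get());   // create every connection once, up front
        }
    }

    /** Takes an idle connection, or returns null when no connection is idle. */
    C tryAcquire() {
        return idle.poll();
    }

    /** Returns a connection to the pool so that it can be reused. */
    void release(C connection) {
        idle.offer(connection);
    }

    /** Number of connections currently idle. */
    int idleCount() {
        return idle.size();
    }
}
```

A caller acquires an idle connection, writes its data, and releases the connection afterwards; the number of connections ever created stays at the configured maximum.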
Optionally, there are multiple servers. The first acquisition unit includes a first acquisition subunit, configured to obtain the log file recorded by server Sj, where j takes the values 1 to n in turn and n is the number of servers. The creating unit includes a creating subunit, configured to create sub-target files D1 to Dn in the distributed database, where sub-target files D1 to Dn constitute the target file. The establishing unit includes an establishing subunit, configured to establish a connection pool Pj between server Sj and the distributed database. The saving module includes a saving submodule, configured to use the target connections obtained by the acquisition module from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
The log file saving device provided by the embodiment of the present application can save the log files recorded by different servers at the same time, as described below. Servers S1 to Sn are n different servers, and the log files recorded by these n servers need to be saved into the distributed database. Under the target file in the distributed database, n sub-target files are created, namely sub-target files D1 to Dn. Sub-target file D1 stores the log file of server S1, sub-target file D2 stores the log file of server S2, sub-target file D3 stores the log file of server S3, and so on, up to sub-target file Dn, which stores the log file of server Sn.
One connection pool is established between each server and the distributed database, so n connection pools are established in total, namely connection pool P1, connection pool P2, ..., connection pool Pn. Connection pool P1 is the connection pool between server S1 and the distributed database, connection pool P2 is the connection pool between server S2 and the distributed database, connection pool P3 is the connection pool between server S3 and the distributed database, and so on, up to connection pool Pn, which is the connection pool between server Sn and the distributed database.
Each of these n connection pools contains at least one connection. Connection pool Pi contains m(i) connections, namely connection Ci-1, connection Ci-2, connection Ci-3, ..., connection Ci-m(i).
For example, assume there are 3 different servers in total, so n is 3. Connection pool P1 is the connection pool between server S1 and the distributed database and contains 6 connections, that is, m(1)=6; these 6 connections are connection C1-1, connection C1-2, connection C1-3, connection C1-4, connection C1-5 and connection C1-6.
Connection pool P2 is the connection pool between server S2 and the distributed database and contains 3 connections, that is, m(2)=3; these 3 connections are connection C2-1, connection C2-2 and connection C2-3.
Connection pool P3 is the connection pool between server S3 and the distributed database and contains 4 connections, that is, m(3)=4; these 4 connections are connection C3-1, connection C3-2, connection C3-3 and connection C3-4.
Each of the 6 connections in connection pool P1 carries the identification information of server S1, each of the 3 connections in connection pool P2 carries the identification information of server S2, and each of the 4 connections in connection pool P3 carries the identification information of server S3. From the identification information carried by a connection, it can be determined which server the connection links to the distributed database, and the log file recorded by that server is then saved into the corresponding sub-target file in the distributed database: the log file recorded by server S1 is saved into sub-target file D1, the log file recorded by server S2 is saved into sub-target file D2, and the log file recorded by server S3 is saved into sub-target file D3.
Saving the log files recorded by different servers into their corresponding sub-target files keeps the stored log files clearly organized and makes later queries convenient. For example, when the log file of a particular server needs to be queried, only the sub-target file corresponding to that server is queried, and the whole target file does not need to be searched, which speeds up the query and improves query efficiency.
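The mapping from server Sj to sub-target file Dj can be sketched as follows. The class SubTargetRouter, the in-memory lists that stand in for sub-target files, and the naming convention "S" plus index and "D" plus index are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Routes each server's records to its own sub-target, D1..Dn (in-memory stand-in, hypothetical). */
class SubTargetRouter {
    private final Map<String, List<String>> subTargets = new HashMap<>();

    /** Creates one empty sub-target per server: server "S<j>" maps to sub-target "D<j>". */
    SubTargetRouter(int n) {
        for (int j = 1; j <= n; j++) {
            subTargets.put("D" + j, new ArrayList<>());
        }
    }

    /** Derives the sub-target name from the server identifier carried by the connection. */
    static String subTargetFor(String serverId) {
        return "D" + serverId.substring(1);
    }

    /** Saves a record of the given server into that server's sub-target. */
    void save(String serverId, String record) {
        List<String> target = subTargets.get(subTargetFor(serverId));
        if (target == null) {
            throw new IllegalArgumentException("unknown server " + serverId);
        }
        target.add(record);
    }

    /** Queries only the sub-target of one server instead of the whole target. */
    List<String> query(String serverId) {
        List<String> target = subTargets.get(subTargetFor(serverId));
        return target == null ? new ArrayList<>() : new ArrayList<>(target);
    }
}
```

A query for server S1 reads only D1, which is the reason given above for the faster lookup.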
Optionally, the storage unit further includes a second saving subunit, configured to use the data transmission resource to save multiple records in the log file, in batches and in an asynchronous transmission mode, into the target file of the distributed database.
Processing multiple records in the log file asynchronously in batches can greatly speed up data processing, so that the log file is saved into the distributed database more efficiently. For example, the batch size batchSize may be set to 2000, that is, each data interaction processes 2000 records.
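Asynchronous batch saving can be sketched with the standard java.util.concurrent classes. The class AsyncBatchWriter and the Consumer that stands in for writing one batch to the distributed database are hypothetical; the source does not describe how its valve schedules batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

/** Splits records into batches and writes each batch on a worker thread (hypothetical sketch). */
class AsyncBatchWriter {
    /**
     * Submits batches of at most batchSize records to the executor.
     * The returned future completes when every batch has been handed to the sink.
     */
    static CompletableFuture<Void> writeAll(List<String> records, int batchSize,
                                            ExecutorService executor,
                                            Consumer<List<String>> sink) {
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int from = 0; from < records.size(); from += batchSize) {
            int to = Math.min(from + batchSize, records.size());
            List<String> batch = new ArrayList<>(records.subList(from, to));
            pending.add(CompletableFuture.runAsync(() -> sink.accept(batch), executor));
        }
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }
}
```

The caller is not blocked while batches are written; it can wait on the returned future only when it needs to know that all batches are done. Batches may complete in any order, so the sink must tolerate concurrent calls.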
For the connection established between the server and the distributed database, two methods are implemented:
(1) public void invoke(Request request, Response response), which is responsible for passing the request on; (2) public void log(Request request, Response response, long time).
The implemented HbaseAccessLogValve is packaged as a jar and placed in the lib directory under the Tomcat directory; the HBase client jar and the jars it depends on are also placed in the lib directory under the Tomcat directory.
In server.xml in Tomcat's conf directory, the following configuration is added inside the <Host></Host> tag:
<Valve className="com.gridsum.tomcat.valves.HbaseAccessLogValve" zkQuorum="192.168.1.100" zkClientPort="2181" maxConnections="10" batchSize="1000" tableName="access_log" columeFamily="logInfo"/>
Wherein,
maxConnections="10" indicates that the maximum number of connections is 10.
batchSize="1000" indicates that the batch size is 1000, that is, each data interaction processes 1000 records.
tableName="access_log" indicates that the name of the table created in HBase is "access_log".
columeFamily="logInfo" indicates that the name of the column family of the table created in HBase is "logInfo".
The log file saving device includes a processor and a memory. The first acquisition unit 22, the creating unit 24, the second acquisition unit 26, the storage unit 28 and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels may be provided, and log files are saved by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: obtaining a log file recorded by a server; creating a target file in a distributed database, where the target file is used to store the log file; obtaining a data transmission resource between the server and the distributed database; and using the data transmission resource to save the log file into the target file of the distributed database.
The serial numbers of the above embodiments of the present application are for description only and do not represent the merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For a part not described in detail in one embodiment, reference may be made to the related description of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present application. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A log file saving method, characterized by comprising:
obtaining a log file recorded by a server;
creating a target file in a distributed database, wherein the target file is used to store the log file;
obtaining a data transmission resource between the server and the distributed database; and
using the data transmission resource to save the log file into the target file of the distributed database.
2. The method according to claim 1, characterized in that, before the data transmission resource is used to save the log file into the target file of the distributed database, the method further comprises:
storing the log file in a buffer of the server;
judging whether the amount of data buffered in the buffer reaches a preset value;
and in that using the data transmission resource to save the log file into the target file of the distributed database comprises:
when it is judged that the buffered amount of the buffer reaches the preset value, saving the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
3. The method according to claim 2, characterized in that, before the data transmission resource between the server and the distributed database is obtained, the method further comprises:
establishing a connection pool between the server and the distributed database, wherein the connection pool contains multiple connections and the data transmission resource is a connection;
and in that, when it is judged that the buffered amount of the buffer reaches the preset value, saving the log file stored in the buffer into the target file of the distributed database through the data transmission resource comprises:
when it is judged that the buffered amount of the buffer reaches the preset value, obtaining multiple target connections from the connection pool, wherein a target connection is an idle connection in the connection pool;
using the multiple target connections to save the log file into the target file of the distributed database.
4. The method according to claim 3, characterized in that there are multiple servers,
obtaining the log file recorded by the server comprises: obtaining the log file recorded by server Sj, wherein j takes the values 1 to n in turn and n is the number of servers,
creating the target file in the distributed database comprises: creating sub-target files D1 to Dn in the distributed database, wherein sub-target files D1 to Dn constitute the target file,
establishing the connection pool between the server and the distributed database comprises:
establishing a connection pool Pj between server Sj and the distributed database,
using the multiple target connections to save the log file into the target file of the distributed database comprises:
using the target connections obtained from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
5. The method according to claim 1, characterized in that using the data transmission resource to save the log file into the target file of the distributed database further comprises:
using the data transmission resource to save multiple records in the log file, in batches and in an asynchronous transmission mode, into the target file of the distributed database.
6. A log file saving device, characterized by comprising:
a first acquisition unit, configured to obtain a log file recorded by a server;
a creating unit, configured to create a target file in a distributed database, wherein the target file is used to store the log file;
a second acquisition unit, configured to obtain a data transmission resource between the server and the distributed database; and
a storage unit, configured to use the data transmission resource to save the log file into the target file of the distributed database.
7. The device according to claim 6, characterized in that the device further comprises:
a buffering unit, configured to store the log file in a buffer of the server before the storage unit uses the data transmission resource to save the log file into the target file of the distributed database;
a judging unit, configured to judge whether the amount of data buffered in the buffer reaches a preset value;
and in that the storage unit comprises:
a first saving subunit, configured to, when the judging unit judges that the buffered amount of the buffer reaches the preset value, save the log file stored in the buffer into the target file of the distributed database through the data transmission resource.
8. The device according to claim 7, characterized in that the device further comprises:
an establishing unit, configured to establish a connection pool between the server and the distributed database before the second acquisition unit obtains the data transmission resource between the server and the distributed database, wherein the connection pool contains multiple connections and the data transmission resource is a connection;
and in that the first saving subunit comprises:
an acquisition module, configured to obtain multiple target connections from the connection pool when the judging unit judges that the buffered amount of the buffer reaches the preset value, wherein a target connection is an idle connection in the connection pool;
a saving module, configured to use the multiple target connections obtained by the acquisition module to save the log file into the target file of the distributed database.
9. The device according to claim 8, characterized in that there are multiple servers,
the first acquisition unit comprises: a first acquisition subunit, configured to obtain the log file recorded by server Sj, wherein j takes the values 1 to n in turn and n is the number of servers,
the creating unit comprises: a creating subunit, configured to create sub-target files D1 to Dn in the distributed database, wherein sub-target files D1 to Dn constitute the target file,
the establishing unit comprises:
an establishing subunit, configured to establish a connection pool Pj between server Sj and the distributed database,
the saving module comprises:
a saving submodule, configured to use the target connections obtained by the acquisition module from connection pool Pj to save the log file recorded by server Sj into sub-target file Dj of the distributed database.
10. The device according to claim 6, characterized in that the storage unit further comprises:
a second saving subunit, configured to use the data transmission resource to save multiple records in the log file, in batches and in an asynchronous transmission mode, into the target file of the distributed database.
CN201510812631.8A 2015-11-20 2015-11-20 Log file saving method and device Active CN106776617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510812631.8A CN106776617B (en) 2015-11-20 2015-11-20 Log file saving method and device

Publications (2)

Publication Number Publication Date
CN106776617A true CN106776617A (en) 2017-05-31
CN106776617B CN106776617B (en) 2020-11-06

Family

ID=58886030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510812631.8A Active CN106776617B (en) 2015-11-20 2015-11-20 Log file saving method and device

Country Status (1)

Country Link
CN (1) CN106776617B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107515813A (en) * 2017-09-07 2017-12-26 杭州安恒信息技术有限公司 One kind is based on distributed modularization log processing method, apparatus and system
CN109492045A (en) * 2018-11-22 2019-03-19 郑州云海信息技术有限公司 A kind of log information processing method and system
CN110147411A (en) * 2019-05-20 2019-08-20 平安科技(深圳)有限公司 Method of data synchronization, device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101163265A (en) * 2007-11-20 2008-04-16 中兴通讯股份有限公司 Distributed database based on multimedia message log inquiring method and system
CN101808121A (en) * 2010-02-24 2010-08-18 深圳市五巨科技有限公司 Method and device for writing server log of mobile terminal into database
WO2010107626A3 (en) * 2009-03-16 2011-01-13 Microsoft Corporation Flexible logging, such as for a web server
CN104899278A (en) * 2015-05-29 2015-09-09 北京京东尚科信息技术有限公司 Method and apparatus for generating data operation logs of Hbase database

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107515813A (en) * 2017-09-07 2017-12-26 杭州安恒信息技术有限公司 One kind is based on distributed modularization log processing method, apparatus and system
CN107515813B (en) * 2017-09-07 2021-04-09 杭州安恒信息技术股份有限公司 Distributed modular log processing method, device and system
CN109492045A (en) * 2018-11-22 2019-03-19 郑州云海信息技术有限公司 A kind of log information processing method and system
CN110147411A (en) * 2019-05-20 2019-08-20 平安科技(深圳)有限公司 Method of data synchronization, device, computer equipment and storage medium
CN110147411B (en) * 2019-05-20 2024-05-28 平安科技(深圳)有限公司 Data synchronization method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN106776617B (en) 2020-11-06

Similar Documents

Publication Publication Date Title
US9805079B2 (en) Executing constant time relational queries against structured and semi-structured data
CN104252536B (en) A kind of internet log data query method and device based on hbase
CN104679778B (en) A kind of generation method and device of search result
CN102737057B (en) Determining method and device for goods category information
CN104516979B (en) A kind of data query method and system based on quadratic search
CN104298736B (en) Data acquisition system connection method, device and Database Systems
CN104809182A (en) Method for web crawler URL (uniform resource locator) deduplicating based on DSBF (dynamic splitting Bloom Filter)
CN108255958A (en) Data query method, apparatus and storage medium
CN102971732A (en) System architecture for integrated hierarchical query processing for key/value stores
CN100458784C (en) Researching system and method used in digital labrary
CN106528787A (en) Mass data multi-dimensional analysis-based query method and device
CN103905311A (en) Flow table matching method and device and switch
CN102246172A (en) System and method for distributed index searching of electronic content
CN101944124A (en) Distributed file system management method, device and corresponding file system
CN103345521A (en) Method and device for processing key values in hash table database
CN104407879A (en) A power grid timing sequence large data parallel loading method
CN106202416A (en) Table data write method and device, table data read method and device
CN108235069A (en) The processing method and processing device of Web TV daily record
CN102402586A (en) Distributed data storage method
CN101848248B (en) Rule searching method and device
CN106991102A (en) The processing method and processing system of key-value pair in inverted index
CN104111924A (en) Database system
CN106776617A (en) The store method and device of journal file
CN107798106A (en) A kind of URL De-weight methods in distributed reptile system
CN107330094A (en) The Bloom Filter tree construction and key-value pair storage method of dynamic memory key-value pair

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant