CN107423336A - Data processing method, device and computer-readable storage medium - Google Patents

Data processing method, device and computer-readable storage medium

Info

Publication number
CN107423336A
Authority
CN
China
Prior art keywords
log
log data
index
index file
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710286287.2A
Other languages
Chinese (zh)
Other versions
CN107423336B (en)
Inventor
邹炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nubia Technology Co Ltd
Original Assignee
Nubia Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nubia Technology Co Ltd
Priority to CN201710286287.2A
Publication of CN107423336A
Application granted
Publication of CN107423336B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3096 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents wherein the means or processing minimize the use of computing system or of computing system component resources, e.g. non-intrusive monitoring which minimizes the probe effect: sniffing, intercepting, indirectly deriving the monitored data from other directly available data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1737 Details of further file system functions for reducing power consumption or coping with limited storage space, e.g. in mobile devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Abstract

An embodiment of the present invention provides a data processing method, a device, and a computer-readable storage medium. The method includes: obtaining first log data, the first log data being historical log data that has already been collected; building an index on the first log data to obtain a first index file; performing incremental synchronization using the first index file to obtain a second index file; and collecting second log data according to the second index file, the second log data being the log data currently to be collected. Embodiments of the present invention can effectively reduce the bandwidth occupied during log collection and the impact on business services.

Description

Data processing method, device and computer-readable storage medium
Technical field
The present invention relates to data processing technology in the field of communications, and in particular to a data processing method, a device, and a computer storage medium.
Background
As the internet industry enters the big data era, large enterprises all face the problem of how to collect the various kinds of big data resources effectively. Among the kinds of big data collected, log data is core data: from the system logs it collects, an enterprise can anticipate server risks and analyze failures, and from business logs it can analyze business revenue and plan the follow-up development of its products.
Among the mechanisms for log collection, the agent-collector (Agent-Collector, A-C) pattern is the most common. Specifically, an Agent plug-in is installed on each business server to collect logs; using a specific transport protocol, the Agent periodically transmits the collected logs to the Collector service, which then stores them uniformly in a data warehouse.
However, log collection faces two major problems: the transmission and the storage of massive amounts of data. On the one hand, transmitting large volumes of logs occupies network bandwidth, which wastes bandwidth and degrades business service performance. On the other hand, storing massive amounts of historical log data increases disk usage (and therefore cost) and also affects the storage performance of the data warehouse.
Summary of the invention
In view of this, embodiments of the present invention provide a data processing method, a device, and a computer-readable storage medium that can solve at least the above problems in the prior art.
An embodiment of the present invention provides a data processing method, the method including:
obtaining first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
In the above scheme, the first index file includes N sub-index files and a master index file;
building an index on the first log data to obtain the first index file includes:
listening for an index construction rule;
building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files;
performing same-kind merging on the N sub-index files to obtain the master index file;
where N is greater than or equal to 2.
In the above scheme, building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files, includes:
performing frequency statistics on the first log data according to the index construction rule obtained by listening, to obtain a statistical result;
building classified indexes from the log blocks in the statistical result that satisfy the index construction rule, to obtain the N sub-index files.
In the above scheme, the method further includes:
listening for a load balancing rule;
collecting and storing the collected second log data according to the load balancing rule obtained by listening.
An embodiment of the present invention also provides a data processing device, the data processing device including a processor, a memory, and a communication bus;
the communication bus is used to implement connection and communication between the processor and the memory;
the processor is used to execute a data processing program stored in the memory, to implement the following steps:
obtaining first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
In the above scheme, the first index file includes N sub-index files and a master index file;
the processor is further used to execute a program stored in the memory for building an index on the first log data to obtain the first index file, to implement the following steps:
listening for an index construction rule;
building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files;
performing same-kind merging on the N sub-index files to obtain the master index file;
where N is greater than or equal to 2.
In the above scheme, the processor is further used to execute a program stored in the memory for building an index on the first log data according to the index construction rule obtained by listening to obtain N sub-index files, to implement the following steps:
performing frequency statistics on the first log data according to the index construction rule obtained by listening, to obtain a statistical result;
building classified indexes from the log blocks in the statistical result that satisfy the index construction rule, to obtain the N sub-index files.
In the above scheme, the processor is further used to execute the data processing program stored in the memory, to implement the following steps:
listening for a load balancing rule;
collecting and storing the collected second log data according to the load balancing rule obtained by listening.
An embodiment of the present invention also provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the following steps:
obtaining first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
In the above scheme, the first index file includes N sub-index files and a master index file;
the one or more programs are further executable by the one or more processors to implement the following steps:
listening for an index construction rule;
building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files;
performing same-kind merging on the N sub-index files to obtain the master index file;
where N is greater than or equal to 2.
The data processing method, device, and computer-readable storage medium provided by embodiments of the present invention obtain first log data, the first log data being historical log data that has already been collected; build an index on the first log data to obtain a first index file; perform incremental synchronization using the first index file to obtain a second index file; and collect second log data according to the second index file, the second log data being the log data currently to be collected. In this way, by using dynamic configuration, embodiments of the present invention build an index on the historically collected data and use the generated index files to greatly reduce both the volume of log data transmitted and the volume of historical logs stored. This reduces the bandwidth occupied during log collection and the impact on business services, increases the amount of log information the data warehouse can store, and reduces the consumption of disk hardware resources.
Brief description of the drawings
Fig. 1 is a system architecture diagram for the data processing method of an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the data processing method of an embodiment of the present invention;
Fig. 3 is a schematic flowchart of building an index on the first log data in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the first log data in an application example of the present invention;
Fig. 5 is a schematic diagram of the log data after volume-reduction processing in an application example of the present invention;
Fig. 6 is a schematic diagram of the business flow of the index construction service center in an embodiment of the present invention;
Fig. 7 is a schematic diagram of the business data flow in an embodiment of the present invention;
Fig. 8 is a schematic diagram of the log collection and transmission business flow in an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of the data processing device of an embodiment of the present invention.
Detailed description
It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
Existing open-source and enterprise products already reduce the volume of historical log data with high-ratio compression algorithms. Building on that, and based on the system architecture shown in Fig. 1, embodiments of the present invention use index construction to further reduce the log transmission volume and the historical log storage volume. The system architecture builds a distributed performance monitoring service on the Zookeeper resource coordination service, uses HBase distributed storage as the data warehouse, and uses ES (Elasticsearch) as the master index repository, which provides index management and query services externally. The components are as follows:
Admin central management platform: provides the entry points for configuring index construction and load balancing, and persists the changed, effective configuration to the Zookeeper resource coordination service for use by the log data collection center and the index construction service center. It also provides an external query and display function for historical log data: the displayed data comes from the log data warehouse and, after being restored by the ES master search service, is shown on the Admin central management platform as original log data;
Log data collection center: multiple Collector services form a distributed cluster, which balances load according to the configuration persisted in the Zookeeper resource coordination service. It receives the Agent log data that arrives through the unified gateway and, after the data passes a validity check, stores it in the HBase log data warehouse;
Index construction service center: by listening for the index construction configuration persisted on Zookeeper, it periodically performs operations on the logs already stored in the HBase log data warehouse, such as MR (MapReduce) statistics, Agent sub-index generation, Agent sub-index optimization, and master index merging and generation;
Server hosting the log Agent: a physical server on which the log collection Agent plug-in is installed. Through the Rsync file service, it periodically and incrementally synchronizes the Agent sub-index files belonging to the local server from the index construction service center to the local machine. During log collection, the Agent uses the index files already on the local machine to reduce, in batches, the volume of the log information to be transmitted, for example through index compression and Zlib compression;
ES master search service: incrementally adds index entries from newly generated master index files and provides an index query service externally. In this system architecture, its role is to restore the log data to be displayed for the Admin central management platform.
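The restoration role of the ES master search service can be sketched as follows. This is a minimal illustration and not the patent's implementation: it assumes a hypothetical encoding in which each indexed log block was replaced by its index ID wrapped in the control character `\x01`, and it assumes the master index is available as a plain mapping from index ID to original text. The names `restore_line` and `MARKER` are illustrative only.

```python
import re

# Hypothetical delimiter wrapped around each index ID in a reduced log line.
MARKER = "\x01"
_TOKEN = re.compile(f"{MARKER}([^{MARKER}]+){MARKER}")


def restore_line(reduced: str, master_index: dict[str, str]) -> str:
    """Replace every MARKER-wrapped index ID with the original log block.

    master_index maps index ID -> original text (an assumed layout).
    """
    return _TOKEN.sub(lambda m: master_index[m.group(1)], reduced)
```

A lookup by ID keeps restoration independent of which Agent produced the line, provided the IDs in the master index are unique.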
In an application example of the present invention, reducing the volume of log data in storage and transmission mainly involves the following steps: 1) build the Admin central management system, configure the index construction rule and the load balancing rule on it, and have them take effect and be persisted to the Zookeeper resource coordination service; 2) build the log collection service center, which reads the load balancing rule from the Zookeeper resource coordination service so that each Collector can receive and process Agent log data with high performance; 3) build the index construction service center, which reads the index construction rule from the Zookeeper resource coordination service and periodically (typically once a week) performs operations such as generating, optimizing, and merging Agent sub-indexes on the historical log data in the HBase warehouse; 4) using the restore function of the ES master index service, display the original (readable) log data on the Admin central management system.
Based on the system architecture shown in Fig. 1, the data processing method provided by an embodiment of the present invention, as shown in Fig. 2, includes:
Step S201: obtain first log data, the first log data being historical log data that has already been collected;
Step S202: build an index on the first log data to obtain a first index file;
Step S203: perform incremental synchronization using the first index file to obtain a second index file;
In one embodiment, performing incremental synchronization using the first index file to obtain the second index file includes: performing incremental synchronization using each of the N sub-index files in the first index file to obtain the second index file, where N is greater than or equal to 2. In practice, with reference to the system architecture shown in Fig. 1, a physical server on which the log collection Agent plug-in is installed (for example a news backend server or an XX server) periodically uses the Rsync file service to incrementally synchronize, from the index construction service center to the local machine, the Agent sub-index files belonging to the local server, that is, the N sub-index files included in the first index file. During log collection, the Agent uses the second index file already on the local machine to reduce, in batches, the volume of the log information to be transmitted, for example through index compression and Zlib compression.
It should be added that the incremental synchronization may be implemented as follows: for a local machine such as the news backend server or the XX server, if the machine determines through self-checking that it lacks some or all of the N sub-index files in the first index file, it incrementally synchronizes those missing sub-index files to its own server.
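The self-check described above can be sketched as a set difference over file names. This is only an illustration under stated assumptions: the source uses the Rsync file service, whereas the sketch below stands in a local directory copy for the transfer, and it assumes `center_dir` already contains only the sub-index files that belong to this server. The function names and directory layout are hypothetical.

```python
import shutil
from pathlib import Path


def missing_subindex_files(center_dir: Path, local_dir: Path) -> list[str]:
    """Names of sub-index files present at the center but absent locally."""
    center = {p.name for p in center_dir.iterdir() if p.is_file()}
    local = {p.name for p in local_dir.iterdir() if p.is_file()}
    return sorted(center - local)


def incremental_sync(center_dir: Path, local_dir: Path) -> list[str]:
    """Copy only the missing sub-index files and return their names.

    A stand-in for the Rsync transfer; files already present are not touched.
    """
    copied = missing_subindex_files(center_dir, local_dir)
    for name in copied:
        shutil.copy2(center_dir / name, local_dir / name)
    return copied
```

Running `incremental_sync` a second time with no new files at the center copies nothing, which is the property that keeps the synchronization traffic small.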
Step S204: collect second log data according to the second index file, the second log data being the log data currently to be collected.
Here, the first index file includes N sub-index files and a master index file, where N is greater than or equal to 2.
When an embodiment of the present invention performs step S202, the specific operation flow, as shown in Fig. 3, includes:
Step 2021: listen for the index construction rule;
Step 2022: build an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files;
Step 2023: perform same-kind merging on the N sub-index files to obtain the master index file.
In one example, for the first log data shown in Fig. 4, the fields of the first log data are described in Table 1 below:
Table 1
Field          Value
IP             110.25.78.191
Time           2017-03-15 11:05:08
Package name   com.zoe.salar.task.cat.job
Class name     CountJob
Method name    excute
Main message   this method cost time 35ms
The index construction rule obtained by listening in this example is: the number of occurrences of the historical log data in the HBase data warehouse exceeds 100, counting both full matches and fuzzy matches. Based on this index construction rule, the generated index correspondence is shown in Table 2 below:
Table 2
It should be noted here that a purely numeric index is exclusive to one Agent and is bound to that Agent, whereas an index prefixed with the letter c is a general index that is not bound to any Agent and may span the sub-index files of multiple Agents.
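The naming convention for index IDs can be illustrated with a small classifier. This is a sketch under an assumption the source does not state: that a general index ID consists of the letter c followed by digits. The function name and the example IDs in the usage note are hypothetical.

```python
def classify_index_id(index_id: str) -> str:
    """Classify an index ID by the naming convention described above.

    Purely numeric IDs are Agent-exclusive; IDs with a leading 'c' are general.
    Assumes (not confirmed by the source) that the part after 'c' is numeric.
    """
    if index_id.isdigit():
        return "exclusive"
    if index_id.startswith("c") and index_id[1:].isdigit():
        return "general"
    raise ValueError(f"unrecognized index id: {index_id!r}")
```

For instance, an ID such as "17" would be treated as bound to a single Agent, while "c3" would be treated as shared across Agents.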
The log data after the Agent performs volume-reduction processing using the corresponding sub-index file, that is, the first index file, is shown in Fig. 5.
When an embodiment of the present invention performs step 2022, the specific operation flow includes:
performing frequency statistics on the first log data according to the index construction rule obtained by listening, to obtain a statistical result;
building classified indexes from the log blocks in the statistical result that satisfy the index construction rule, to obtain the N sub-index files.
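The two operations above (frequency statistics, then classified index construction) can be sketched for a single Agent as follows. This is a simplified illustration, not the patent's MapReduce job: it counts exact field values only and does not attempt the fuzzy matching mentioned in the example rule, and the record layout, function name, and sequential ID assignment are assumptions made for the sketch.

```python
from collections import Counter


def build_subindex(records, min_count=100):
    """Build a per-field sub-index from historical log records.

    records: iterable of dicts mapping field name -> value (assumed layout).
    A (field, value) pair is indexed only if it occurs more than min_count
    times, mirroring the "more than 100 occurrences" example rule.
    Returns {field: {value: index_id}} with sequential numeric string IDs.
    """
    counts = Counter()
    for record in records:
        for field, value in record.items():
            counts[(field, value)] += 1

    subindex = {}
    next_id = 1
    # Sorting makes ID assignment deterministic for a given input.
    for (field, value), n in sorted(counts.items()):
        if n > min_count:
            subindex.setdefault(field, {})[value] = str(next_id)
            next_id += 1
    return subindex
```

Keeping a separate mapping per field is one way to ensure that indexes for different log fields do not interfere with one another.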
In another example, based on the system architecture shown in Fig. 1, the business flow of the index construction service center is shown in Fig. 6:
Step S601: read the index construction rule from Zookeeper;
Step S602: according to the index construction rule, run MapReduce statistics on the historical log data of each Agent;
Step S603: build classified indexes from the log blocks in the statistical result that satisfy the index rule;
the indexes for the six classes of log fields in Table 1 above do not interfere with one another;
Step S604: after all Agent sub-indexes have been generated, perform same-kind merging on all sub-indexes to reduce the total file size of the sub-indexes;
Step S605: merge all sub-index files into the master index file; use the master index file to reduce the volume of the log data in the HBase data warehouse; then send the master index file to the ES master search service so that it can incrementally supplement its index.
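One possible reading of the same-kind merging in steps S604 and S605 is sketched below: a log block that appears in the sub-indexes of two or more Agents is promoted to a single general entry with a c-prefixed ID, and the remaining entries stay Agent-exclusive. This interpretation, the master index layout, and the function name are assumptions for illustration; the source does not specify the merge algorithm.

```python
from collections import defaultdict


def merge_subindexes(subindexes):
    """Merge per-Agent sub-indexes into one master index.

    subindexes: {agent_id: {log_block: exclusive_id}} (assumed layout).
    Returns {"general": {log_block: "cN"},
             "exclusive": {agent_id: {log_block: exclusive_id}}}.
    """
    owners = defaultdict(set)
    for agent, entries in subindexes.items():
        for block in entries:
            owners[block].add(agent)

    general = {}
    exclusive = {agent: {} for agent in subindexes}
    for block in sorted(owners):
        if len(owners[block]) >= 2:
            # Shared by several Agents: store once under a general ID.
            general[block] = f"c{len(general) + 1}"
        else:
            (agent,) = owners[block]
            exclusive[agent][block] = subindexes[agent][block]
    return {"general": general, "exclusive": exclusive}
```

Storing a shared block once instead of once per Agent is what would shrink the total size of the sub-index files under this reading.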
It should be added that the specific business data flow in the above example is shown in Fig. 7.
In one embodiment, the data processing method of the embodiment of the present invention may further include the following steps:
listening for a load balancing rule;
collecting and storing the collected second log data according to the load balancing rule obtained by listening.
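The source does not say what a load balancing rule contains. Purely as an illustration, the sketch below assumes the rule is a mapping from Collector name to an integer weight and distributes incoming batches by weighted round-robin; the rule format, the Collector names in the usage note, and the function name are all hypothetical.

```python
import itertools


def collector_cycle(weights):
    """Return an endless iterator over Collector names.

    weights: {collector_name: positive integer weight} (assumed rule format).
    Each Collector appears in every round in proportion to its weight.
    """
    expanded = [name for name, w in sorted(weights.items()) for _ in range(w)]
    if not expanded:
        raise ValueError("at least one collector with a positive weight is required")
    return itertools.cycle(expanded)
```

With weights such as `{"collector-a": 2, "collector-b": 1}`, two of every three batches would go to collector-a. When a new rule is observed, the iterator would simply be rebuilt from the new weights.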
Based on the above data processing method and the system architecture shown in Fig. 1, the log collection and transmission business flow on the Agent side is shown in Fig. 8 and specifically includes:
Step S801: the Agent collects, in real time, the log data on the physical server where it is located;
Step S802: use the index mapping synchronized to the local machine to perform volume-reduction processing on the log data, reducing the data volume of the log information;
Step S803: further compress the data using Zlib, then transmit it through the agreed protocol to the unified entry gateway, where it is finally received and stored by the Collector service.
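Steps S802 and S803 can be sketched as index substitution followed by Zlib compression. This is a minimal illustration under stated assumptions: the local index is taken to be a mapping from log block text to index ID, and each replaced block is written as its ID wrapped in the control character `\x01`, a hypothetical encoding chosen so the substitution can be reversed later. Only `zlib.compress` is a real library call; the other names are illustrative.

```python
import re
import zlib

# Hypothetical delimiter wrapped around each index ID in a reduced log line.
MARKER = "\x01"


def reduce_line(line: str, index: dict[str, str]) -> str:
    """Replace each indexed log block in the line with MARKER + id + MARKER."""
    if not index:
        return line
    # Longest blocks first, matched in a single pass so that inserted IDs
    # are never themselves rewritten.
    blocks = sorted(index, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(b) for b in blocks))
    return pattern.sub(lambda m: f"{MARKER}{index[m.group(0)]}{MARKER}", line)


def pack_batch(lines: list[str], index: dict[str, str]) -> bytes:
    """Volume-reduce a batch of log lines, then compress it with Zlib."""
    reduced = "\n".join(reduce_line(line, index) for line in lines)
    return zlib.compress(reduced.encode("utf-8"))
```

The receiving side would call `zlib.decompress` and then map the IDs back to text using the corresponding index.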
The data processing method of embodiments of the present invention can, on the one hand, effectively reduce the volume of log transmission. Specifically, the index files distributed to each business server are used to reduce the volume of the log data temporarily cached on that server; the data is then further compressed with the Zlib compression algorithm, so that the log data to be transmitted is compressed to a minimum without changing the log information. On the other hand, the method increases the capacity of the data warehouse to store log information. Specifically, the log information transmitted back has already been volume-reduced and compressed, so it occupies less space in the data warehouse than the unprocessed data would. In addition, each time a master index has been built, the construction center service can use the master index file to reduce the volume of the historical log data in the data warehouse, further freeing warehouse space. Combining these two kinds of volume-reduction and compression processing, the amount of log information the data warehouse can store increases substantially.
An embodiment of the present invention also provides a data processing device. As shown in Fig. 9, the data processing device includes a processor 901, a memory 902, and a communication bus 903;
the communication bus 903 is used to implement connection and communication between the processor 901 and the memory 902;
the processor 901 is used to execute a data processing program stored in the memory 902, to implement the following steps:
obtaining first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
In practice, the processor 901 that executes the data processing program stored in the memory 902 may be implemented by sub-processors distributed respectively in the Admin central management platform, the log data collection center, the index construction service center, the server hosting the log Agent, and the ES master search service of the system architecture shown in Fig. 1. For the specific implementation and division of labor, refer to the description of the foregoing method embodiment; details are not repeated here.
In one embodiment, the first index file includes N sub-index files and a master index file;
the processor 901 is further used to execute a program stored in the memory 902 for building an index on the first log data to obtain the first index file, to implement the following steps:
listening for an index construction rule;
building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files;
performing same-kind merging on the N sub-index files to obtain the master index file;
where N is greater than or equal to 2.
In one embodiment, the processor 901 is further used to execute a program stored in the memory 902 for building an index on the first log data according to the index construction rule obtained by listening to obtain N sub-index files, to implement the following steps:
performing frequency statistics on the first log data according to the index construction rule obtained by listening, to obtain a statistical result;
building classified indexes from the log blocks in the statistical result that satisfy the index construction rule, to obtain the N sub-index files.
In one embodiment, the processor 901 is further used to execute a program stored in the memory 902 for performing incremental synchronization using the first index file to obtain the second index file, to implement the following steps:
performing incremental synchronization using each of the N sub-index files in the first index file to obtain the second index file, where N is greater than or equal to 2.
In one embodiment, the processor 901 is further used to execute the data processing program stored in the memory 902, to implement the following steps:
listening for a load balancing rule;
collecting and storing the collected second log data according to the load balancing rule obtained by listening.
An embodiment of the present invention further provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the following steps:
obtaining first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
In one embodiment, the first index file includes N sub-index files and a master index file;
the one or more programs are further executable by the one or more processors to implement the following steps: listening for an index construction rule; building an index on the first log data according to the index construction rule obtained by listening, to obtain N sub-index files; performing same-kind merging on the N sub-index files to obtain the master index file; where N is greater than or equal to 2.
In one embodiment, the one or more programs are further executable by the one or more processors to implement the following steps: performing frequency statistics on the first log data according to the index construction rule obtained by listening, to obtain a statistical result; building classified indexes from the log blocks in the statistical result that satisfy the index construction rule, to obtain the N sub-index files.
In one embodiment, the one or more programs are further executable by the one or more processors to implement the following steps: performing incremental synchronization using each of the N sub-index files in the first index file, to obtain a second index file that includes N new sub-index files, where N is greater than or equal to 2.
In one embodiment, the one or more programs are further executable by the one or more processors to implement the following steps: listening for a load balancing rule; collecting and storing the collected second log data according to the load balancing rule obtained by listening.
It should be noted that, herein, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments of the present invention are described for illustration only and do not indicate the relative merits of the embodiments.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the specific embodiments described above, which are merely illustrative rather than restrictive. Under the teaching of the present invention, a person of ordinary skill in the art may devise many other forms without departing from the spirit of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (10)

1. A data processing method, characterized in that the method comprises:
acquiring first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
2. The method according to claim 1, characterized in that the first index file comprises N sub-index files and a master index file;
the building an index on the first log data to obtain a first index file comprises:
monitoring an index construction rule;
building an index on the first log data according to the monitored index construction rule to obtain N sub-index files;
performing same-category merge processing on the N sub-index files to obtain the master index file;
wherein N is greater than or equal to 2.
3. The method according to claim 2, characterized in that the building an index on the first log data according to the monitored index construction rule to obtain N sub-index files comprises:
performing frequency statistics on the first log data according to the monitored index construction rule to obtain a statistical result;
classifying the log blocks in the statistical result that satisfy the index construction rule and building an index for each class to obtain the N sub-index files.
4. The method according to claim 1, characterized in that the method further comprises:
monitoring a load balancing rule;
collecting the second log data and storing the collected second log data according to the monitored load balancing rule.
5. A data processing device, characterized in that the data processing device comprises a processor, a memory, and a communication bus;
the communication bus is configured to implement connection and communication between the processor and the memory;
the processor is configured to execute a data processing program stored in the memory to implement the following steps:
acquiring first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
6. The data processing device according to claim 5, characterized in that the first index file comprises N sub-index files and a master index file;
the processor is further configured to execute a program stored in the memory for building an index on the first log data to obtain the first index file, to implement the following steps:
monitoring an index construction rule;
building an index on the first log data according to the monitored index construction rule to obtain N sub-index files;
performing same-category merge processing on the N sub-index files to obtain the master index file;
wherein N is greater than or equal to 2.
7. The data processing device according to claim 6, characterized in that the processor is further configured to execute a program stored in the memory for building an index on the first log data according to the monitored index construction rule to obtain N sub-index files, to implement the following steps:
performing frequency statistics on the first log data according to the monitored index construction rule to obtain a statistical result;
classifying the log blocks in the statistical result that satisfy the index construction rule and building an index for each class to obtain the N sub-index files.
8. The data processing device according to claim 5, characterized in that the processor is further configured to execute the data processing program stored in the memory to implement the following steps:
monitoring a load balancing rule;
collecting the second log data and storing the collected second log data according to the monitored load balancing rule.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, the one or more programs being executable by one or more processors to implement the following steps:
acquiring first log data, the first log data being historical log data that has already been collected;
building an index on the first log data to obtain a first index file;
performing incremental synchronization using the first index file to obtain a second index file;
collecting second log data according to the second index file, the second log data being the log data currently to be collected.
10. The computer-readable storage medium according to claim 9, characterized in that the first index file comprises N sub-index files and a master index file;
the one or more programs are further executable by the one or more processors to implement the following steps:
monitoring an index construction rule;
building an index on the first log data according to the monitored index construction rule to obtain N sub-index files;
performing same-category merge processing on the N sub-index files to obtain the master index file;
wherein N is greater than or equal to 2.
CN201710286287.2A 2017-04-27 2017-04-27 Data processing method and device and computer storage medium Active CN107423336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710286287.2A CN107423336B (en) 2017-04-27 2017-04-27 Data processing method and device and computer storage medium

Publications (2)

Publication Number Publication Date
CN107423336A true CN107423336A (en) 2017-12-01
CN107423336B CN107423336B (en) 2021-01-15

Family

ID=60424367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710286287.2A Active CN107423336B (en) 2017-04-27 2017-04-27 Data processing method and device and computer storage medium

Country Status (1)

Country Link
CN (1) CN107423336B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109788030A (en) * 2018-12-17 2019-05-21 北京百度网讯科技有限公司 Unmanned vehicle data processing method, device, system and storage medium
CN110442559A (en) * 2019-07-05 2019-11-12 深圳中兴网信科技有限公司 Log searching method, apparatus and server
CN110990366A (en) * 2019-12-04 2020-04-10 中国农业银行股份有限公司 Index allocation method and device for improving performance of log system based on ES
CN111506646A (en) * 2020-03-16 2020-08-07 阿里巴巴集团控股有限公司 Data synchronization method, device, system, storage medium and processor

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100036894A1 (en) * 2008-08-05 2010-02-11 Senda Riro Data synchronization method, data synchronization program, database server and database system
CN101887417A (en) * 2009-05-13 2010-11-17 上海即略网络信息科技有限公司 Searching method
CN102129435A (en) * 2010-01-13 2011-07-20 中国移动通信集团公司 Data storage service control method and system
CN102750326A (en) * 2012-05-30 2012-10-24 浪潮电子信息产业股份有限公司 Log management optimization method of cluster system based on downsizing strategy
CN104281506A (en) * 2014-07-10 2015-01-14 中国科学院计算技术研究所 Data maintenance method and system for file system
CN104731796A (en) * 2013-12-19 2015-06-24 北京思博途信息技术有限公司 Data storage computing method and system
CN105138592A (en) * 2015-07-31 2015-12-09 武汉虹信技术服务有限责任公司 Distributed framework-based log data storing and retrieving method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100036894A1 (en) * 2008-08-05 2010-02-11 Senda Riro Data synchronization method, data synchronization program, database server and database system
CN101887417A (en) * 2009-05-13 2010-11-17 上海即略网络信息科技有限公司 Searching method
CN102129435A (en) * 2010-01-13 2011-07-20 中国移动通信集团公司 Data storage service control method and system
CN102750326A (en) * 2012-05-30 2012-10-24 浪潮电子信息产业股份有限公司 Log management optimization method of cluster system based on downsizing strategy
CN104731796A (en) * 2013-12-19 2015-06-24 北京思博途信息技术有限公司 Data storage computing method and system
CN104281506A (en) * 2014-07-10 2015-01-14 中国科学院计算技术研究所 Data maintenance method and system for file system
CN105138592A (en) * 2015-07-31 2015-12-09 武汉虹信技术服务有限责任公司 Distributed framework-based log data storing and retrieving method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
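Fang Shiwei: "Research on HBase-based construction of a medical and health data center and heterogeneous database synchronization", China Master's Theses Full-text Database, Information Science and Technology *
Yang Zhi'an: "Application of database technology in the comprehensive water resources planning of the Haihe River Basin", Haihe Water Resources *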

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109788030A (en) * 2018-12-17 2019-05-21 北京百度网讯科技有限公司 Unmanned vehicle data processing method, device, system and storage medium
US11616840B2 (en) 2018-12-17 2023-03-28 Apollo Intelligent Driving Technology (Beijing) Co., Ltd. Method, apparatus and system for processing unmanned vehicle data, and storage medium
CN110442559A (en) * 2019-07-05 2019-11-12 深圳中兴网信科技有限公司 Log searching method, apparatus and server
CN110990366A (en) * 2019-12-04 2020-04-10 中国农业银行股份有限公司 Index allocation method and device for improving performance of log system based on ES
CN110990366B (en) * 2019-12-04 2024-02-23 中国农业银行股份有限公司 Index allocation method and device for improving performance of ES-based log system
CN111506646A (en) * 2020-03-16 2020-08-07 阿里巴巴集团控股有限公司 Data synchronization method, device, system, storage medium and processor
CN111506646B (en) * 2020-03-16 2023-05-02 阿里巴巴集团控股有限公司 Data synchronization method, device, system, storage medium and processor

Also Published As

Publication number Publication date
CN107423336B (en) 2021-01-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant