CN107918621A - Daily record data processing method, device and operation system - Google Patents


Info

Publication number
CN107918621A
Authority
CN
China
Prior art keywords
daily record
record data
log
log source
shunting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610884695.3A
Other languages
Chinese (zh)
Inventor
楼江航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610884695.3A
Publication of CN107918621A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/11: File system administration, e.g. details of archiving or snapshots
    • G06F16/128: Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/1805: Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815: Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application provides a log data processing method, device, and business system. The log data processing method includes: obtaining pending log data; performing shunting (splitting) processing on the pending log data according to a shunting identifier to obtain the log data of each log source, where each value of the shunting identifier corresponds to one log source; and performing the corresponding business processing on the log data of each log source separately. The application allows businesses that rely on log data to run with greater parallelism, which improves the throughput of the log system.

Description

Log data processing method, device and business system
Technical field
This application relates to the field of database technology, and in particular to a log data processing method, a log data processing device, and a business system.
Background technology
Logging is a fairly broad concept in the computer field, and any application system may produce log data. Log data is essentially a record of events or behaviors stored in chronological order, and it generally includes the event subject, the time of occurrence, the event content, and so on. A log system stores the log data produced by all log sources together, in chronological order.
As business requirements have evolved, many business scenarios have come to depend on log data, such as tracking the running state of a log source, locating abnormal points in a log source, building a search engine, and updating stale local caches. Each business system that relies on log data is referred to as a log processing system. A log processing system must read the log data produced by the corresponding log source system sequentially, in chronological order, before it can carry out its log-based business.
In existing log system implementations, each log processing system has to read the log data produced by the corresponding log source system sequentially, in chronological order, when carrying out its business. This limits the parallelism of the log processing system and severely constrains the throughput of the log system.
Summary of the invention
This application provides a log data processing method, device, and business system that increase the parallelism of businesses relying on log data and improve the throughput of the log system.
To achieve this purpose, the embodiments of this application adopt the following technical solutions:
In a first aspect, a log data processing method is provided, including:
obtaining pending log data;
performing shunting processing on the pending log data according to a shunting identifier to obtain the log data of each log source, where each value of the shunting identifier corresponds to one log source; and
performing the corresponding business processing on the log data of each log source separately.
In a second aspect, a log data processing device is provided, including:
an acquisition module, configured to obtain pending log data;
a shunting module, configured to perform shunting processing on the pending log data according to a shunting identifier to obtain the log data of each log source, where each value of the shunting identifier corresponds to one log source; and
a business module, configured to perform the corresponding business processing on the log data of each log source separately.
In a third aspect, a business system based on log data is provided, including: an e-commerce platform, a log system, a log data processing device, a search engine, and a database corresponding to the search engine;
the e-commerce platform is configured to produce log data and store the log data in the log system;
the search engine is configured to subscribe to a log incremental-change service from the log data processing device;
the log data processing device is configured to: read newly added log data from the log system as pending log data, according to the log incremental-change service subscribed to by the search engine; perform shunting processing on the pending log data according to a shunting identifier to obtain the log data corresponding to each log source; and start multiple threads that, in parallel, obtain the corresponding data from the log data of each log source and update it into the database corresponding to the search engine.
In a fourth aspect, a business system based on log data is provided, including: an e-commerce platform, a log system, a log data processing device, and a buyer terminal;
the e-commerce platform is configured to produce log data and store the log data in the log system;
the buyer terminal is configured to subscribe to a log incremental-change service from the log data processing device;
the log data processing device is configured to: read newly added log data from the log system as pending log data, according to the log incremental-change service subscribed to by the buyer terminal; perform shunting processing on the pending log data according to a shunting identifier to obtain the log data corresponding to each log source; and start multiple threads that, in parallel, obtain the corresponding data from the log data of each log source and update it into the local cache of the buyer terminal.
In this application, before business processing is performed on the basis of log data, the obtained pending log data is first shunted according to a shunting identifier to obtain the log data of each log source, with the values of the shunting identifier in one-to-one correspondence with the log sources. The corresponding business processing is then performed on the log data of each log source. Because the shunting identifier makes it possible to split log data by log source, log processing systems whose business depends on log data from different log sources can run in parallel. This increases the parallelism of the log processing system and greatly improves the throughput of the log system.
The above is only an overview of the technical solution of this application. So that the technical means of the application can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the application are easier to understand, specific embodiments of the application are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of this application. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
Fig. 1 is a flow diagram of the log data processing method provided by one embodiment of this application;
Fig. 2 is a schematic diagram of the log data shunting process provided by another embodiment of this application;
Fig. 3 is a schematic diagram of log data compression based on ring queues provided by yet another embodiment of this application;
Fig. 4a is a schematic diagram of log data regrouping based on the type of database operation provided by yet another embodiment of this application;
Fig. 4b is a structural diagram of a business system based on log data provided by yet another embodiment of this application;
Fig. 4c is a structural diagram of a business system based on log data provided by yet another embodiment of this application;
Fig. 5 is a structural diagram of a log data processing device provided by yet another embodiment of this application;
Fig. 6 is a structural diagram of a log data processing device provided by yet another embodiment of this application.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
In the prior art, a log system stores the log data produced by all log sources together, in chronological order. With such a log system, a log processing system that carries out business based on log data has to read the log data produced by the corresponding log source system sequentially, in chronological order. This limits the parallelism of the log processing system and severely constrains the throughput of the log system.
To address this problem, this application provides a solution whose main principle is as follows: before business processing is performed on the basis of log data, the obtained pending log data is first shunted according to a shunting identifier to obtain the log data of each log source, and the corresponding business processing is then performed on the log data of each log source. Because the shunting identifier makes it possible to split log data by log source, log processing systems whose business depends on log data from different log sources can run in parallel, which increases the parallelism of the log processing system and greatly improves the throughput of the log system.
The technical solution of this application is described in detail below through specific embodiments.
Fig. 1 is a flow diagram of the log data processing method provided by one embodiment of this application. As shown in Fig. 1, the method includes:
101. Obtain pending log data.
102. Perform shunting processing on the pending log data according to a shunting identifier to obtain the log data of each log source, where the values of the shunting identifier are in one-to-one correspondence with the log sources.
103. Perform the corresponding business processing on the log data of each log source separately.
This embodiment provides a log data processing method that can be performed by a log data processing device. It is used to carry out business based on log data, to increase the parallelism of log processing systems whose business relies on log data, and to improve the throughput of the log system.
When business needs to be carried out based on log data, the log data processing device reads log data from the log system. Depending on business requirements, it may read from the log system continuously. Log data is read in the chronological order in which it is stored in the log system, from earliest to latest.
It is worth noting that the log system may be designed at the database level or at the table level. A database-level design means that the log data produced by a log source is stored in chronological order in one database; correspondingly, a table-level design means that the log data produced by all log sources is stored in chronological order in one table of a database. This embodiment does not limit how the log system is implemented; the method provided here applies to a log system implemented in any form.
For ease of distinction and description, the log data read from the log system is called pending log data. The log data in the log system is the log data produced by each log source, stored one record after another in chronological order. In this embodiment the concept of a log source is broad: besides a system or application that produces log data, a log source can also be a business whose business field or primary key has a specific value.
To improve the concurrency of business based on log data and the throughput of the log system, the pending log data is shunted according to the shunting identifier once it has been obtained, yielding the log data of each log source. This keeps each log queue as small as possible and creates the conditions for higher business concurrency. In this embodiment, log sources correspond one-to-one with values of the shunting identifier; in other words, each value, or each group of values, of the shunting identifier corresponds to one log source. For example, if the shunting identifier is the order ID, each value of the order ID corresponds to one log source, and log data with the same order ID is assigned to the same log source.
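The shunting step can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout (dictionaries with `order_id`, `event`, and `ts` keys) and the function name `shunt` are hypothetical and do not come from the application.

```python
from collections import defaultdict


def shunt(pending_logs, shunt_field):
    """Group time-ordered log records by the value of the shunting identifier.

    Each distinct value becomes one log source; order within a source is preserved.
    """
    queues = defaultdict(list)
    for record in pending_logs:
        queues[record[shunt_field]].append(record)
    return dict(queues)


# Hypothetical pending log data, already in chronological order.
pending = [
    {"order_id": 1, "event": "place_order", "ts": 100},
    {"order_id": 2, "event": "place_order", "ts": 101},
    {"order_id": 1, "event": "pay", "ts": 102},
]
by_source = shunt(pending, "order_id")
```

Because the input is read in chronological order and records are appended as they arrive, each per-source list keeps the original time order.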
In an optional embodiment, the shunting identifier needs to be determined before the pending log data is shunted. Optionally, the unique primary key of the log data can be preset as the shunting identifier. Because the primary key of the log data uniquely identifies one piece of business data, the log data can be shunted on that primary key. This approach suits most general business scenarios.
In some special application scenarios, log data may have multiple primary keys. In these scenarios, the business processing system can specify, according to its processing needs, which primary key serves as the shunting identifier. The primary key specified by the business processing is then used as the shunting identifier.
The primary key of the log data may be a business ID, but is not limited to this. With the optional embodiments above, the pending log data can be shunted according to a primary key.
Further, if the log data in the log system has no primary key, the business processing can specify a combination of one or more non-primary-key fields as the shunting identifier. The pending log data can then be shunted according to the specified non-primary-key fields.
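The three ways of choosing a shunting identifier described above can be sketched as one selection function. Everything here is an assumption made for illustration: the function name `make_shunt_key`, its parameters, and the field names are hypothetical.

```python
def make_shunt_key(primary_keys, business_choice=None, fallback_fields=()):
    """Return a function that maps a log record to its shunting identifier value.

    - one primary key and no explicit choice: use that primary key
    - several primary keys: business processing must name one of them
    - no primary key: use a combination of business-specified non-primary-key fields
    """
    if len(primary_keys) == 1 and business_choice is None:
        fields = (primary_keys[0],)
    elif primary_keys:
        if business_choice not in primary_keys:
            raise ValueError("business processing must pick one of the primary keys")
        fields = (business_choice,)
    elif fallback_fields:
        fields = tuple(fallback_fields)
    else:
        raise ValueError("no primary key and no fallback fields were specified")
    # The identifier is always a tuple so single and combined keys behave alike.
    return lambda record: tuple(record[name] for name in fields)


record = {"order_id": 7, "buyer_id": "b1", "sku": "s9", "event": "pay"}
by_pk = make_shunt_key(["order_id"])
by_choice = make_shunt_key(["order_id", "buyer_id"], business_choice="buyer_id")
by_combo = make_shunt_key([], fallback_fields=["buyer_id", "sku"])
```

Returning a tuple in every case means the same grouping code can be used whether the identifier is a single key or a combination of fields.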
Take the order log in the e-commerce field as an example. The business that produces one order is treated as one log source; the order log includes log data for placing the order, paying, and so on; and the log system stores the log data of many orders. For an order log, the log data for placing the order, paying, and so on under the same transaction ID can be shunted by transaction ID into the log queue corresponding to that transaction ID. Fig. 2 shows one such shunting principle, using the order log as the example.
The shunting processing yields one log queue per log source. Each log queue preserves ordering, guaranteeing first-in, first-out (FIFO) handling of log data. Optionally, the log queues of this embodiment can be implemented with a ring buffer (RingBuffer) mechanism, but are not limited to this. RingBuffer is an open-source tool; its implementation can be found in the prior art and is not described here.
After the log data of each log source is obtained, business processing can be performed on the log data of each log source. Log processing systems whose business depends on log data from different log sources can then run in parallel, which increases the parallelism of the log processing system and greatly improves the throughput of the log system.
In an optional embodiment, to improve the efficiency of log data processing and further increase the throughput of the log system, the log data of each log source can be compressed before business processing is performed, yielding compressed log data for each log source. The corresponding business processing is then performed on the compressed log data of each log source separately.
Further, in an optional embodiment, a ring queue is allocated to each log source. A ring queue has a read position and a write position, and has a certain size. Optionally, the ring queue can be designed as a memory space of fixed size, but is not limited to this. In this embodiment, the ring queue controls the compression of log data, specifically the amount of log data that is compressed at a time. Fig. 3 is a schematic diagram of log data compression based on ring queues. Specifically, compressing the log data of each log source based on ring queues includes:
For the log data of each log source, writing the log data of that log source, in chronological order, into the ring queue corresponding to the log source; and, when that ring queue meets the configured read condition, reading the log data from the ring queue and compressing the log data that was read to obtain the compressed log data of the log source.
It is worth noting that the read conditions of different ring queues may be the same or different. The read condition of a ring queue can be that its write position coincides with its read position, that its timeout has elapsed, or a combination of the two.
For example, when the write position of a ring queue catches up with the read position, the ring queue is full. This triggers reading the data from the ring queue and compressing the log data that was read.
As another example, each ring queue is given a timeout, such as 500 ms. If the log data in the ring queue has not been compressed for 500 ms, data is forcibly read from the ring queue and the log data that was read is compressed.
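A minimal sketch of such a ring queue, with both read conditions, might look as follows. It is not the application's implementation: the class name `RingQueue`, the explicit `count` field used to tell a full queue from an empty one, and the injectable `clock` parameter are all assumptions made so the example is self-contained and testable.

```python
import time


class RingQueue:
    """Fixed-size ring queue with a read position and a write position."""

    def __init__(self, capacity, timeout_s=0.5, clock=time.monotonic):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.read_pos = 0
        self.write_pos = 0
        self.count = 0              # distinguishes "full" from "empty" when positions coincide
        self.timeout_s = timeout_s  # e.g. 0.5 s, matching the 500 ms example
        self.clock = clock
        self.last_drain = clock()

    def put(self, record):
        if self.count == self.capacity:
            raise OverflowError("ring queue is full; drain it before writing")
        self.slots[self.write_pos] = record
        self.write_pos = (self.write_pos + 1) % self.capacity
        self.count += 1

    def should_read(self):
        # Condition 1: the write position has caught up with the read position (queue full).
        full = self.count == self.capacity
        # Condition 2: pending data has waited longer than the timeout.
        timed_out = self.count > 0 and self.clock() - self.last_drain >= self.timeout_s
        return full or timed_out

    def drain(self):
        """Read everything currently queued, in FIFO order, ready for compression."""
        batch = []
        while self.count:
            batch.append(self.slots[self.read_pos])
            self.read_pos = (self.read_pos + 1) % self.capacity
            self.count -= 1
        self.last_drain = self.clock()
        return batch
```

A caller would write each record with `put`, check `should_read` afterwards (and periodically, for the timeout case), and pass the result of `drain` to the compression step.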
In an optional embodiment, the log data of each log source can be compressed according to the type of database operation, provided the business meaning is preserved.
Taking data manipulation language (DML) as an example, the database operations involved are mainly insert, update, and delete.
On this basis, compressing the log data read from the ring queue includes at least one of the following:
Among the log data read from the ring queue, log records that respectively describe an insert operation and an update operation on the same field are compressed into one log record describing an insert operation on that field. In short: insert+update->insert;
Among the log data read from the ring queue, log records that respectively describe an insert operation and a delete operation on the same field are compressed into one log record describing a delete operation on that field. In short: insert+delete->delete;
Among the log data read from the ring queue, log records that respectively describe different update operations on the same field are compressed into one log record describing an update operation on that field. In short: update+update->update;
Among the log data read from the ring queue, log records that respectively describe an update operation and a delete operation on the same field are compressed into one log record describing a delete operation on that field. In short: update+delete->delete.
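The four rules above can be written as a small lookup table and applied to a time-ordered list of operation types for one field. This is a sketch only; the names `FOLD` and `fold_ops` are hypothetical, and pairs not covered by the four rules (for example a delete followed by an insert) are deliberately left uncompressed.

```python
# The four compression rules, keyed by (earlier operation, later operation).
FOLD = {
    ("insert", "update"): "insert",
    ("insert", "delete"): "delete",
    ("update", "update"): "update",
    ("update", "delete"): "delete",
}


def fold_ops(ops):
    """Fold a time-ordered list of operation types on one field using the rules above."""
    folded = []
    for op in ops:
        if folded and (folded[-1], op) in FOLD:
            folded[-1] = FOLD[(folded[-1], op)]  # merge with the previous operation
        else:
            folded.append(op)                    # no rule applies; keep both
    return folded
```

For instance, `fold_ops(["insert", "update", "update"])` collapses to a single insert, while a sequence with no applicable rule is returned unchanged.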
Examples follow.
Example 1:
Suppose the log data read from the ring queue includes one log record describing the insertion of 2 fields, namely:
Insert 2 fields: field A (content va1), field B (content vb1);
The log data also includes one log record describing the update of 1 field, namely:
Update 1 field: field B (content vb2);
Because both log records operate on field B, the two records can be compressed into one log record, namely:
Insert 2 fields: field A (content va1), field B (content vb2).
Example 2:
Suppose the log data read from the ring queue includes one log record describing the update of 3 fields, namely:
Update 3 fields: field A (content va1), field B (content vb1), field C (content vc1);
The log data also includes one log record describing the update of 2 fields, namely:
Update 2 fields: field C (content vc2), field D (content vd2);
Because both log records operate on field C, the two records can be compressed into one log record, namely:
Update 4 fields: field A (content va1), field B (content vb1), field C (content vc2), field D (content vd2).
Example 3:
Suppose the log data read from the ring queue includes one log record describing the update of 2 fields, namely:
Update 2 fields: field A (content va1), field B (content vb1); fields A and B are fields under primary key pk1;
The log data also includes one log record describing the deletion of fields, namely:
Delete fields: the fields under primary key pk1;
Because both log records operate on fields A and B, the two records can be compressed into one log record, namely:
Delete fields: field A and field B.
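The three examples can be reproduced with a short sketch that merges two consecutive log records of the same log source. The record shape (`{"op": ..., "fields": {...}}`), the name `compress_pair`, and the choice to represent a delete as field names with `None` values are assumptions for illustration, not details given in the application.

```python
RULES = {
    ("insert", "update"): "insert",
    ("insert", "delete"): "delete",
    ("update", "update"): "update",
    ("update", "delete"): "delete",
}


def compress_pair(earlier, later):
    """Merge two consecutive log records of one log source into a single record."""
    op = RULES.get((earlier["op"], later["op"]))
    if op is None:
        raise ValueError("no compression rule for this pair of operations")
    merged = {**earlier["fields"], **later["fields"]}  # later values overwrite earlier ones
    if op == "delete":
        merged = {name: None for name in merged}       # a delete keeps field names only
    return {"op": op, "fields": merged}


example1 = compress_pair(
    {"op": "insert", "fields": {"A": "va1", "B": "vb1"}},
    {"op": "update", "fields": {"B": "vb2"}},
)
example2 = compress_pair(
    {"op": "update", "fields": {"A": "va1", "B": "vb1", "C": "vc1"}},
    {"op": "update", "fields": {"C": "vc2", "D": "vd2"}},
)
example3 = compress_pair(
    {"op": "update", "fields": {"A": "va1", "B": "vb1"}},
    {"op": "delete", "fields": {}},  # delete of the row under primary key pk1
)
```

In each case the later record's values win, which matches the outcome of the examples: the insert in Example 1 ends with field B holding vb2, and the update in Example 2 ends with field C holding vc2.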
It is worth noting that the compression process can be performed once or repeatedly: the compressed log data is fed back in as input and compressed again until a preset requirement is met. The preset requirement can be that the number of compressed log records falls below a specified number, that the number of compression passes reaches a preset count, and so on.
As the examples above show, the compression process yields the compressed log data of each log source. The compressed log data of each log source is smaller than the log data before compression, which helps improve log data processing efficiency and parallelism.
Further, once the compressed log data of each log source has been obtained, the corresponding business processing can be performed on the compressed log data of each log source separately.
In an optional embodiment, the compression processing can compress the multiple log records of each log source into a single compressed log record. This produces a log queue made up of the single compressed log records of multiple log sources, and within that queue the records of different log sources can be processed in any order. One of the simplest implementations is to start an independent thread for the compressed log data of each log source and have that thread perform the corresponding business processing, achieving parallel processing of log-based business.
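The "one independent thread per log source" idea can be sketched with Python's standard `concurrent.futures` module. A thread pool is used here instead of literally one thread per source, which is a substitution made for the sketch; `handle_source` is a hypothetical placeholder for the real business processing.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_source(source_id, records):
    """Placeholder business processing: records of one source are handled in order."""
    return source_id, [record["event"] for record in records]


def process_in_parallel(by_source, max_workers=4):
    """Process each log source on its own worker; different sources run concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(handle_source, source_id, records)
            for source_id, records in by_source.items()
        ]
        return dict(future.result() for future in futures)


# Hypothetical per-source log data, e.g. the output of a shunting step.
by_source = {
    1: [{"event": "place_order"}, {"event": "pay"}],
    2: [{"event": "place_order"}],
}
results = process_in_parallel(by_source)
```

Ordering is only guaranteed within a source, since each source is handled by a single call; across sources the work may complete in any order.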
In another optional embodiment, the compressed log data of each log source can also be regrouped according to the type of database operation, with independent threads started for the regrouped log data to perform the corresponding business processing, so that the regrouped data is processed in parallel. Fig. 4a is a schematic diagram of log data regrouping based on the type of database operation. Fig. 4a shows the log data queue for insert operations, the log data queue for update operations, and the log data queue for delete operations. The insert queue contains the log records that describe insert operations; the update queue contains the log records that describe update operations; and the delete queue contains the log records that describe delete operations.
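Regrouping by operation type amounts to bucketing the compressed records into three queues. The sketch below assumes a hypothetical record shape with an `op` key; the function name `regroup_by_operation` is likewise an assumption.

```python
def regroup_by_operation(compressed_records):
    """Place each compressed log record into the queue for its operation type."""
    queues = {"insert": [], "update": [], "delete": []}
    for record in compressed_records:
        queues[record["op"]].append(record)
    return queues


# Hypothetical compressed records, one per log source.
compressed = [
    {"source": "order-1", "op": "insert"},
    {"source": "order-2", "op": "delete"},
    {"source": "order-3", "op": "insert"},
]
queues = regroup_by_operation(compressed)
```

Each of the three queues could then be handed to its own worker thread, so that all inserts, all updates, and all deletes are applied as separate parallel batches.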
Take building a search engine from log data as an example. The log data processing device can be the log processing system in this business scenario, or a functional module within that system. As shown in Fig. 4b, the business system based on log data includes: an e-commerce platform, a log system, a log data processing device, a search engine, and the database corresponding to the search engine. The e-commerce platform produces log data and stores it in the log system. To provide better search to network users, the search engine can subscribe to the log incremental-change service from the log data processing device. According to that subscription, the log data processing device reads newly added log data from the log system as pending log data; shunts the pending log data according to the shunting identifier to obtain the log data corresponding to each log source; and starts multiple threads that, in parallel, obtain the corresponding data from the log data of each log source and add it to the database corresponding to the search engine.
The following describes, for the e-commerce field, how the log data processing device builds a search engine from log data:
Suppose a seller publishes a commodity on the e-commerce platform. The platform produces log data recording that the user published the commodity and stores it in the log system. The search engine subscribes to the log incremental-change service from the log data processing device. When the search engine needs to be built from commodity data, the log data processing device reads the newly added log data from the log system as pending log data, uses the commodity ID as the shunting identifier to shunt the pending log data into the log data corresponding to each commodity ID, and then starts multiple threads that, in parallel, obtain the details of each commodity from the log data of its commodity ID and add them to the database corresponding to the search engine. Buyers can then find the seller's newly published commodities through the search engine. Because the log data of the individual commodities can be processed in parallel, the search engine is built more efficiently.
Taking the update of a stale local cache from log data as an example, the log data processing device may be the log processing system in this business scenario, or a functional module within that system. As shown in Fig. 4c, the business system based on log data includes: an e-commerce platform, a log system, a log data processing device, and the local cache of a buyer terminal. The e-commerce platform generates log data and stores it in the log system. So that buyers see the latest item information in time, the buyer terminal can subscribe to the incremental log change service of the log data processing device. According to the incremental log change service subscribed to by the buyer terminal, the log data processing device reads newly added log data from the log system as the log data to be processed; shunts the log data to be processed according to the shunting identifier, to obtain the log data corresponding to each log source; and starts multiple threads that, in parallel, obtain the corresponding data from the log data of each log source and update it into the local cache of the buyer terminal. It is worth noting that there may be multiple buyer terminals, and the multiple threads may update the local caches of different buyer terminals.
Taking the e-commerce field as an example, the process by which the log data processing device updates a local cache from log data is described as follows:
Suppose a seller updates an item already published on the e-commerce platform, for example by updating the item's picture information. The e-commerce platform generates log data recording that the user updated the item's picture information and stores that log data in the log system. The buyer terminal subscribes to the incremental log change service of the log data processing device, so that the information of items the buyer has browsed or bought (which is stored in the local cache of the buyer terminal) can be updated in time. When the local cache of the buyer terminal needs to be updated, the log data processing device reads the newly added log data from the log system as the log data to be processed; using the item ID as the shunting identifier, it shunts the log data to be processed to obtain the picture-update log data corresponding to each item ID; it then starts multiple threads that, in parallel, obtain the new picture information of each item from the picture-update log data of the corresponding item ID and use it to replace the item's picture information in the local cache of the buyer terminal (the picture information in the local cache being the old picture information). This helps the buyer see the item's new picture information in time. Because the log data of different items can be processed in parallel, the efficiency of updating the local cache is improved.
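As a rough illustration of this cache update, the sketch below assumes the log data has already been shunted by item ID and that each buyer terminal's local cache is a dictionary keyed by item ID. The names `latest_picture`, `refresh_cache`, and `update_caches`, and the record layout, are hypothetical and not taken from the application.

```python
from concurrent.futures import ThreadPoolExecutor


def latest_picture(item_logs):
    """Return the picture named by the last update, logs being in time order."""
    return item_logs[-1]["picture"]


def refresh_cache(cache, item_id, picture):
    """Replace the old picture only if the item is already cached locally."""
    if item_id in cache:
        cache[item_id]["picture"] = picture


def update_caches(logs_by_item, caches, workers=4):
    """Start one task per item ID; each task refreshes every buyer's cache."""
    def work(item_id, item_logs):
        picture = latest_picture(item_logs)
        for cache in caches:
            refresh_cache(cache, item_id, picture)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(work, item_id, logs)
                   for item_id, logs in logs_by_item.items()]
        for future in futures:
            future.result()  # re-raise any worker exception


# Hypothetical picture-update logs, already grouped by item ID.
logs_by_item = {
    "A": [{"picture": "a_v1.jpg"}, {"picture": "a_v2.jpg"}],
    "B": [{"picture": "b_v2.jpg"}],
}
buyer1 = {"A": {"picture": "a_old.jpg"}}
buyer2 = {"A": {"picture": "a_old.jpg"}, "B": {"picture": "b_old.jpg"}}
update_caches(logs_by_item, [buyer1, buyer2])
```

Items that a buyer never browsed or bought are absent from that buyer's cache and are skipped, matching the description that only already-cached item information is refreshed.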
As the above analysis shows, by shunting log data, the embodiments of the present application allow the services that rely on the log data to be executed concurrently, which helps improve the throughput of the log system. Further, compressing the log data of each log source reduces the amount of log data, which improves the efficiency of log data processing and, in turn, the execution efficiency of each service. Further still, processing the compressed log data in parallel, or regrouping it and then processing it in parallel, can further increase the concurrency of the services and the throughput of the log system.
Fig. 5 is a structural diagram of the log data processing device provided by another embodiment of the present application. As shown in Fig. 5, the device includes: an acquisition module 51, a shunting module 52, and a business module 53.
The acquisition module 51 is configured to obtain the log data to be processed.
The shunting module 52 is configured to shunt the log data to be processed according to the shunting identifier, to obtain the log data of each log source, where the values of the shunting identifier correspond one-to-one with the log sources.
The business module 53 is configured to carry out the corresponding business processing separately according to the log data of each log source.
In this embodiment, the concept of a log source is broad: besides a system or application that generates log data, a log source can also be a business identified by a service field or primary key with a specific value.
The shunting identifier of each log source can be the primary key used by the log system for storage; for example, the primary key can be a business ID, but it is not limited to this. On this basis, the log data to be processed can be shunted according to the primary key.
Further, if the log system has no primary key, the identifier of one or more service fields can be specified as the shunting identifier of the log source. On this basis, the log data to be processed can be shunted according to the specified service fields.
Taking order logs in the e-commerce field as an example, the business of producing one order is one log source. The order log includes log data for placing the order, paying for it, and so on, and the log system stores the log data of many orders. The transaction ID can be used as the shunting identifier of the order business.
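The choice of shunting identifier described above can be sketched as follows. This is a minimal sketch under assumptions: log records are modelled as dictionaries, and the helper names `make_key_fn` and `shunt`, as well as the field names `transaction_id`, `buyer`, and `item`, are hypothetical.

```python
from collections import defaultdict


def make_key_fn(primary_key=None, fields=()):
    """Build a function that extracts the shunting identifier from a record.

    If the log system has a primary key, use it; otherwise fall back to the
    combination of one or more specified service fields.
    """
    if primary_key is not None:
        return lambda rec: rec[primary_key]
    if not fields:
        raise ValueError("specify a primary key or at least one service field")
    return lambda rec: tuple(rec[name] for name in fields)


def shunt(records, key_fn):
    """Split records so that each identifier value maps to one log source."""
    by_source = defaultdict(list)
    for rec in records:
        by_source[key_fn(rec)].append(rec)
    return dict(by_source)


# Hypothetical order logs; the transaction ID identifies the order business.
order_logs = [
    {"transaction_id": 1, "event": "place_order"},
    {"transaction_id": 2, "event": "place_order"},
    {"transaction_id": 1, "event": "pay"},
]
by_order = shunt(order_logs, make_key_fn(primary_key="transaction_id"))
```

Each value of the identifier selects exactly one group, which is the one-to-one correspondence between identifier values and log sources, and the time order of records within a group is preserved.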
In an optional embodiment, as shown in Fig. 6, one implementation structure of the business module 53 includes: a compression unit 531 and a business unit 532.
The compression unit 531 is configured to compress the log data of each log source, to obtain the compressed log data of each log source.
The business unit 532 is configured to carry out the corresponding business processing separately according to the compressed log data of each log source.
In an optional embodiment, the compression unit 531 is specifically configured to:
For the log data of each log source, write the log data of that log source, in chronological order, into the ring queue corresponding to that log source;
When the ring queue corresponding to the log source meets a set read condition, read the log data in that ring queue and compress the log data read this time, to obtain the compressed log data of the log source.
Optionally, the read condition includes at least one of the following:
The write position of the ring queue coincides with the read position;
The timeout corresponding to the ring queue has expired.
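A per-log-source ring queue with these two read conditions might look like the sketch below. It is an illustration under assumptions rather than the application's implementation: the class name `RingQueue`, its methods, and the choice to measure the timeout from the oldest unread record are all hypothetical. The clock is injectable so the timeout can be exercised without sleeping.

```python
import time


class RingQueue:
    """Fixed-size circular buffer holding one log source's records in time order."""

    def __init__(self, capacity, timeout_s, clock=time.monotonic):
        self._buf = [None] * capacity
        self._read = 0          # next slot to read
        self._write = 0         # next slot to write
        self._count = 0         # number of unread records
        self._timeout_s = timeout_s
        self._clock = clock
        self._oldest = None     # time the oldest unread record was written

    def put(self, record):
        if self._count == len(self._buf):
            raise OverflowError("queue is full; drain it before writing")
        if self._count == 0:
            self._oldest = self._clock()
        self._buf[self._write] = record
        self._write = (self._write + 1) % len(self._buf)
        self._count += 1

    def ready(self):
        """True if either read condition holds."""
        if self._count == 0:
            return False
        # Condition 1: the write position has wrapped onto the read position.
        full = self._write == self._read
        # Condition 2 (assumed semantics): the oldest unread record has waited
        # at least the timeout.
        timed_out = self._clock() - self._oldest >= self._timeout_s
        return full or timed_out

    def drain(self):
        """Read all unread records in the order they were written."""
        out = []
        while self._count:
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % len(self._buf)
            self._count -= 1
        self._oldest = None
        return out
```

The batch returned by `drain()` is what would then be handed to the compression step; the timeout keeps a slow log source from holding records indefinitely while its queue fills.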
Further, the compression unit 531 is specifically configured to perform at least one of the following compression operations:
Among the log data read from the ring queue, compress the log data that respectively describe an insert operation and an update operation on the same field into one log record describing an insert operation on that field; abbreviated as: insert+update->insert;
Among the log data read from the ring queue, compress the log data that respectively describe an insert operation and a delete operation on the same field into one log record describing a delete operation on that field; abbreviated as: insert+delete->delete;
Among the log data read from the ring queue, compress the log data that respectively describe different update operations on the same field into one log record describing an update operation on that field; abbreviated as: update+update->update;
Among the log data read from the ring queue, compress the log data that respectively describe an update operation and a delete operation on the same field into one log record describing a delete operation on that field; abbreviated as: update+delete->delete.
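The four compression rules can be expressed as a small lookup table, as in the following sketch. The record layout (`op`, `field`, `value`) and the function name `compress` are assumptions for illustration. The text does not say which value the merged record carries, so the sketch assumes it takes the most recent value; operation pairs not covered by the four rules are left uncompressed.

```python
# (earlier op, later op) -> op of the single merged record, per the four rules.
MERGE = {
    ("insert", "update"): "insert",
    ("insert", "delete"): "delete",
    ("update", "update"): "update",
    ("update", "delete"): "delete",
}


def compress(records):
    """Compress one log source's records, which must be in time order."""
    out = []
    last = {}  # field -> index in `out` of the latest record for that field
    for rec in records:
        idx = last.get(rec["field"])
        merged_op = None
        if idx is not None:
            merged_op = MERGE.get((out[idx]["op"], rec["op"]))
        if merged_op is None:
            # First record for this field, or a pair outside the four rules:
            # keep the record as-is.
            last[rec["field"]] = len(out)
            out.append(dict(rec))
        else:
            # Assumption: the merged record carries the latest value, and a
            # delete carries no value.
            value = None if merged_op == "delete" else rec.get("value")
            out[idx] = {"op": merged_op, "field": rec["field"], "value": value}
    return out


# Hypothetical batch read from one log source's ring queue.
batch = [
    {"op": "insert", "field": "title", "value": "Kettle"},
    {"op": "update", "field": "title", "value": "Steel kettle"},
    {"op": "insert", "field": "price", "value": 30},
    {"op": "delete", "field": "price"},
    {"op": "update", "field": "stock", "value": 5},
    {"op": "update", "field": "stock", "value": 4},
]
compressed = compress(batch)
```

Here six records collapse to three, one per field, which is the reduction in log volume that the compression step is meant to deliver.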
In an optional embodiment, the business unit 532 is specifically configured to:
Regroup the compressed log data of the log sources according to the type of database operation, and start a separate thread for each regrouped set of log data to carry out the corresponding business processing; or
Start a separate thread for the compressed log data of each log source to carry out the corresponding business processing.
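The first option, regrouping by database operation type before starting threads, can be sketched as follows. The names `regroup_by_op`, `apply_batch`, and `process` are hypothetical, and `apply_batch` is only a placeholder for whatever business processing a real system would perform (for example, issuing one bulk statement per operation type).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def regroup_by_op(compressed_by_source):
    """Regroup compressed records from all log sources by operation type."""
    by_op = defaultdict(list)
    for source, records in compressed_by_source.items():
        for rec in records:
            by_op[rec["op"]].append((source, rec))
    return dict(by_op)


def apply_batch(op, batch):
    """Placeholder business processing: report how many records were handled."""
    return op, len(batch)


def process(compressed_by_source):
    """Start one worker per operation type and collect the results."""
    by_op = regroup_by_op(compressed_by_source)
    with ThreadPoolExecutor(max_workers=max(1, len(by_op))) as pool:
        futures = [pool.submit(apply_batch, op, batch)
                   for op, batch in by_op.items()]
        return dict(future.result() for future in futures)


# Hypothetical compressed log data for two log sources.
compressed_by_source = {
    "A": [{"op": "insert", "field": "title"}, {"op": "delete", "field": "price"}],
    "B": [{"op": "insert", "field": "title"}],
}
handled = process(compressed_by_source)
```

The second option is the same pattern with one worker per log source instead of one per operation type, so `regroup_by_op` would simply be skipped.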
Before carrying out business processing based on log data, the log data processing device provided in this embodiment first shunts the obtained log data to be processed according to the shunting identifier of each log source, thereby obtaining the log data of each log source, and then carries out the corresponding business processing according to the log data of each log source. Because the log data is shunted by log source, the services that the log processing system needs to carry out on the log data generated by different log sources can be executed in parallel, which increases the degree of parallelism of the log processing system and greatly improves the throughput of the log system.
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments can be completed by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

  1. A log data processing method, characterized by comprising:
    Obtaining log data to be processed;
    Shunting the log data to be processed according to a shunting identifier, to obtain the log data of each log source, wherein the values of the shunting identifier correspond one-to-one with the log sources;
    Carrying out corresponding business processing separately according to the log data of each log source.
  2. The method according to claim 1, characterized in that carrying out corresponding business processing separately according to the log data of each log source comprises:
    Compressing the log data of each log source, to obtain the compressed log data of each log source;
    Carrying out corresponding business processing separately according to the compressed log data of each log source.
  3. The method according to claim 2, characterized in that compressing the log data of each log source, to obtain the compressed log data of each log source, comprises:
    For the log data of each log source, writing the log data of the log source, in chronological order, into the ring queue corresponding to the log source;
    When the ring queue corresponding to the log source meets a set read condition, reading the log data in the ring queue corresponding to the log source, and compressing the log data read, to obtain the compressed log data of the log source.
  4. The method according to claim 3, characterized in that the read condition comprises at least one of the following:
    The write position of the ring queue coincides with the read position;
    The timeout corresponding to the ring queue has expired.
  5. The method according to claim 3, characterized in that compressing the log data read, to obtain the compressed log data of the log source, comprises performing at least one of the following compression operations:
    Compressing the log data, among the log data read, that respectively describe an insert operation and an update operation on the same field into one log record describing an insert operation on the same field;
    Compressing the log data, among the log data read, that respectively describe an insert operation and a delete operation on the same field into one log record describing a delete operation on the same field;
    Compressing the log data, among the log data read, that respectively describe different update operations on the same field into one log record describing an update operation on the same field;
    Compressing the log data, among the log data read, that respectively describe an update operation and a delete operation on the same field into one log record describing a delete operation on the same field.
  6. The method according to any one of claims 2 to 5, characterized in that carrying out corresponding business processing separately according to the compressed log data of each log source comprises:
    Regrouping the compressed log data of the log sources according to the type of database operation, and starting a separate thread for each regrouped set of log data to carry out the corresponding business processing; or
    Starting a separate thread for the compressed log data of each log source to carry out the corresponding business processing.
  7. The method according to any one of claims 2 to 5, characterized in that, before shunting the log data to be processed according to the shunting identifier to obtain the log data of each log source, the method comprises:
    Presetting the unique primary key of the log data as the shunting identifier; or
    Using the primary key specified by the business processing as the shunting identifier; or
    Using the combined identifier of the non-primary-key fields specified by the business processing as the shunting identifier.
  8. A log data processing device, characterized by comprising:
    An acquisition module, configured to obtain log data to be processed;
    A shunting module, configured to shunt the log data to be processed according to a shunting identifier, to obtain the log data of each log source, wherein the values of the shunting identifier correspond one-to-one with the log sources;
    A business module, configured to carry out corresponding business processing separately according to the log data of each log source.
  9. A business system based on log data, characterized by comprising: an e-commerce platform, a log system, a log data processing device, a search engine, and a database corresponding to the search engine;
    The e-commerce platform is configured to generate log data and store the log data in the log system;
    The search engine is configured to subscribe to an incremental log change service of the log data processing device;
    The log data processing device is configured to: according to the incremental log change service subscribed to by the search engine, read newly added log data from the log system as log data to be processed; shunt the log data to be processed according to a shunting identifier, to obtain the log data corresponding to each log source; and start multiple threads that, in parallel, obtain corresponding data from the log data corresponding to each log source and update it into the database corresponding to the search engine.
  10. A business system based on log data, characterized by comprising: an e-commerce platform, a log system, a log data processing device, and a buyer terminal;
    The e-commerce platform is configured to generate log data and store the log data in the log system;
    The buyer terminal is configured to subscribe to an incremental log change service of the log data processing device;
    The log data processing device is configured to: according to the incremental log change service subscribed to by the buyer terminal, read newly added log data from the log system as log data to be processed; shunt the log data to be processed according to a shunting identifier, to obtain the log data corresponding to each log source; and start multiple threads that, in parallel, obtain corresponding data from the log data corresponding to each log source and update it into the local cache of the buyer terminal.
CN201610884695.3A 2016-10-10 2016-10-10 Daily record data processing method, device and operation system Pending CN107918621A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610884695.3A CN107918621A (en) 2016-10-10 2016-10-10 Daily record data processing method, device and operation system

Publications (1)

Publication Number Publication Date
CN107918621A true CN107918621A (en) 2018-04-17

Family

ID=61892424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610884695.3A Pending CN107918621A (en) 2016-10-10 2016-10-10 Daily record data processing method, device and operation system

Country Status (1)

Country Link
CN (1) CN107918621A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8407335B1 (en) * 2008-06-18 2013-03-26 Alert Logic, Inc. Log message archiving and processing using a remote internet infrastructure
US20110040811A1 (en) * 2009-08-17 2011-02-17 International Business Machines Corporation Distributed file system logging
CN101719149A (en) * 2009-12-03 2010-06-02 联动优势科技有限公司 Data synchronization method and device
CN102156720A (en) * 2011-03-28 2011-08-17 中国人民解放军国防科学技术大学 Method, device and system for restoring data
CN103778136A (en) * 2012-10-19 2014-05-07 阿里巴巴集团控股有限公司 Cross-room database synchronization method and system
CN103023693A (en) * 2012-11-27 2013-04-03 北京小米科技有限责任公司 Behaviour log data management system and behaviour log data management method
CN103744906A (en) * 2013-12-26 2014-04-23 乐视网信息技术(北京)股份有限公司 System, method and device for data synchronization
CN105740344A (en) * 2016-01-25 2016-07-06 中国科学院计算技术研究所 Sql statement combination method and system independent of database

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RUSSELL SEARS ET AL.: "bLSM: A General Purpose Log Structured Merge Tree", 《PROCEEDINGS OF THE ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA》 *
王春凯 等: "分布式数据流关系查询技术研究", 《计算机学报》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797158A (en) * 2019-04-08 2020-10-20 北京沃东天骏信息技术有限公司 Data synchronization system, method and computer-readable storage medium
CN111797158B (en) * 2019-04-08 2024-04-05 北京沃东天骏信息技术有限公司 Data synchronization system, method and computer readable storage medium
CN110362544B (en) * 2019-05-27 2024-04-02 中国平安人寿保险股份有限公司 Log processing system, log processing method, terminal and storage medium
CN110362544A (en) * 2019-05-27 2019-10-22 中国平安人寿保险股份有限公司 Log processing system, log processing method, terminal and storage medium
CN110457181A (en) * 2019-08-02 2019-11-15 武汉达梦数据库有限公司 A kind of the log method for optimization analysis and device of database
CN110457181B (en) * 2019-08-02 2023-05-16 武汉达梦数据库股份有限公司 Log optimization analysis method and device for database
CN111552575A (en) * 2019-12-31 2020-08-18 远景智能国际私人投资有限公司 Message queue-based message consumption method, device and equipment
CN111552575B (en) * 2019-12-31 2023-09-12 远景智能国际私人投资有限公司 Message consumption method, device and equipment based on message queue
CN111241078A (en) * 2020-01-07 2020-06-05 网易(杭州)网络有限公司 Data analysis system, data analysis method and device
CN111241078B (en) * 2020-01-07 2024-06-21 网易(杭州)网络有限公司 Data analysis system, data analysis method and device
CN111737203A (en) * 2020-06-09 2020-10-02 阿里巴巴集团控股有限公司 Database history log backtracking method, device, system, equipment and storage medium
CN111680009A (en) * 2020-06-10 2020-09-18 苏州跃盟信息科技有限公司 Log processing method and device, storage medium and processor
CN111680009B (en) * 2020-06-10 2023-10-03 苏州跃盟信息科技有限公司 Log processing method, device, storage medium and processor
CN112256658B (en) * 2020-10-16 2023-08-18 海尔优家智能科技(北京)有限公司 Log record distribution method and device, storage medium and electronic device
CN112256658A (en) * 2020-10-16 2021-01-22 海尔优家智能科技(北京)有限公司 Log record shunting method and device, storage medium and electronic device
CN113760885A (en) * 2020-10-23 2021-12-07 北京沃东天骏信息技术有限公司 Incremental log processing method and device, electronic equipment and storage medium
CN112671756A (en) * 2020-12-21 2021-04-16 北京明略昭辉科技有限公司 Method and device for filtering abnormal traffic
CN115934043A (en) * 2023-01-04 2023-04-07 广州佰瑞医药有限公司 PHP-based high-efficiency MVC framework
CN115934043B (en) * 2023-01-04 2024-03-15 广州佰瑞医药有限公司 PHP-based high-efficiency MVC framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180417