CN104850556A - Method and device for data processing - Google Patents

Publication number
CN104850556A
Authority
CN
China
Prior art keywords
data
result data
event information
database
data identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410053223.4A
Other languages
Chinese (zh)
Other versions
CN104850556B (en)
Inventor
李经纬
陈岳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201410053223.4A
Publication of CN104850556A
Application granted
Publication of CN104850556B
Legal status: Active

Abstract

The invention discloses a method and a device for data processing to solve the problem of lower processing efficiency for event information in the prior art. The method comprises the following steps of processing current event information by a real-time computing system to obtain result data; judging whether a time span from a writing moment corresponding to a data identification to the current moment exceeds a set time span or not according to the data identification carried by the result data, and if yes, writing the result data into a database and updating the writing moment corresponding to the data identification to the current moment, otherwise storing the result data and processing next event information. According to the method, the result data is only written into the database when the time span from the writing moment to the current moment exceeds the set time span, and the subsequent event information can be processed during the rest of the time, so that the processing efficiency for the event information is effectively improved on the premise of guaranteeing consistency of the data in the real-time computing system and the data in the database.

Description

Method and device for data processing
Technical field
The present application relates to the field of computer technology, and in particular to a method and device for data processing.
Background
As computer technology develops, practical applications increasingly need to process generated event information in real time to obtain result data, and to store the result data in a database for subsequent queries.
Specifically, a real-time computing system receives and processes event information in real time, and then writes the result data obtained from the processing into the database.
For example, in a logistics information processing scenario, the delivery information, pickup information, and dispatch information of a logistics facility are each an item of event information. The result data may then be statistics on the delivery volume, pickup volume, dispatch volume, and so on of a logistics facility within a given period (for example, the current day).
Specifically, after a logistics information system generates an item of event information, it sends the event information to the real-time computing system. The real-time computing system distributes the event information to one of its own processing processes. That process uses the attribute information carried in the event information, together with a preset correspondence between attribute information and data identifiers, to determine the data identifier of the result data corresponding to the attribute information. It then updates the previously obtained result data for that data identifier according to the event information, and finally writes the updated result data to the database.
However, a database limits the number of writes it accepts per unit time, and each processing process in the real-time computing system handles event information serially: it can process the next event information only after the result data for the current event information has been written to the database. Therefore, once the number of writes the real-time computing system issues to the database per unit time exceeds the database's limit, event information accumulates, the processing efficiency of event information drops, and the real-time computing system may even fail.
For example, suppose a database accepts at most 10,000 writes per second and each item of event information causes 4 items of result data to be updated. The real-time computing system can then process at most 10000/4 = 2500 items of event information per second. If it receives 2501 items of event information in one second, the database write limit (10,000 per second) still restricts it to 2500 items, so 2501 - 2500 = 1 item of event information accumulates. Clearly, if the real-time computing system receives far more than 2500 items of event information per second, a large backlog of event information builds up, the processing efficiency of event information drops, and the real-time computing system may even fail.
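The arithmetic of the above example can be sketched as follows. This is an illustration only: the function and constant names are assumptions introduced here, and only the figures (10,000 writes per second, 4 updates per event, 2501 arrivals) come from the example.

```python
# Figures taken from the example above; all names are illustrative assumptions.
DB_MAX_WRITES_PER_SEC = 10_000   # database write limit per second
UPDATES_PER_EVENT = 4            # items of result data updated by each event


def max_events_per_sec(db_limit: int, updates_per_event: int) -> int:
    """Events per second when every update is written to the database immediately."""
    return db_limit // updates_per_event


def backlog_per_sec(arrival_rate: int, capacity: int) -> int:
    """Events left unprocessed each second when arrivals exceed capacity."""
    return max(0, arrival_rate - capacity)


capacity = max_events_per_sec(DB_MAX_WRITES_PER_SEC, UPDATES_PER_EVENT)
print(capacity)                          # 2500
print(backlog_per_sec(2501, capacity))   # 1
```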
Summary of the invention
The embodiments of the present application provide a method and device for data processing, to solve the prior-art problem that the limit on the number of database writes per unit time causes event information to accumulate, lowers the processing efficiency of event information, and may even cause the real-time computing system to fail.
A method for data processing provided by an embodiment of the present application comprises:
processing current event information to obtain result data, and storing the result data locally;
determining, according to the data identifier carried in the result data, the recorded write time corresponding to the data identifier, where the write time is the most recent time at which result data carrying the data identifier was written to the database;
judging whether the time span from the write time to the current time exceeds a set time length;
if so, writing the locally stored result data to the database, and updating the write time corresponding to the data identifier to the current time;
otherwise, continuing to store the result data locally, and processing the next event information.
A device for data processing provided by an embodiment of the present application comprises:
an event processing module, configured to process current event information to obtain result data, and to store the result data locally;
a determination module, configured to determine, according to the data identifier carried in the result data, the recorded write time corresponding to the data identifier, where the write time is the most recent time at which result data carrying the data identifier was written to the database;
a judging module, configured to judge whether the time span from the write time to the current time exceeds a set time length;
a writing module, configured to, when the judging module's result is yes, write the locally stored result data to the database and update the write time corresponding to the data identifier to the current time;
the event processing module being further configured to, when the judging module's result is no, continue to store the result data locally and process the next event information.
The embodiments of the present application provide a method and device for data processing. In the method, after the real-time computing system processes the current event information and obtains result data, it judges, according to the data identifier carried in the result data, whether the time span from the recorded write time corresponding to the data identifier to the current time exceeds a set time length. If so, it writes the result data to the database and updates the write time corresponding to the data identifier to the current time; otherwise, it continues to store the result data locally and processes the next event information. Because the real-time computing system writes result data to the database only when the time span from the write time to the current time exceeds the set time length, and can process subsequent event information the rest of the time, the method effectively improves the processing efficiency of event information while ensuring that the data in the real-time computing system and the database are consistent. It avoids the accumulation of event information and effectively reduces the probability of failures caused by such accumulation.
Brief description of the drawings
The drawings described here provide a further understanding of the present application and form part of it. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 shows the data processing procedure provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of the data processing procedure under normal conditions provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the data processing procedure under abnormal conditions provided by an embodiment of the present application;
Fig. 4 is a schematic structural diagram of the data processing device provided by an embodiment of the present application.
Detailed description
In practical application scenarios, for some services, delaying the write of result data to the database for a period of time after the real-time computing system has processed the current event information does not significantly affect the service, so such a delay is entirely acceptable. During that delay the real-time computing system can process the next event information. This breaks through the speed bottleneck in event processing caused by the limit on the number of database writes per unit time, so that, while ensuring that the data in the real-time computing system and the database are consistent, the processing efficiency of event information improves and the probability of failures caused by accumulated event information falls.
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of the application.
Fig. 1 shows the data processing procedure provided by an embodiment of the present application, which comprises the following steps:
S101: Process the current event information to obtain result data, and store the result data locally.
In the embodiments of the present application, the real-time computing system processes the current event information through one of its own processing processes, obtains the corresponding result data, and stores it in a local storage medium, such as local memory or a local hard disk. The current event information carries attribute information. The result data carries a data identifier, which uniquely identifies the result data, and may also carry a result value.
Specifically, the real-time computing system uses multiple processing processes to handle different event information, so that event information is processed concurrently and processing efficiency improves. The embodiments of the present application therefore preset a correspondence among attribute information, processing processes, and data identifiers. One processing process may correspond to multiple items of attribute information and to multiple data identifiers, but each item of attribute information corresponds to only one processing process, and each data identifier corresponds to only one processing process. That is, a processing process handles only event information carrying attribute information that corresponds to it, and obtains only result data carrying data identifiers that correspond to it.
Accordingly, the real-time computing system may process the current event information and obtain result data as follows. According to the attribute information carried in the current event information, the real-time computing system distributes the current event information to the processing process corresponding to that attribute information. That process then updates, according to the current event information, the locally stored result data carrying the data identifier corresponding to the attribute information, and takes the updated result data as the result data obtained by processing the current event information.
For example, suppose the current event information records that logistics facility A performed a delivery operation at time T on the order with order identifier 1. The attribute information carried in the current event information may then be one or a combination of the logistics facility identifier (logistics facility A), the order identifier, and the operation type (delivery operation).
Suppose the attribute information carried in the current event information is "logistics facility A + delivery operation", and the preset correspondence maps this attribute information to processing process 1 and to the data identifier "logistics facility A delivery volume". The real-time computing system then distributes the current event information to processing process 1. Processing process 1 determines that the data identifier corresponding to the attribute information "logistics facility A + delivery operation" is "logistics facility A delivery volume", and therefore retrieves the locally stored result data carrying the data identifier "logistics facility A delivery volume".
Suppose the result value carried in the retrieved result data is n (meaning that the delivery volume of logistics facility A counted so far is n). Processing process 1 then updates the result value to n+1 according to the current event information, and takes the updated result data as the result data obtained by processing the current event information.
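As a minimal sketch of the routing and update in this example, the preset correspondence can be modelled as a lookup table. The dictionary layout, the function name, and the sample value n = 41 are assumptions made for illustration; the application does not prescribe any particular data structure.

```python
# Hypothetical preset correspondence: attribute information -> (process id, data identifier).
ATTRIBUTE_MAP = {
    ("logistics facility A", "delivery operation"): (1, "logistics facility A delivery volume"),
}

# Result data stored locally by each processing process:
# {process id: {data identifier: result value}}. n = 41 is an arbitrary sample value.
local_results = {1: {"logistics facility A delivery volume": 41}}


def process_event(attribute_info):
    """Route the event by its attribute information and add 1 to the matching result value."""
    process_id, data_id = ATTRIBUTE_MAP[attribute_info]
    store = local_results[process_id]
    store[data_id] += 1  # n -> n + 1
    return process_id, data_id, store[data_id]


print(process_event(("logistics facility A", "delivery operation")))
# (1, 'logistics facility A delivery volume', 42)
```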
In practical application scenarios the real-time computing system may also be a server cluster composed of several computing servers. In that case a distribution server can be added to the real-time computing system. The distribution server presets the correspondence between attribute information and computing servers, and each computing server presets the correspondence among attribute information, processing processes, and data identifiers. The distribution server receives event information and, according to the preset correspondence between attribute information and computing servers and the attribute information carried in the received event information, sends the event information to the corresponding computing server in the real-time computing system. The computing server then processes the event information according to its preset correspondence among attribute information, processing processes, and data identifiers.
S102: According to the data identifier carried in the result data, determine the recorded write time corresponding to the data identifier.
In the embodiments of the present application, for each item of stored result data, the real-time computing system can record the write time corresponding to the data identifier carried in that result data. The write time corresponding to a data identifier is the most recent time at which result data carrying that data identifier was written to the database. Thus, after processing the current event information in step S101 and storing the result data locally, the real-time computing system can determine the write time corresponding to the data identifier carried in the result data.
S103: Judge whether the time span from the write time to the current time exceeds the set time length. If so, perform step S104; otherwise, perform step S105.
The set time length can be chosen as needed; for example, it can be set to 10 seconds. Specifically, the maximum delay that the service can tolerate for writing result data to the database can be determined in advance, and the set time length is then set to no more than this maximum delay.
S104: Write the locally stored result data to the database, and update the write time corresponding to the data identifier to the current time.
If the time span from the write time to the current time exceeds the set time length, the real-time computing system writes the locally stored result data obtained in step S101 to the database, and updates the write time corresponding to the data identifier carried in the result data to the current time.
When writing the result data to the database, the real-time computing system can use the data identifier carried in the result data obtained in step S101 to locate the data stored in the database that carries this data identifier, and then either replace the stored data directly with the result data, or update the result value carried in the stored data to the result value carried in the result data.
Further, the real-time computing system can also use the data identifier carried in the result data obtained in step S101 to locate the data stored in the database that carries this data identifier, and judge whether that stored data is identical to the result data obtained in step S101. If they are identical, the result data obtained in step S101 need not be written to the database; if they differ, the stored data is updated to the result data.
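The compare-before-write variant just described can be sketched as below. A plain dictionary stands in for the database here, which is an assumption for illustration only; with a real database the comparison would itself require a read.

```python
def write_if_changed(database: dict, data_id: str, result) -> bool:
    """Write `result` under `data_id` only if it differs from the stored data.

    `database` is a dict standing in for a real database (an assumption).
    Returns True when a write was issued, False when it was skipped.
    """
    if data_id in database and database[data_id] == result:
        return False          # identical: no write needed
    database[data_id] = result  # different or missing: update stored data
    return True


db = {"R1": 5}
print(write_if_changed(db, "R1", 5))  # False
print(write_if_changed(db, "R1", 6))  # True
```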
S105: Continue to store the result data locally, and process the next event information.
If the time span from the write time to the current time does not exceed the set time length, the real-time computing system continues to store locally the result data obtained in step S101, and goes on to process the next event information.
With the above method, the real-time computing system writes result data to the database only when the time span from the write time to the current time exceeds the set time length. Otherwise it temporarily keeps the result data locally and processes subsequent event information, without waiting for the result data to be written to the database first. Therefore, while ensuring that the data in the real-time computing system and the database are consistent (consistency is ensured after a delay of the set time length), the method effectively improves the processing efficiency of event information, avoids the accumulation of event information, and reduces the probability of failures caused by such accumulation.
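Steps S101 to S105 can be sketched for a single processing process as follows. This is a sketch under stated assumptions, not the application's implementation: the class and parameter names are hypothetical, a dictionary stands in for the database, a missing write time is treated as "write now", and a missing local result starts from 0.

```python
import time


class ThrottledWriter:
    """Sketch of steps S101-S105 for one processing process (names are assumptions)."""

    def __init__(self, database, set_time_length=10.0, clock=time.monotonic):
        self.database = database              # dict standing in for a real database
        self.set_time_length = set_time_length
        self.clock = clock                    # injectable so the sketch can be tested
        self.local_results = {}               # data identifier -> latest result data
        self.write_times = {}                 # data identifier -> time of last database write

    def handle(self, data_id, update):
        # S101: process the event and keep the result locally (assumed initial value 0).
        result = update(self.local_results.get(data_id, 0))
        self.local_results[data_id] = result
        # S102: look up the recorded write time for this data identifier.
        now = self.clock()
        last_write = self.write_times.get(data_id)
        # S103: has more than the set time length elapsed since the last write?
        if last_write is None or now - last_write > self.set_time_length:
            # S104: write to the database and record the new write time.
            self.database[data_id] = result
            self.write_times[data_id] = now
            return True
        # S105: keep the result locally and move on to the next event.
        return False


fake_now = [0.0]
db = {}
writer = ThrottledWriter(db, set_time_length=10.0, clock=lambda: fake_now[0])
add_one = lambda value: value + 1

print(writer.handle("R1", add_one))   # True: no write time recorded yet
fake_now[0] = 5.0
print(writer.handle("R1", add_one))   # False: only 5 s since the last write
fake_now[0] = 10.5
print(writer.handle("R1", add_one))   # True: more than 10 s since the last write
print(db["R1"])                       # 3
```

Note that this sketch checks the write time only when an event arrives, so a locally stored result is flushed only by a later event carrying the same data identifier.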
Further, in a logistics information processing scenario, the logistics event information transmitted to the real-time computing system is continuous and large in volume, rather than arriving in discrete batches. The data processing method shown in Fig. 1 can therefore be used in logistics information processing scenarios; that is, the current event information and next event information described in the embodiments of the present application include logistics event information, and the result data includes logistics result data. Of course, the data processing method shown in Fig. 1 can also be used in other data processing scenarios with continuous, high-volume data, such as commodity transaction information processing.
The following applies the data processing method shown in Fig. 1 to a logistics information processing scenario to illustrate the effect of the data processing method provided by the embodiments of the present application.
In a logistics information processing scenario, event information mainly includes events in which a logistics facility performs a delivery, a pickup, a station entry or exit, a sign-for-receipt, and so on, and each item of event information may cause multiple items of logistics result data to be updated.
For example, for the event information "logistics facility A performs a delivery operation on order 1", suppose the seller of order 1 is user a, the place of departure is city b, and the destination is city c. This event information then causes 4 items of logistics result data to be updated (specifically, the result value carried in each of them is increased by 1): "delivery volume of logistics facility A", "delivery volume of logistics facility A for user a", "delivery volume of logistics facility A from city b", and "delivery volume of logistics facility A to city c".
As another example, for the event information "logistics facility A performs a pickup operation on order 1", suppose the previous pickup location is city d, the current pickup location is city e, and the next pickup location is city f. This event information then causes 4 items of logistics result data to be updated (specifically, the result value carried in each of them is increased by 1): "pickup volume of logistics facility A", "pickup volume of logistics facility A at city e for items picked up from city d", "pickup volume of logistics facility A at city e", and "delivery volume of logistics facility A sent to city f after pickup at city e".
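The fan-out from one event to several items of result data can be sketched for the delivery example as follows. The function name and the wording of the generated data identifiers are assumptions chosen for illustration.

```python
from collections import Counter


def data_ids_for_delivery(facility, seller, origin, destination):
    """Return the four data identifiers updated by one delivery event (illustrative naming)."""
    return [
        f"delivery volume of {facility}",
        f"delivery volume of {facility} for {seller}",
        f"delivery volume of {facility} from {origin}",
        f"delivery volume of {facility} to {destination}",
    ]


# Local result values, keyed by data identifier; missing entries count as 0.
results = Counter()
for data_id in data_ids_for_delivery("logistics facility A", "user a", "city b", "city c"):
    results[data_id] += 1   # each affected result value is increased by 1

print(len(results))         # 4
```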
Again suppose the database accepts at most 10,000 writes per second and each item of logistics event information causes 4 items of logistics result data to be updated on average. Because logistics event information does not arrive at the real-time computing system in discrete batches but continuously and in large volume, suppose 20,000 items of logistics event information are transmitted to the real-time computing system per second. Under the prior-art method, a processing process must write all 4 updated items of logistics result data to the database before it can process the next item of logistics event information, so at most 10000/4 = 2500 items of logistics event information can be processed per second. This is far below the arrival rate of 20,000 items per second, so event information accumulates. Under the method of the present application shown in Fig. 1, if the set time length is 10 seconds, an item of logistics result data is written to the database only when the time span from the write time to the current time exceeds 10 seconds; otherwise the next item of logistics event information is processed directly. The real-time computing system can therefore process at most 10000/4 × 10 = 25,000 items of event information per second, which exceeds the arrival rate of 20,000 items per second. Thus, in a logistics information processing scenario, compared with the prior art, in which the real-time computing system can process at most 2500 items of event information per second under the same conditions, the data processing method provided by the present application improves the processing efficiency of event information by a factor of 10 and does not cause event information to accumulate. In addition, in a logistics information processing scenario, writing the updated logistics result data to the database with a 10-second delay still meets the requirements of logistics business data queries, so the delay does not significantly affect those queries.
Further, the real-time computing system in the embodiments of the present application stores result data locally. Therefore, in step S101 of Fig. 1, when the real-time computing system updates, according to the current event information, the locally stored result data carrying the data identifier corresponding to the attribute information carried in the current event information, it can first try to retrieve that result data locally and update the retrieved result data. If no result data carrying the data identifier is found locally, it can read the data carrying the data identifier from the database into local storage and update the data that was read. If the data carrying the data identifier cannot be read from the database either, it can add result data carrying the data identifier locally (for example, in local memory or on a local hard disk) and update the added result data according to the current event information. Alternatively, it can set the result value carried in the added result data to a preset initial value and then update that result value according to the current event information. The preset initial value can be set as needed, for example to 0.
For example, after processing process 1 determines that the data identifier corresponding to the attribute information "logistics facility A + delivery operation" carried in the current event information is "logistics facility A delivery volume", it can retrieve the result data carrying the data identifier "logistics facility A delivery volume" from local memory and update the retrieved result data. If no such result data is found in local memory, that is, local memory holds no result data carrying the data identifier "logistics facility A delivery volume", the process reads the data carrying that data identifier from the database into local memory and updates the data that was read. If no such data can be read from the database, that is, neither local memory nor the database holds data carrying the data identifier "logistics facility A delivery volume", the process can add result data carrying that data identifier to local memory, set the result value carried in the added result data to 0, and update the added result data.
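The lookup order described above (local storage first, then the database, then a preset initial value) can be sketched as follows. Dictionaries stand in for local memory and the database, and the function names are assumptions for illustration.

```python
DEFAULT_INITIAL_VALUE = 0   # preset initial value, 0 as in the example above


def load_result(local: dict, database: dict, data_id: str):
    """Return the local result value, loading it from the database or initialising it if absent."""
    if data_id not in local:
        # Not stored locally: read from the database, or fall back to the initial value.
        local[data_id] = database.get(data_id, DEFAULT_INITIAL_VALUE)
    return local[data_id]


def apply_delivery_event(local: dict, database: dict, data_id: str):
    """Update the result value for one delivery event (adds 1)."""
    local[data_id] = load_result(local, database, data_id) + 1
    return local[data_id]


data_id = "logistics facility A delivery volume"
print(apply_delivery_event({data_id: 3}, {}, data_id))   # 4: found locally
print(apply_delivery_event({}, {data_id: 7}, data_id))   # 8: read from the database
print(apply_delivery_event({}, {}, data_id))             # 1: initialised to 0, then updated
```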
The above method ensures that when an item of result data stored locally in the real-time computing system is updated, it is written to the database after the set time length. That is, after the set time length, the data in the database is consistent with the corresponding result data in the memory of the real-time computing system; in other words, once the real-time computing system has updated an item of result data and the set time length has passed, the correct data can be queried from the database. However, in practical application scenarios a processing process in the real-time computing system may be interrupted abnormally. If a processing process is interrupted, the result data it stored locally is also cleared. With the delayed-write method described above, if the local result data has not yet been written to the database when it is cleared, the local result data is lost and the data in the database becomes inconsistent with the real-time computing system. Correct data can then no longer be queried from the database, which reduces the accuracy of data processing.
Therefore, to ensure the accuracy of the data stored in the database, in the embodiments of the present application a processing process, after processing the current event information in step S101 and storing the result data locally, also records the result data in a log file. A monitoring device can be set up inside or outside the real-time computing system to monitor each processing process of the real-time computing system. When the monitoring device detects that a processing process is abnormal, then for each data identifier corresponding to that process, the result data carrying that data identifier that the process most recently recorded in the log file is written to the database. In this way, even when a processing process is interrupted abnormally, the data in the database remains consistent with the local data of the real-time computing system, which effectively improves the accuracy of the data stored in the database. When recording the result data in the log file, the processing process can also record the time at which the result data was obtained. When the real-time computing system writes result data to the database according to the log file, it can then, for each data identifier, write to the database the result data recorded by that process that carries the data identifier and has the latest time information.
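The log-based recovery described above can be sketched as follows. The log entry layout (timestamp, process id, data identifier, result data) and the function names are assumptions for illustration; the application does not specify a log format. A dictionary again stands in for the database.

```python
def latest_results_from_log(log_entries, process_id):
    """For one process, pick the logged result with the latest timestamp per data identifier.

    `log_entries` is an iterable of (timestamp, process_id, data_id, result) tuples,
    a hypothetical layout chosen for this sketch.
    """
    latest = {}
    for timestamp, pid, data_id, result in log_entries:
        if pid != process_id:
            continue
        if data_id not in latest or timestamp >= latest[data_id][0]:
            latest[data_id] = (timestamp, result)
    return {data_id: result for data_id, (_, result) in latest.items()}


def recover(log_entries, process_id, database):
    """Write the latest logged results of an interrupted process to the database."""
    database.update(latest_results_from_log(log_entries, process_id))


log = [
    (1, 1, "R1", 100),
    (2, 3, "R3", 300),
    (3, 1, "R1", 101),
]
db = {"R1": 100, "R3": 300}
recover(log, process_id=1, database=db)
print(db)   # {'R1': 101, 'R3': 300}
```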
The monitoring device may be a ZooKeeper component, which may be deployed on a computing server of the real-time computing system or on another system independent of the real-time computing system. A processing process of the real-time computing system that is running normally remains in the running state, not the interrupted state, even when it is not processing any event information. To monitor processing processes with ZooKeeper, each processing process therefore registers an ephemeral node on the ZooKeeper component when it starts, and this ephemeral node corresponds only to that processing process. While the processing process is running, the ephemeral node persists; once the processing process is interrupted, the ephemeral node disappears. The ZooKeeper component can thus watch its own ephemeral nodes. When it finds that an ephemeral node has disappeared, it can determine that the corresponding processing process has been abnormally interrupted and notify the real-time computing system to act on the records in the log file: for each data identification corresponding to that processing process, the result data carrying that data identification that the processing process last recorded in the log file is written to the database, so as to ensure the accuracy of the data in the database.
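The ephemeral-node mechanism can be illustrated with a small in-memory stand-in. The class below is not the ZooKeeper API; `EphemeralRegistry`, `register`, and `session_lost` are hypothetical names that only model the behaviour described above: a node exists while its process runs, and its disappearance triggers a notification.

```python
class EphemeralRegistry:
    """In-memory stand-in for ephemeral nodes (not the ZooKeeper API)."""

    def __init__(self, on_lost):
        self._nodes = set()
        self._on_lost = on_lost  # called with the name of the lost process

    def register(self, process):
        # A process registers its own node when it starts.
        self._nodes.add(process)

    def session_lost(self, process):
        # Models the node vanishing when its process is interrupted.
        if process in self._nodes:
            self._nodes.remove(process)
            self._on_lost(process)

    def alive(self):
        return sorted(self._nodes)


# The callback stands in for "notify the real-time computing system".
notified = []
registry = EphemeralRegistry(on_lost=notified.append)
for name in ("process1", "process2", "process3"):
    registry.register(name)

registry.session_lost("process3")
```

In a real deployment the notification would start the log-based write to the database rather than append to a list.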
The data processing method of the embodiments of the present application under normal and abnormal conditions is described below with reference to Fig. 2 and Fig. 3.
Suppose the real-time computing system has three processing processes, process 1, process 2 and process 3, whose corresponding data identifications are R1, R2 and R3 respectively, and each processing process saves its result data in local memory.
Suppose the real-time computing system receives two pieces of current event information, event 1 and event 3. The attribute information carried by event 1 corresponds to process 1 and data identification R1, and the attribute information carried by event 3 corresponds to process 3 and data identification R3. The result data carrying R1 currently held in memory contains two result values: the first is 100 and the second is 200. The result data carrying R3 currently held in memory also contains two result values: the first is 300 and the second is 400. The writing moment corresponding to R1 is 12:00:00 on November 11, 2011, the writing moment corresponding to R3 is 12:01:00 on November 11, 2011, and the current moment is 12:01:02 on November 11, 2011. The set time span is 10 seconds. Then:
Fig. 2 is a schematic diagram of the data processing procedure under normal conditions provided by the embodiments of the present application. In Fig. 2, process 1 processes event 1: in the result data carrying data identification R1 in memory, it updates the first result value from 100 to 101 and the second result value from 200 to 201. Process 3 processes event 3: in the result data carrying data identification R3 in memory, it updates the first result value from 300 to 301 and the second result value from 400 to 401. Process 2 processes no event.
Process 1 obtains the result data carrying data identification R1. The writing moment corresponding to R1 is 12:00:00 on November 11, 2011 and the current moment is 12:01:02 on November 11, 2011, so the time span from the writing moment corresponding to R1 to the current moment (62 seconds) exceeds the set time span of 10 seconds. Process 1 therefore writes the result data carrying data identification R1 in memory to the database and updates the writing moment corresponding to R1 saved in memory to the current moment, 12:01:02 on November 11, 2011.
Process 3 obtains the result data carrying data identification R3. The writing moment corresponding to R3 is 12:01:00 on November 11, 2011 and the current moment is 12:01:02 on November 11, 2011, so the time span from the writing moment corresponding to R3 to the current moment (2 seconds) does not exceed the set time span of 10 seconds. Process 3 therefore does not yet write the result data carrying data identification R3 in memory to the database, and can go on to process the next event information it needs to handle.
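The two decisions in this example can be checked with a few lines of arithmetic. The sketch below uses only the values given above; the names `SET_TIME_SPAN`, `last_write`, and `should_write` are illustrative.

```python
from datetime import datetime, timedelta

SET_TIME_SPAN = timedelta(seconds=10)
now = datetime(2011, 11, 11, 12, 1, 2)

# Writing moment recorded for each data identification.
last_write = {
    "R1": datetime(2011, 11, 11, 12, 0, 0),
    "R3": datetime(2011, 11, 11, 12, 1, 0),
}


def should_write(data_id):
    """True when the span since the last write exceeds the set time span."""
    return now - last_write[data_id] > SET_TIME_SPAN


decisions = {data_id: should_write(data_id) for data_id in last_write}
elapsed = {data_id: (now - moment).total_seconds() for data_id, moment in last_write.items()}
```

For R1 the elapsed span is 62 seconds, so the result data is written; for R3 it is 2 seconds, so the write is deferred.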
Of course, process 1 and process 3 also record in the log file the result data obtained and the time at which it was obtained (here the current moment, 12:01:02 on November 11, 2011).
In the procedure shown in Fig. 2, the ZooKeeper component monitors process 1, process 2 and process 3 and detects no abnormality in any of them, so the real-time computing system does not need to write result data to the database according to the log file. The result data carrying data identification R1 is consistent between the database and the memory of the real-time computing system. However, because the time span from the writing moment corresponding to R3 to the current moment does not exceed 10 seconds, the result data carrying R3 has not been written, and the data carrying data identification R3 in the database is inconsistent with the result data carrying data identification R3 in the memory of the real-time computing system.
Fig. 3 is a schematic diagram of the data processing procedure under abnormal conditions provided by the embodiments of the present application. In Fig. 3, suppose that at 12:02:00 on November 11, 2011 process 3 becomes abnormal and is interrupted. The ZooKeeper component detects the abnormality of process 3 and therefore notifies the real-time computing system to determine, for each data identification corresponding to process 3, the result data carrying that data identification that process 3 last recorded in the log file.
The real-time computing system determines that the result data carrying R3 that process 3 last recorded in the log file is the result data recorded at 12:01:02 on November 11, 2011. It therefore writes this result data carrying R3 to the database, so that the data stored in the database remains accurate even though process 3 has become abnormal.
Of course, after data is written to the database according to the log file, the writing moment corresponding to the data identification carried in the written result data may also be updated to the moment of that write. The above description uses a single data identification corresponding to process 3 as an example; in practice, process 3 may correspond to multiple data identifications.
It should be noted that Fig. 2 and Fig. 3 are described using the example in which the monitoring device for the processing processes (the ZooKeeper component shown in Fig. 2 and Fig. 3) is deployed inside the real-time computing system. The monitoring device may also be deployed independently of the real-time computing system, which is not described in detail here.
In addition, to further ensure the accuracy of the data saved in the database, the real-time computing system may also write all locally saved result data to the database when a preset trigger condition is met. The preset trigger condition may be: according to a set cycle, when the end moment of the current cycle arrives, the preset trigger condition is determined to be met. The reason is that, in the embodiments of the present application, for result data carrying a given data identification that is saved locally by the real-time computing system, the check of whether the time span from the writing moment corresponding to that data identification to the current moment exceeds the set time span is made only when the locally saved result data is updated, and the decision whether to write the updated result data to the database depends on that check. The following extreme case is therefore unavoidable in practical application scenarios:
When a piece of result data is updated, the time span from the writing moment to the current moment is judged not to exceed the set time span, so the result data is not written to the database. The result data is then not updated again for a very long time, and as a result it is not written to the database for that entire period.
This extreme case also reduces the accuracy of the data stored in the database. The real-time computing system may therefore, according to the set cycle, write all locally saved result data to the database at the end of each cycle, where the set cycle is longer than the set time span described above. For example, the real-time computing system may write all locally saved result data to the database every 24 hours, so that the extreme case above does not leave inaccurate data in the database.
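A minimal sketch of the cycle-based flush follows, assuming the 24-hour cycle from the example. The function name `flush_if_cycle_ended` and the dictionary model of local memory and the database are assumptions made for illustration.

```python
from datetime import datetime, timedelta

FLUSH_CYCLE = timedelta(hours=24)  # must be longer than the set time span


def flush_if_cycle_ended(local, database, cycle_start, now):
    """Write all local result data once the current cycle has ended.

    Returns the start of the cycle that is current after the call.
    """
    if now - cycle_start < FLUSH_CYCLE:
        return cycle_start
    database.update({data_id: dict(result) for data_id, result in local.items()})
    return now  # a new cycle starts at the flush


# R3 was updated but never written because 10 seconds had not elapsed.
local = {"R3": {"first": 301, "second": 401}}
database = {"R3": {"first": 300, "second": 400}}
start = datetime(2011, 11, 11, 0, 0, 0)

start_mid_cycle = flush_if_cycle_ended(local, database, start, datetime(2011, 11, 11, 12, 1, 2))
stale_copy = dict(database["R3"])
start_next_cycle = flush_if_cycle_ended(local, database, start, datetime(2011, 11, 12, 0, 0, 0))
```

The first call happens mid-cycle and changes nothing; the second call, 24 hours after the cycle started, brings the database in line with local memory even though R3 received no further updates.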
The preset trigger condition may also be: when the real-time computing system receives a write instruction to write the result data in memory to the database, the preset trigger condition is determined to be met, and all locally saved result data is written to the database.
Further, the method may also include: setting a specified service type in the database in advance. After receiving a query request from a user, the database judges whether the service type to be queried that is carried in the query request is the preset specified service type, and if so, sends a write instruction to write the result data of the specified service type to the database.
In practical application scenarios, some services require that the real-time computing system not delay for long before writing result data to the database after obtaining it. To provide users with more accurate and timely result data, a specified service type (for example, a service type with high real-time requirements, such as one that requires a short delay) may be set in the database in advance. After receiving a query request from a user, the database judges whether the service type to be queried that is carried in the query request is the preset specified service type. If so, it sends a write instruction to write all result data to the database; all locally saved result data is then written to the database before the data is provided to the user. If the service type is not the preset specified service type, the data the user asked for is provided directly from the database query result.
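The query path can be sketched as below. The service type name `real_time_tracking` and the function `handle_query` are hypothetical, and the sketch follows the variant in which all locally saved result data is written before the query is answered.

```python
# Hypothetical specified service type with high real-time requirements.
SPECIFIED_SERVICE_TYPES = {"real_time_tracking"}


def handle_query(service_type, data_id, local, database):
    """Answer a query, first flushing local results for specified types."""
    if service_type in SPECIFIED_SERVICE_TYPES:
        # Stands in for the write instruction sent to the computing system.
        database.update({key: dict(value) for key, value in local.items()})
    return database.get(data_id)


local = {"R3": {"first": 301, "second": 401}}
database = {"R3": {"first": 300, "second": 400}}

ordinary_answer = handle_query("daily_report", "R3", local, database)
urgent_answer = handle_query("real_time_tracking", "R3", local, database)
```

A query for an ordinary service type is answered from whatever the database currently holds, while a query for the specified type sees the latest locally saved values.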
The above is the data processing method provided by the embodiments of the present application. Based on the same idea, the embodiments of the present application also provide a data processing device, as shown in Fig. 4.
Fig. 4 is a schematic structural diagram of the data processing device provided by the embodiments of the present application, which specifically includes:
an event processing module 401, configured to process current event information, obtain result data, and save it locally;
a determining module 402, configured to determine, according to the data identification carried by the result data, the recorded writing moment corresponding to the data identification, where the writing moment is the moment at which result data carrying the data identification was last written to the database;
a judging module 403, configured to judge whether the time span from the writing moment to the current moment exceeds the set time span;
a writing module 404, configured to, when the judgment result of the judging module is yes, write the locally saved result data to the database and update the writing moment corresponding to the data identification to the current moment.
The event processing module 401 is further configured to, when the judgment result of the judging module is no, continue to save the result data locally and process the next event information.
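How the four modules cooperate can be sketched as one small class. This is an illustration under stated assumptions, not the device itself: `ThrottledWriter` and `add_one` are hypothetical names, the database is a dictionary, and a data identification with no recorded writing moment is treated as due for an immediate write, a case the text does not cover.

```python
from datetime import datetime, timedelta


class ThrottledWriter:
    def __init__(self, database, set_time_span):
        self.database = database
        self.set_time_span = set_time_span
        self.local = {}       # data identification -> result data
        self.last_write = {}  # data identification -> writing moment

    def process(self, data_id, update, now):
        # Event processing module: update and save the result locally.
        result = update(self.local.get(data_id, {}))
        self.local[data_id] = result
        # Determining module: look up the recorded writing moment.
        written_at = self.last_write.get(data_id)
        # Judging module (assumption: no recorded moment means write now).
        if written_at is None or now - written_at > self.set_time_span:
            # Writing module: write and refresh the writing moment.
            self.database[data_id] = dict(result)
            self.last_write[data_id] = now
            return True
        return False  # keep the result locally, move on to the next event


def add_one(result):
    return {name: value + 1 for name, value in result.items()}


database = {"R1": {"first": 100, "second": 200}}
writer = ThrottledWriter(database, timedelta(seconds=10))
writer.local["R1"] = {"first": 100, "second": 200}
writer.last_write["R1"] = datetime(2011, 11, 11, 12, 0, 0)

wrote_first = writer.process("R1", add_one, datetime(2011, 11, 11, 12, 1, 2))
wrote_second = writer.process("R1", add_one, datetime(2011, 11, 11, 12, 1, 5))
```

The first event arrives 62 seconds after the last write and is written through; the second arrives 3 seconds later and only updates local memory.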
The current event information and the next event information include logistics event information, and the result data includes logistics result data.
The current event information carries attribute information, and the attribute information corresponds to a processing process and a data identification.
The event processing module 401 is specifically configured to distribute the current event information, according to the attribute information it carries, to the processing process corresponding to the attribute information, so that the processing process updates, according to the current event information, the locally saved result data carrying the data identification corresponding to the attribute information, and takes the updated result data as the result data obtained by processing the current event information.
The event processing module 401 is specifically configured to extract the locally saved result data carrying the data identification corresponding to the attribute information and update the extracted result data; if no result data carrying the data identification can be extracted locally, it reads the data carrying the data identification from the database into local storage and updates the data read into local storage.
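The local lookup with a fallback read from the database can be sketched as follows. The names `update_local_result` and `add_one` are illustrative, and starting from an empty result when the database also has no entry is an assumption the text does not address.

```python
def update_local_result(data_id, apply_event, local, database):
    """Update the local result for data_id, loading it from the database
    first if it is not held locally."""
    if data_id not in local:
        # Assumption: an identification unknown to the database starts empty.
        local[data_id] = dict(database.get(data_id, {}))
    local[data_id] = apply_event(local[data_id])
    return local[data_id]


def add_one(result):
    return {name: value + 1 for name, value in result.items()}


local = {}
database = {"R1": {"first": 100, "second": 200}}

first_update = update_local_result("R1", add_one, local, database)
second_update = update_local_result("R1", add_one, local, database)
```

The first call reads R1 from the database into local memory before updating it; the second call works on the local copy only, and the database is untouched by either call.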
The processing process records the obtained result data in a log file.
The device further includes:
a monitoring module 405, configured to monitor the processing process and, when the processing process is found to be abnormal, to write to the database, for each data identification corresponding to the processing process, the result data carrying that data identification that the processing process last recorded in the log file.
The writing module 404 is further configured to write all locally saved result data to the database when a preset trigger condition is met, where the preset trigger condition includes: receiving a write instruction to write all locally saved result data to the database.
Further, a specified service type is set in the database in advance, and the write instruction is a write instruction, for writing the result data of the specified service type to the database, that the database sends when, after receiving a query request from a user, it judges that the service type to be queried that is carried in the query request is the preset specified service type.
Specifically, the above data processing device may be located in the real-time computing system.
The embodiments of the present application provide a data processing method and device. In the method, after the real-time computing system processes current event information to obtain result data, it judges, according to the data identification carried by the result data, whether the time span from the recorded writing moment corresponding to the data identification to the current moment exceeds the set time span. If so, it writes the result data to the database and updates the writing moment corresponding to the data identification to the current moment; otherwise it continues to save the result data locally and processes the next event information. Because the real-time computing system writes result data to the database only when the time span from the writing moment to the current moment exceeds the set time span, and can process subsequent event information the rest of the time, the method effectively improves the processing efficiency of event information while keeping the data in the real-time computing system and the database consistent. Event information does not pile up, which effectively reduces the probability of failures caused by an accumulation of event information.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. The present application may also take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The above are only embodiments of the present application and are not intended to limit the present application. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims (12)

1. A data processing method, characterized by comprising:
processing current event information to obtain result data, and saving the result data locally;
determining, according to the data identification carried by the result data, the recorded writing moment corresponding to the data identification, wherein the writing moment is the moment at which result data carrying the data identification was last written to a database;
judging whether the time span from the writing moment to the current moment exceeds a set time span;
if so, writing the locally saved result data to the database, and updating the writing moment corresponding to the data identification to the current moment;
otherwise, continuing to save the result data locally, and processing next event information.
2. The method of claim 1, characterized in that the current event information and the next event information comprise logistics event information, and the result data comprises logistics result data.
3. The method of claim 1, characterized in that the current event information carries attribute information, and the attribute information corresponds to a processing process and a data identification;
processing the current event information to obtain the result data specifically comprises:
distributing the current event information, according to the attribute information carried in the current event information, to the processing process corresponding to the attribute information;
updating, by the processing process according to the current event information, the locally saved result data carrying the data identification corresponding to the attribute information, and taking the updated result data as the result data obtained by processing the current event information.
4. The method of claim 3, characterized in that updating the locally saved result data carrying the data identification corresponding to the attribute information specifically comprises:
extracting the locally saved result data carrying the data identification corresponding to the attribute information, and updating the extracted result data;
if no result data carrying the data identification is extracted locally, reading the data carrying the data identification from the database into local storage, and updating the data read into local storage.
5. The method of claim 3, characterized in that the method further comprises:
recording, by the processing process, the obtained result data in a log file;
monitoring the processing process;
when the processing process is found to be abnormal, writing to the database, for each data identification corresponding to the processing process, the result data carrying that data identification that the processing process last recorded in the log file.
6. The method of claim 1, characterized in that the method further comprises:
when a preset trigger condition is met, writing all locally saved result data to the database;
wherein the preset trigger condition comprises: receiving a write instruction to write the locally saved result data to the database.
7. The method of claim 6, characterized in that the method further comprises:
setting a specified service type in the database in advance, wherein the database, after receiving a query request from a user, judges whether the service type to be queried that is carried in the query request is the preset specified service type, and if so, sends a write instruction to write the result data of the specified service type to the database.
8. A data processing device, characterized by comprising:
an event processing module, configured to process current event information, obtain result data, and save the result data locally;
a determining module, configured to determine, according to the data identification carried by the result data, the recorded writing moment corresponding to the data identification, wherein the writing moment is the moment at which result data carrying the data identification was last written to a database;
a judging module, configured to judge whether the time span from the writing moment to the current moment exceeds a set time span;
a writing module, configured to, when the judgment result of the judging module is yes, write the locally saved result data to the database and update the writing moment corresponding to the data identification to the current moment;
wherein the event processing module is further configured to, when the judgment result of the judging module is no, continue to save the result data locally and process next event information.
9. The device of claim 8, characterized in that the current event information and the next event information comprise logistics event information, and the result data comprises logistics result data.
10. The device of claim 8, characterized in that the current event information carries attribute information, and the attribute information corresponds to a processing process and a data identification;
the event processing module is specifically configured to distribute the current event information, according to the attribute information carried in the current event information, to the processing process corresponding to the attribute information, so that the processing process updates, according to the current event information, the locally saved result data carrying the data identification corresponding to the attribute information, and takes the updated result data as the result data obtained by processing the current event information.
11. The device of claim 10, characterized in that the event processing module is specifically configured to extract the locally saved result data carrying the data identification corresponding to the attribute information and update the extracted result data, and, if no result data carrying the data identification is extracted locally, to read the data carrying the data identification from the database into local storage and update the data read into local storage.
12. The device of claim 11, characterized in that the processing process records the obtained result data in a log file;
the device further comprises:
a monitoring module, configured to monitor the processing process and, when the processing process is found to be abnormal, to write to the database, for each data identification corresponding to the processing process, the result data carrying that data identification that the processing process last recorded in the log file.
CN201410053223.4A 2014-02-17 2014-02-17 A kind of method and device of data processing Active CN104850556B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410053223.4A CN104850556B (en) 2014-02-17 2014-02-17 A kind of method and device of data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410053223.4A CN104850556B (en) 2014-02-17 2014-02-17 A kind of method and device of data processing

Publications (2)

Publication Number Publication Date
CN104850556A true CN104850556A (en) 2015-08-19
CN104850556B CN104850556B (en) 2018-06-29

Family

ID=53850203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410053223.4A Active CN104850556B (en) 2014-02-17 2014-02-17 A kind of method and device of data processing

Country Status (1)

Country Link
CN (1) CN104850556B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100174863A1 (en) * 2007-11-30 2010-07-08 Yahoo! Inc. System for providing scalable in-memory caching for a distributed database
CN102075670A (en) * 2009-11-24 2011-05-25 新奥特(北京)视频技术有限公司 Log recording method and device for broadcast machine
CN102810050A (en) * 2011-05-31 2012-12-05 深圳市金蝶友商电子商务服务有限公司 Log data writing method and log system
CN102609337A (en) * 2012-01-19 2012-07-25 北京神州数码思特奇信息技术股份有限公司 Rapid data recovery method for memory database

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203531A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 A kind of document handling method and device
CN107784021A (en) * 2016-08-31 2018-03-09 北京国双科技有限公司 The method, apparatus and system that control data is deleted
CN110460902A (en) * 2018-05-08 2019-11-15 腾讯科技(深圳)有限公司 Playing method and device, storage medium, the electronic device of media information
CN109709587A (en) * 2018-12-27 2019-05-03 上海司南卫星导航技术股份有限公司 Multiple affair processing method and its circuit
CN113377792A (en) * 2021-06-10 2021-09-10 上海微盟企业发展有限公司 Data write-back method and device, electronic equipment and storage medium
CN113177032A (en) * 2021-06-29 2021-07-27 南京云联数科科技有限公司 Database-based data sharing method and system
CN113780017A (en) * 2021-09-03 2021-12-10 珠海格力电器股份有限公司 Near field communication triggering method and device, electronic equipment and storage medium
CN113780017B (en) * 2021-09-03 2024-02-09 珠海格力电器股份有限公司 Near field communication triggering method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104850556B (en) 2018-06-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant