Summary of the invention
In view of this, the technical problem to be solved by the present invention is to provide a real-time data processing method in which a time threshold is preset so that, while queue data is being processed, data is persisted at fixed time intervals.
A real-time data processing method, comprising: receiving data in real time and putting the data into a queue; cyclically reading the data in the queue and putting the read data into a cache; and judging whether the time spent reading data from the queue exceeds a preset time threshold, and if so, aggregating the data in the cache and persisting the aggregated data.
According to one embodiment of the present invention, further, judging whether the time spent reading data from the queue exceeds the preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data comprises: setting the current time at which cyclic reading of the data in the queue starts as a first time T1 and recording it; each time data is taken out of the queue and put into the cache, judging whether the time difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, exiting the loop that reads data from the queue, aggregating the data in the cache, and persisting the aggregated data; if the time threshold is not exceeded, continuing the loop and reading the next data item from the queue.
According to one embodiment of the present invention, further, the method also comprises: setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 at which persistence is allowed; wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 at which persistence is allowed comprises: setting the time threshold to a first time threshold and to a second time threshold respectively and performing the persistence processing of the data, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; obtaining a first data persistence result when the time threshold is the first time threshold and a second data persistence result when the time threshold is the second time threshold, and judging whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection strategy, wherein a data persistence result comprises the degree of aggregation and the number of rows stored; if so, assigning the value of the second time threshold to the first time threshold, setting the second time threshold to half of the new first time threshold, performing the persistence processing of the data and checking the data persistence results, and repeating this cycle of setting the two thresholds and performing the persistence processing until the first data persistence result and the second data persistence result are judged not to both satisfy the preset persistence detection strategy, at which point the cycle is exited and the time threshold is determined to be the first time threshold at that moment; if not, determining that the time threshold is T2.
According to one embodiment of the present invention, further, the data in the cache is aggregated according to different dimensions, data having the same dimension is aggregated into a single record, and the aggregated data is stored in a database or a file.
Another technical problem to be solved by the present invention is to provide a real-time data processing system in which a time threshold is preset so that, while queue data is being processed, data is persisted at fixed time intervals.
A real-time data processing system, comprising: a data receiving unit, configured to receive data in real time and put the data into a queue; a data caching unit, configured to cyclically read the data in the queue and put the read data into a cache; and a persistence unit, configured to judge whether the time spent reading data from the queue exceeds a preset time threshold, and if so, to aggregate the data in the cache and persist the aggregated data.
According to one embodiment of the present invention, further, the data caching unit comprises a data read recording submodule, configured to set the current time at which cyclic reading of the data in the queue starts as a first time T1 and record it; the persistence unit comprises a data aggregation submodule, configured to judge, each time the data caching unit takes data out of the queue and puts it into the cache, whether the time difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, the loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted; if the time threshold is not exceeded, the data caching unit continues the loop and reads the next data item from the queue.
According to one embodiment of the present invention, further, the system comprises a time threshold setting unit, configured to set the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 at which persistence is allowed; wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, the time threshold setting unit comprises: a test threshold setting module, configured to set the time threshold to a first time threshold and to a second time threshold respectively and to perform the persistence processing of the data, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; and a persistence test module, configured to obtain a first data persistence result when the time threshold is the first time threshold and a second data persistence result when the time threshold is the second time threshold, and to judge whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection strategy; if so, the test threshold setting module assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold, the persistence processing of the data is performed and the data persistence results are checked, and this cycle of setting the two thresholds and performing the persistence processing is repeated until the first data persistence result and the second data persistence result are judged not to both satisfy the preset persistence detection strategy, at which point the cycle is exited and the time threshold is determined to be the first time threshold at that moment; if not, the time threshold is determined to be T2.
According to one embodiment of the present invention, further, the persistence unit comprises a persistence processing submodule, configured to aggregate the data in the cache according to different dimensions, aggregate data having the same dimension into a single record, and store the aggregated data in a database or a file.
With the real-time data processing method and system of the present invention, a time threshold is preset and, while queue data is being processed, the queue-processing loop is exited at fixed time intervals. This guarantees regular persistence and also ensures that data of the same dimension is aggregated into as few records as possible. The conflicting demands of persistence pressure, persistence period, and aggregation effect can thus be balanced, avoiding the adverse consequences of an excessive data volume or of a mismatch between the rates at which data enters and leaves the queue.
The description of the present invention is given for the purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and thereby design various embodiments, with various modifications, suited to particular uses.
Embodiment
The present invention is described more fully below with reference to the accompanying drawings, in which exemplary embodiments of the invention are illustrated. The technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of one embodiment of the real-time data processing method according to the present invention. As shown in Fig. 1:
Step 101: receive data in real time and put the data into a queue.
Step 102: cyclically read the data in the queue and put the read data into a cache.
Step 103: judge whether the time spent reading data from the queue exceeds a preset time threshold.
Step 104: if so, aggregate the data in the cache and persist the aggregated data.
In the real-time data processing method of the present invention, a time threshold is preset and, while queue data is being processed, the queue-processing loop is exited at fixed time intervals. This guarantees that the data cached within each fixed time interval is aggregated once and persisted. When aggregation and storage are performed, it no longer matters whether all the data in the queue has been processed. As long as a reasonable time threshold is set, regular persistence is guaranteed and data of the same dimension is aggregated into as few records as possible.
In one embodiment, a single thread or multiple threads may be started to perform the data persistence. To ensure that data is processed in the same order in which it was received, the received data is put into a queue. The time at which the single thread or the multiple threads start cyclically reading data from the queue is set as the first time T1 and recorded; the data in the queue is then read and the read data is put into a cache collection.
Each time data is taken out of the queue and put into the cache, it is judged whether the time difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the current loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the loop continues and the next data item is read from the queue.
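The read loop described above can be sketched roughly as follows. This is a minimal illustration and not the implementation of the invention: the function name `read_batch`, the injectable `clock` parameter, and the short polling wait `poll_s` used when the queue is momentarily empty are all assumptions introduced here, since the text does not say how an empty queue is handled.

```python
import queue
import time
from typing import Any, Callable, List


def read_batch(
    q: "queue.Queue[Any]",
    threshold_s: float,
    clock: Callable[[], float] = time.monotonic,
    poll_s: float = 0.01,
) -> List[Any]:
    """Read items from the queue into a cache until the time threshold is exceeded."""
    t1 = clock()  # first time T1, recorded when the read loop starts
    cache: List[Any] = []
    while True:
        try:
            # Take the next item out of the queue and put it into the cache.
            cache.append(q.get(timeout=poll_s))
        except queue.Empty:
            # Assumption: nothing arrived within poll_s; fall through to the time check.
            pass
        # After each read, compare the elapsed time since T1 with the threshold.
        if clock() - t1 > threshold_s:
            break  # exit the read loop; the caller aggregates and persists the cache
    return cache
```

The caller would then aggregate the returned cache, persist the result, and call `read_batch` again, which records a new T1 for the next interval. Whatever remains in the queue is simply left for the next interval.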
In one embodiment, the method further comprises setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 at which persistence is allowed. Given that the required minimum update period of the aggregation result is T2 (that is, the data must be updated at least once every T2) and that the minimum interval at which persistence is allowed is T3, the period of aggregation and persistence must be less than or equal to T2 and greater than or equal to T3.
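The constraint T3 ≤ time threshold ≤ T2 can be illustrated with a small helper. The function name `clamp_threshold` and the decision to pull an out-of-range candidate back to the nearest bound are assumptions made for this sketch; the text only states the range that the threshold must fall in.

```python
def clamp_threshold(candidate_s: float, t2_s: float, t3_s: float) -> float:
    """Return a time threshold that satisfies T3 <= threshold <= T2."""
    if t3_s > t2_s:
        # No valid threshold exists if persistence may not run as often as updates are required.
        raise ValueError("T3 must not exceed T2")
    # Assumption: an out-of-range candidate is moved to the nearest bound.
    return min(max(candidate_s, t3_s), t2_s)
```

For example, with T2 = 60 seconds and T3 = 5 seconds, a candidate of 100 seconds would be reduced to 60 and a candidate of 1 second would be raised to 5.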
In one embodiment, a separate process or task may be started to determine automatically the time threshold that needs to be set. Once started, the process or task sets the time threshold to a first time threshold and to a second time threshold respectively, and performs the persistence processing of the data. The value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The data persistence results obtained when the time threshold is the first time threshold and when it is the second time threshold are collected, and it is judged whether the two data persistence results both satisfy a preset persistence detection strategy. A data persistence result comprises the degree of aggregation and the number of rows stored.
If so, the value of the second time threshold is assigned to the first time threshold, and the second time threshold is set to half of the new first time threshold; the persistence processing of the data is performed and the data persistence results are checked. This cycle of setting the first and second time thresholds and performing the persistence processing is repeated until the two data persistence results are judged not to both satisfy the preset persistence detection strategy, at which point the cycle is exited and the time threshold is determined to be the first time threshold at that moment. If not, the time threshold is determined to be T2.
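The halving procedure above might be sketched as follows. The names `PersistResult`, `find_threshold`, `run_trial`, and `both_meet_policy` are hypothetical, as is the numeric scale of `aggregation_degree`; the text leaves the persistence detection strategy open, so it is passed in as a callable. The lower bound T3 in the loop condition is an added assumption taken from the requirement that the threshold be greater than or equal to T3, and it also guarantees that the loop terminates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PersistResult:
    rows_stored: int           # number of rows stored
    aggregation_degree: float  # degree of aggregation (scale is hypothetical)


def find_threshold(
    t2: float,
    t3: float,
    run_trial: Callable[[float], PersistResult],
    both_meet_policy: Callable[[PersistResult, PersistResult], bool],
) -> float:
    """Halve the candidate threshold while both trial results satisfy the policy."""
    first = t2          # first time threshold starts at T2
    second = first / 2  # second time threshold is half of the first
    # Assumption: stop before the second threshold would fall below T3.
    while second >= t3 and both_meet_policy(run_trial(first), run_trial(second)):
        first = second      # the second threshold becomes the new first threshold
        second = first / 2  # and the new second threshold is half of that
    # If the very first comparison fails, this is still T2, as the text specifies.
    return first
```

In practice `run_trial` would run the persistence processing with the given threshold for a test period and report the rows stored and the degree of aggregation actually observed.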
In one embodiment, two tests are run with the threshold set to T2 and to half of T2 respectively, and the final data persistence results in the two cases are examined by comparing the number of rows stored and the degree of aggregation (whether data of the same dimension has been aggregated into one record). If the persistence pressure is especially high, what matters more is whether the number of rows persisted decreases by a multiple when the threshold is one half of T2. If the aggregation effect is the greater concern, what needs to be analyzed is whether the degree of aggregation is still satisfactory when the threshold is one half of T2.
The larger the threshold, the better the aggregation and the greater the storage pressure; the smaller the threshold, the poorer the aggregation effect and the smaller the storage pressure. If the aggregation effect and the number of rows persisted differ little between a threshold of T2 and a threshold of one half of T2, the threshold is set to one quarter of T2 and the result is compared with the persistence result for one half of T2, and so on. The principle for determining the time threshold, that is, the persistence detection strategy, may be: reduce the pressure that persistence places on the server as far as possible while ensuring a good aggregation effect; and, because this is real-time data processing, results should of course also be produced as quickly as possible.
The program or process used for setting the time threshold is started and its parameters are set, including T1, T2, and the decision strategy for the persistence results. After extensive testing according to the test method above, the desired threshold can be found automatically.
In one embodiment, the data in the cache is aggregated according to different dimensions, data having the same dimension is aggregated into a single record, and the aggregated data is stored in a database or a file. Multiple dimensions may be defined according to the specific application scenario to form a dimension system that provides access and filtering capabilities and includes a complete set of dimension codes, keywords, and related expressions. For example, a time dimension includes levels such as year, quarter, month, and day, and a region dimension includes levels such as country, province, and city. The data in the cache is aggregated according to the different dimensions and the aggregated data is persisted; existing techniques may be used for storage.
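As a rough illustration of aggregating cached data by dimension, the sketch below groups records on a tuple of dimension values and merges each group into one record. The function name `aggregate_by_dimensions`, the dictionary record layout, and the choice of summation as the aggregation operation are assumptions; the text does not specify how records with the same dimensions are combined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_by_dimensions(
    records: Iterable[dict],
    dimensions: Tuple[str, ...],
    measure: str,
) -> List[dict]:
    """Merge records that share the same dimension values into one record each."""
    totals: Dict[tuple, float] = defaultdict(int)
    for record in records:
        # The dimension values (for example day and city) form the grouping key.
        key = tuple(record[d] for d in dimensions)
        # Assumption: the measure is combined by summation.
        totals[key] += record[measure]
    # Emit one aggregated record per distinct dimension combination.
    return [dict(zip(dimensions, key), **{measure: total}) for key, total in totals.items()]
```

With dimensions `("day", "city")` and measure `"clicks"`, two cached records for the same day and city would be persisted as a single row carrying their combined click count.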
The real-time data processing method of the present invention guarantees that the data cached within each fixed time interval is aggregated once and persisted. When aggregation and storage are performed, it no longer matters whether all the data in the queue has been processed. Because a reasonable time threshold is set by automatically running data persistence tests, regular persistence is guaranteed and data of the same dimension is aggregated into as few records as possible.
As shown in Fig. 2, the present invention provides a real-time data processing system comprising a data receiving unit 21, a data caching unit 22, and a persistence unit 23. The data receiving unit 21 receives data in real time and puts the data into a queue. The data caching unit 22 cyclically reads the data in the queue and puts the read data into a cache. The persistence unit 23 judges whether the time spent reading data from the queue exceeds a preset time threshold, and if so, aggregates the data in the cache and persists the aggregated data.
In one embodiment, the data read recording submodule of the data caching unit 22 sets the current time at which cyclic reading of the data in the queue starts as the first time T1 and records it. Each time the data caching unit 22 takes data out of the queue and puts it into the cache, the data aggregation submodule of the persistence unit 23 judges whether the time difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the current loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the data caching unit 22 continues the loop and reads the next data item from the queue.
In one embodiment, the time threshold setting unit 24 sets the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 at which persistence is allowed. The time threshold is less than or equal to T2 and greater than or equal to T3.
In one embodiment, the time threshold setting unit 24 comprises a test threshold setting module 241 and a persistence test module 242. The test threshold setting module 241 sets the time threshold to a first time threshold and to a second time threshold respectively and performs the persistence processing of the data. The value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The persistence test module 242 collects the data persistence results obtained when the time threshold is the first time threshold and when it is the second time threshold, and judges whether the two data persistence results both satisfy the preset persistence detection strategy. If so, the test threshold setting module 241 assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold; the persistence processing of the data is performed and the data persistence results are checked. This cycle of setting the first and second time thresholds and performing the persistence processing is repeated until the two data persistence results are judged not to both satisfy the preset persistence detection strategy, at which point the cycle is exited and the time threshold is determined to be the first time threshold at that moment. If not, the time threshold is determined to be T2.
In one embodiment, the persistence processing submodule of the persistence unit 23 aggregates the data in the cache according to different dimensions, aggregates data having the same dimension into a single record, and stores the aggregated data in a database or a file.
With the real-time data processing method and system of the present invention, a time threshold is preset and, while queue data is being processed, the queue-processing loop is exited at fixed time intervals. This guarantees regular persistence and also ensures that data of the same dimension is aggregated into as few records as possible. The conflicting demands of persistence pressure, persistence period, and aggregation effect can thus be balanced, avoiding the adverse consequences of an excessive data volume or of a mismatch between the rates at which data enters and leaves the queue.
The method and system of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present invention may also be implemented as programs recorded on a recording medium, the programs comprising machine-readable instructions for implementing the method according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.