CN104317958A - Method and system for processing data in real time - Google Patents

Method and system for processing data in real time

Info

Publication number
CN104317958A
Authority
CN
China
Prior art keywords
data
time threshold
persistence
time
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410645385.7A
Other languages
Chinese (zh)
Other versions
CN104317958B (en)
Inventor
郭涛
王鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201410645385.7A
Publication of CN104317958A
Application granted
Publication of CN104317958B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a system for processing data in real time. The method comprises receiving data in real time and putting the data into a queue, cyclically reading the data in the queue and putting the read data into a cache, determining whether the time spent reading data from the queue exceeds a preset time threshold, and if so, aggregating the data in the cache and persisting the aggregated data. According to the method and the system, the time threshold is preset, and when the queue data is processed the loop that processes the queue is exited at a fixed time interval. Regular persistence can therefore be guaranteed, data with the same dimensions can be aggregated into as few records as possible, the competing demands of persistence pressure, persistence period and aggregation effect can be balanced, and the various negative effects caused by large data volumes or by mismatched rates of data entering and leaving the queue can be avoided.

Description

A real-time data processing method and system
Technical field
The present invention relates to the technical field of data persistence, and in particular to a real-time data processing method and system.
Background
In a real-time data processing system, on the one hand, a large amount of data with dimensions enters a queue according to fixed time windows. On the other hand, when the queued data is processed, it is desirable to quickly aggregate data that has the same dimensions within the same time window into as few records as possible. When there are very many dimensions, the number of records after aggregation is also very large. Considering the pressure that persisting the aggregated data places on the server, the amount of data in each persistence operation should also be kept as small as possible. In other words, before all the data of a time window has been placed in the cache, the data already in the cache needs to be aggregated and persisted first.
Existing real-time data processing methods work as follows: data taken from the queue is placed in a cache to await aggregation and persistence. If the queue is empty, the data in the cache is aggregated and persisted; if the queue is not empty, data is only taken from the queue and placed in the cache. When data is taken from the queue faster than it enters the queue, the prior art may aggregate and persist immediately after every single item is taken out, so persistence becomes extremely frequent. When data is taken from the queue more slowly than it enters the queue, a large backlog may build up in the queue. If the queue is never fully drained, the cache is never aggregated and persisted, and the processing results are no longer real-time. If the queue is finally emptied after a long backlog, the volume of aggregated data is usually large as well, and storing that much data at once causes a momentary spike in pressure on the server. Neither situation is desirable. What is desired is that data with the same dimensions in the same time window be aggregated into as few records as possible, that it be persisted as soon as possible, and that each storage operation involve as little data as possible so as to relieve the pressure on the server.
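The two failure modes of the prior-art scheme can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `prior_art_step` and the `persist` callback are hypothetical, and "aggregate and persist" is reduced to recording the batch that would be written.

```python
import queue


def prior_art_step(q, cache, persist):
    """One iteration of the queue-empty-triggered scheme (hypothetical helper)."""
    try:
        cache.append(q.get_nowait())  # take one item if one is available
    except queue.Empty:
        pass
    if q.empty() and cache:
        persist(list(cache))  # stands in for "aggregate, then persist"
        cache.clear()


# Case 1: the consumer keeps up, so every item triggers its own persistence.
q, cache, batches = queue.Queue(), [], []
for item in range(3):
    q.put(item)
    prior_art_step(q, cache, batches.append)
fast_consumer_batches = list(batches)

# Case 2: a backlog builds up, so nothing is persisted until the queue drains.
q, cache, batches = queue.Queue(), [], []
for item in range(5):
    q.put(item)
for _ in range(5):
    prior_art_step(q, cache, batches.append)
slow_consumer_batches = list(batches)
```

In the first case three single-item batches are written; in the second, one batch holding the entire backlog is written only after the queue empties.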
Summary of the invention
In view of this, the technical problem to be solved by the present invention is to provide a real-time data processing method that, by presetting a time threshold, persists data at fixed time intervals while processing queue data.
A real-time data processing method comprises: receiving data in real time and putting the data into a queue; cyclically reading the data in the queue and putting the read data into a cache; and determining whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data.
According to one embodiment of the present invention, further, determining whether the time spent reading data from the queue exceeds the preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data comprises: recording the current time at which cyclic reading of data from the queue starts as a first time T1; each time an item of data is taken from the queue and put into the cache, determining whether the time difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, jumping out of the loop that reads data from the queue, aggregating the data in the cache, and persisting the aggregated data; if the time threshold is not exceeded, continuing the loop to read the next item of data from the queue.
According to one embodiment of the present invention, further, the method also comprises: setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 allowed between persistence operations, wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 allowed between persistence operations comprises: setting the time threshold to a first time threshold and to a second time threshold respectively and performing the persistence processing of the data, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; obtaining a first data persistence result when the time threshold is the first time threshold and a second data persistence result when the time threshold is the second time threshold, and determining whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy, each data persistence result comprising the degree of aggregation and the number of rows stored; if so, assigning the value of the second time threshold to the first time threshold, setting the second time threshold to half of the new first time threshold, performing the persistence processing of the data and checking the data persistence results, and repeating the setting of the first and second time thresholds and the persistence processing in this way until it is determined that the first and second data persistence results do not both satisfy the preset persistence detection policy, then jumping out of that loop and taking the first time threshold at that point as the time threshold; if not, determining that the time threshold is T2.
According to one embodiment of the present invention, further, the data in the cache is aggregated by dimension: data having the same dimensions is aggregated into a single record, and the aggregated data is stored in a database or a file.
Another technical problem to be solved by the present invention is to provide a real-time data processing system that, by presetting a time threshold, persists data at fixed time intervals while processing queue data.
A real-time data processing system comprises: a data receiving unit for receiving data in real time and putting the data into a queue; a data caching unit for cyclically reading the data in the queue and putting the read data into a cache; and a persistence unit for determining whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data.
According to one embodiment of the present invention, further, the data caching unit comprises a data read recording submodule for recording the current time at which cyclic reading of data from the queue starts as a first time T1. The persistence unit comprises a data aggregation submodule for determining, each time the data caching unit takes an item of data from the queue and puts it into the cache, whether the time difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, the loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted; if the time threshold is not exceeded, the data caching unit continues the loop to read the next item of data from the queue.
According to one embodiment of the present invention, further, the system comprises a time threshold setting unit for setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 allowed between persistence operations, wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, the time threshold setting unit comprises: a test threshold setting module for setting the time threshold to a first time threshold and to a second time threshold respectively and performing the persistence processing of the data, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; and a persistence test module for obtaining a first data persistence result when the time threshold is the first time threshold and a second data persistence result when the time threshold is the second time threshold, and determining whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy. If so, the test threshold setting module assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold, the persistence processing of the data is performed and the data persistence results are checked, and the setting of the first and second time thresholds and the persistence processing are repeated in this way until it is determined that the first and second data persistence results do not both satisfy the preset persistence detection policy; that loop is then exited and the first time threshold at that point is taken as the time threshold. If not, the time threshold is determined to be T2.
According to one embodiment of the present invention, further, the persistence unit comprises a persistence processing submodule for aggregating the data in the cache by dimension, aggregating data having the same dimensions into a single record, and storing the aggregated data in a database or a file.
In the real-time data processing method and system of the present invention, a time threshold is preset, and when queue data is processed the loop that processes the queue is exited at a fixed time interval. This guarantees regular persistence and also ensures that data with the same dimensions is aggregated into as few records as possible. It balances the competing demands of persistence pressure, persistence period and aggregation effect, and avoids the adverse consequences caused by excessive data volume or by mismatched rates of data entering and leaving the queue.
The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be obvious to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and thereby design various embodiments, with various modifications, suited to particular uses.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of the real-time data processing method according to the present invention;
Fig. 2 is a schematic diagram of one embodiment of the real-time data processing system according to the present invention.
Detailed description of the embodiments
The present invention is described more fully below with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. The technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of one embodiment of the real-time data processing method according to the present invention. As shown in Fig. 1:
Step 101: receive data in real time and put the data into a queue.
Step 102: cyclically read the data in the queue and put the read data into a cache.
Step 103: determine whether the time spent reading data from the queue exceeds a preset time threshold.
Step 104: if so, aggregate the data in the cache and persist the aggregated data.
In the real-time data processing method of the present invention, a time threshold is preset, and when queue data is processed the loop that processes the queue is exited at a fixed time interval. This guarantees that the data cached during each fixed time interval is aggregated once and persisted. When aggregation and storage are performed, whether the data in the queue has been fully processed is no longer considered. As long as a reasonable time threshold is set, regular persistence is guaranteed and data with the same dimensions is aggregated into as few records as possible.
In one embodiment, a single thread or multiple threads can be started to perform data persistence. To ensure that data is processed in the same order in which it was received, the received data is put into a queue. The current time at which the thread or threads start cyclically reading data from the queue is recorded as the first time T1; the data in the queue is then read and placed in a cache collection.
Each time an item of data is taken from the queue and put into the cache, it is determined whether the time difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the current loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the loop continues and the next item of data is read from the queue.
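The read loop described above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `drain_for`, the injectable `clock` parameter and the short polling timeout `poll_s` are assumptions introduced here, and the sketch assumes that an empty queue does not end the batch early.

```python
import queue
import time
from typing import Any, Callable, List


def drain_for(q: "queue.Queue", threshold_s: float,
              clock: Callable[[], float] = time.monotonic,
              poll_s: float = 0.01) -> List[Any]:
    """Read items from q into a cache until more than threshold_s has passed since T1."""
    cache: List[Any] = []
    t1 = clock()  # first time T1, recorded when the read loop starts
    while True:
        try:
            # Assumption: wait briefly instead of blocking forever on an empty queue.
            cache.append(q.get(timeout=poll_s))
        except queue.Empty:
            pass  # nothing arrived; still check the elapsed time below
        if clock() - t1 > threshold_s:
            break  # threshold exceeded: leave the loop so the cache can be aggregated
    return cache
```

The caller would aggregate and persist the returned cache and then call `drain_for` again, which records a new T1. Passing a fake `clock` makes the loop deterministic in tests.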
In one embodiment, the method also comprises setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 allowed between persistence operations. Given that the required minimum update period of the aggregation result is T2 (that is, the data must be updated at least once every T2) and that the minimum interval allowed between persistence operations is T3, the period of aggregation and persistence must be less than or equal to T2 and greater than or equal to T3.
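The constraint T3 ≤ threshold ≤ T2 can be checked with a small helper. The sketch below is illustrative only; the function name `check_threshold` and the choice to raise `ValueError` are assumptions, not part of the patent.

```python
def check_threshold(threshold_s: float, t2_s: float, t3_s: float) -> float:
    """Return threshold_s if T3 <= threshold <= T2, otherwise raise ValueError."""
    if t3_s > t2_s:
        # No threshold can satisfy both bounds in this case.
        raise ValueError("no valid threshold: T3 is greater than T2")
    if not t3_s <= threshold_s <= t2_s:
        raise ValueError(f"threshold {threshold_s} is outside [{t3_s}, {t2_s}]")
    return threshold_s
```

For example, with T2 = 60 seconds and T3 = 5 seconds, a threshold of 30 seconds is accepted and a threshold of 90 seconds is rejected.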
In one embodiment, a separate process or task can be started to automatically determine the time threshold to be set. The process or task sets the time threshold to a first time threshold and to a second time threshold respectively and performs the persistence processing of the data. The value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The data persistence results are obtained for the time threshold equal to the first time threshold and to the second time threshold respectively, and it is determined whether both data persistence results satisfy the preset persistence detection policy. A data persistence result comprises the degree of aggregation and the number of rows stored.
If so, the value of the second time threshold is assigned to the first time threshold, the second time threshold is set to half of the new first time threshold, the persistence processing of the data is performed and the data persistence results are checked. The setting of the first and second time thresholds and the persistence processing are repeated in this way until it is determined that the two data persistence results do not both satisfy the preset persistence detection policy; the loop is then exited and the time threshold is determined to be the first time threshold at that point. If not, the time threshold is determined to be T2.
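The halving procedure can be sketched as a short search loop. The sketch makes several assumptions that are not in the patent: `run_trial` is a hypothetical callback that runs the persistence process at a given threshold and returns a (degree of aggregation, rows stored) pair, `both_ok` is a hypothetical persistence detection policy applied to the two results, and the loop additionally stops before the second threshold would drop below T3, which follows from the requirement that the threshold be at least T3.

```python
from typing import Callable, Tuple

# Hypothetical representation of a data persistence result:
# (degree of aggregation, number of rows stored).
Result = Tuple[float, int]


def choose_threshold(t2: float, t3: float,
                     run_trial: Callable[[float], Result],
                     both_ok: Callable[[Result, Result], bool]) -> float:
    """Halve the threshold, starting from T2, while both trial results pass the policy."""
    first = t2
    second = first / 2
    # Assumption: never test a second threshold smaller than T3.
    while second >= t3 and both_ok(run_trial(first), run_trial(second)):
        first, second = second, second / 2
    # If the very first comparison fails, this is T2, as the text specifies.
    return first
```

Each iteration reruns the trial for the first threshold even though it was measured as the second threshold in the previous round; a real test harness would probably reuse that result.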
In one embodiment, two tests are run with the threshold set to T2 and to half of T2 respectively, and the final data persistence results in the two cases are compared in terms of the number of rows stored and the degree of aggregation (whether data with the same dimensions has been aggregated into one record). If persistence pressure is especially high, what matters more is whether the number of rows persisted falls by a multiple when the threshold is half of T2; if the aggregation effect is the greater concern, it is necessary to analyze whether the degree of aggregation is still satisfactory when the threshold is half of T2.
The larger the threshold, the closer the aggregation is to ideal and the greater the storage pressure; the smaller the threshold, the poorer the aggregation effect and the smaller the storage pressure. If the aggregation effect and the number of rows persisted differ little between thresholds of T2 and one half of T2, the threshold is set to one quarter of T2 and the result is compared with the persistence result for one half of T2, and so on. The principle for determining the time threshold, that is, the persistence detection policy, may be: reduce the pressure that persistence places on the server as far as possible while ensuring a good aggregation effect; and, because this is real-time data processing, results should also be produced as quickly as possible.
The program or process used for setting the time threshold is started and the various parameters are set, including T1, T2 and the decision policy for persistence results. After extensive testing according to the method above, the desired threshold can be found automatically.
In one embodiment, the data in the cache is aggregated by dimension: data having the same dimensions is aggregated into a single record, and the aggregated data is stored in a database or a file. Multiple dimensions can be set according to the specific application scenario to form a dimension system that has access and filtering capabilities and comprises a complete dimension-system code, keywords and related expressions. For example, the time dimension comprises levels such as year, quarter, month and day, and the region dimension comprises levels such as country, province and city. The data in the cache is aggregated by dimension and the aggregated data is persisted; existing techniques can be used for storage.
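A minimal Python sketch of aggregation by dimension follows. The patent does not say how records with the same dimensions are merged, so summing a numeric field is an assumption made here, as are the function name `aggregate`, the record layout (dictionaries) and the field names `day`, `city` and `count`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def aggregate(cache: Iterable[Dict], dimensions: Sequence[str],
              metric: str = "count") -> List[Dict]:
    """Merge records that share the same dimension values into one record each."""
    totals = defaultdict(int)
    for record in cache:
        key = tuple(record[d] for d in dimensions)  # the dimension values identify a group
        totals[key] += record[metric]  # assumption: merging means summing the metric
    return [dict(zip(dimensions, key), **{metric: total})
            for key, total in totals.items()]


cached = [
    {"day": "2014-11-12", "city": "Beijing", "count": 1},
    {"day": "2014-11-12", "city": "Beijing", "count": 2},
    {"day": "2014-11-12", "city": "Shanghai", "count": 5},
]
rows = aggregate(cached, ("day", "city"))
```

Here three cached records collapse into two rows, one per distinct (day, city) pair, which is what would then be written to the database or file.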
The real-time data processing method of the present invention guarantees that the data cached during each fixed time interval is aggregated once and persisted. When aggregation and storage are performed, whether the data in the queue has been fully processed is no longer considered, and a reasonable time threshold is set by automatically running data persistence tests. Regular persistence is thus guaranteed, and data with the same dimensions is aggregated into as few records as possible.
As shown in Fig. 2, the present invention provides a real-time data processing system comprising a data receiving unit 21, a data caching unit 22 and a persistence unit 23. The data receiving unit 21 receives data in real time and puts the data into a queue. The data caching unit 22 cyclically reads the data in the queue and puts the read data into a cache. The persistence unit 23 determines whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregates the data in the cache and persists the aggregated data.
In one embodiment, the data read recording submodule of the data caching unit 22 records the current time at which cyclic reading of data from the queue starts as the first time T1. Each time the data caching unit takes an item of data from the queue and puts it into the cache, the data aggregation submodule of the persistence unit 23 determines whether the time difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the current loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the data caching unit 22 continues the loop and reads the next item of data from the queue.
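One possible way to arrange the three units in code is sketched below. The class and method names are hypothetical, the records are assumed to be hashable dimension keys, aggregation is reduced to counting identical keys, and `sink` stands in for whatever database or file writer is used; none of these details come from the patent.

```python
import queue
import time
from collections import Counter


class DataReceivingUnit:
    """Unit 21: puts incoming records into the queue."""
    def __init__(self, q):
        self.q = q

    def receive(self, record):
        self.q.put(record)


class DataCachingUnit:
    """Unit 22: records T1 and moves records from the queue into the cache."""
    def __init__(self, q, clock=time.monotonic):
        self.q, self.clock, self.cache, self.t1 = q, clock, [], None

    def start_cycle(self):
        self.cache = []
        self.t1 = self.clock()  # first time T1

    def read_one(self, poll_s=0.01):
        try:
            self.cache.append(self.q.get(timeout=poll_s))
        except queue.Empty:
            pass  # assumption: an empty queue does not end the cycle


class PersistenceUnit:
    """Unit 23: checks the elapsed time, then aggregates and persists."""
    def __init__(self, threshold_s, sink):
        self.threshold_s, self.sink = threshold_s, sink

    def maybe_persist(self, caching):
        if caching.clock() - caching.t1 <= self.threshold_s:
            return False
        self.sink(dict(Counter(caching.cache)))  # count records per key, then persist
        return True


def run_cycle(caching, persistence):
    """Run one read loop that ends when the persistence unit fires."""
    caching.start_cycle()
    while True:
        caching.read_one()
        if persistence.maybe_persist(caching):
            return
```

A worker would call `run_cycle` repeatedly, so that each cycle records a fresh T1 and produces one aggregated write.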
In one embodiment, the time threshold setting unit 24 sets the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 allowed between persistence operations. The time threshold is less than or equal to T2 and greater than or equal to T3.
In one embodiment, the time threshold setting unit 24 comprises a test threshold setting module 241 and a persistence test module 242. The test threshold setting module 241 sets the time threshold to a first time threshold and to a second time threshold respectively and performs the persistence processing of the data; the value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The persistence test module 242 obtains the data persistence results for the time threshold equal to the first time threshold and to the second time threshold respectively, and determines whether both data persistence results satisfy the preset persistence detection policy. If so, the test threshold setting module 241 assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold, the persistence processing of the data is performed and the data persistence results are checked. The setting of the first and second time thresholds and the persistence processing are repeated in this way until it is determined that the two data persistence results do not both satisfy the preset persistence detection policy; the loop is then exited and the time threshold is determined to be the first time threshold at that point. If not, the time threshold is determined to be T2.
In one embodiment, the persistence processing submodule of the persistence unit 23 aggregates the data in the cache by dimension, aggregates data having the same dimensions into a single record, and stores the aggregated data in a database or a file.
In the real-time data processing method and system of the present invention, a time threshold is preset, and when queue data is processed the loop that processes the queue is exited at a fixed time interval. This guarantees regular persistence and also ensures that data with the same dimensions is aggregated into as few records as possible. It balances the competing demands of persistence pressure, persistence period and aggregation effect, and avoids the adverse consequences caused by excessive data volume or by mismatched rates of data entering and leaving the queue.
The method and system of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware and firmware. The above order of the method steps is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments the present invention may also be embodied as programs recorded on a recording medium, these programs comprising machine-readable instructions for implementing the method according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.

Claims (10)

1. A real-time data processing method, characterized by comprising:
receiving data in real time and putting the data into a queue;
cyclically reading the data in the queue and putting the read data into a cache;
determining whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data.
2. the method for claim 1, is characterized in that, the time that data are read in described judgement from described queue whether exceed default time threshold if, then the data in described buffer memory are carried out being polymerized and the data persistence after polymerization are comprised:
The current time starting data in the described queue of circulation reading is set to very first time T1 and record;
From described queue, data are taken out and when putting into described buffer memory each, judge whether the mistiming between current time and T1 exceedes described time threshold, if exceed described time threshold, then jump out the circulation of reading data in described queue, data in buffer memory are polymerized, and by the data persistence after polymerization; If do not exceed described time threshold, then continue circulation and carry out the operation of reading next data from described queue.
3. The method of claim 2, characterized in that the method further comprises:
setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 allowed between persistence operations, wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
4. The method of claim 3, characterized in that setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum interval T3 at which persistence is allowed comprises:
setting the time threshold to a first time threshold and to a second time threshold respectively, and performing the persistence processing of the data; wherein the value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold;
obtaining a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and judging whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy; the data persistence result comprises: the degree of aggregation and the number of rows stored;
if so, assigning the value of the second time threshold to the first time threshold, setting the value of the second time threshold to half of the new first time threshold, performing the persistence processing of the data and checking the data persistence results, and repeating in this way the setting of the first and second time thresholds and the persistence processing of the data until it is judged that the first data persistence result and the second data persistence result do not both satisfy the preset persistence detection policy, whereupon this loop is exited and the time threshold is determined to be the first time threshold at that moment;
if not, determining that the time threshold is T2.
5. the method for claim 1, is characterized in that, described data in described buffer memory are carried out being polymerized and by polymerization after data persistence comprise:
Data in buffer memory are polymerized according to different dimensions, the data aggregate with identical dimensional are become data, and the data after being polymerized are stored in database or file.
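By way of illustration only, the read loop of claims 1 and 2 and the per-dimension aggregation of claim 5 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the injectable `clock` parameter, the short `get` timeout for an empty queue, the use of summation as the aggregate, and the `persist` callback are all hypothetical choices that do not appear in the claims.

```python
import queue
import time
from collections import defaultdict


def drain_until_threshold(q, time_threshold, clock=time.monotonic):
    """Move items from the queue into a cache until the threshold elapses."""
    cache = []
    t1 = clock()  # T1: moment the read loop starts (claim 2)
    while True:
        try:
            # Assumption: a short timeout so that an empty queue cannot block
            # the loop indefinitely; the claims do not address this case.
            cache.append(q.get(timeout=0.01))
        except queue.Empty:
            pass
        if clock() - t1 > time_threshold:
            break  # threshold exceeded: leave the read loop
    return cache


def aggregate(cache, dimensions, metric):
    """Merge records sharing the same dimension values into one row each.

    Assumption: the aggregate is a sum over a single numeric metric.
    """
    totals = defaultdict(int)
    for record in cache:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[metric]
    rows = []
    for key, total in totals.items():
        row = dict(zip(dimensions, key))
        row[metric] = total
        rows.append(row)
    return rows


def run_once(q, time_threshold, dimensions, metric, persist):
    """One cycle: drain the queue, aggregate, then hand the rows to `persist`."""
    rows = aggregate(drain_until_threshold(q, time_threshold), dimensions, metric)
    persist(rows)  # hypothetical callback, e.g. a database insert or file write
    return rows
```

A caller would typically invoke `run_once` repeatedly from a consumer thread while a separate receiver thread keeps putting incoming records into the same `queue.Queue`. Passing the clock as a parameter is only a convenience for testing the loop without waiting for real time to pass.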
6. A real-time data processing system, characterized by comprising:
a data receiving unit, configured to receive data in real time and put the data into a queue;
a data caching unit, configured to cyclically read the data in the queue and put the read data into a cache;
a persistence unit, configured to judge whether the time spent reading data from the queue exceeds a preset time threshold, and if so, to aggregate the data in the cache and persist the aggregated data.
7. The system of claim 6, characterized in that:
the data caching unit comprises a data reading and recording submodule, configured to record, as a first time T1, the current time at which the cyclic reading of data from the queue starts;
the persistence unit comprises a data aggregation submodule, configured to judge, each time the data caching unit takes data out of the queue and puts it into the cache, whether the time difference between the current time and T1 exceeds the time threshold; if it exceeds the time threshold, to exit the loop that reads data from the queue, aggregate the data in the cache, and persist the aggregated data; if it does not exceed the time threshold, the data caching unit continues the loop to read the next data from the queue.
8. The system of claim 7, characterized by further comprising:
a time threshold setting unit, configured to set the time threshold according to a set minimum update period T2 of the aggregation result and a minimum interval T3 at which persistence is allowed; wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
9. The system of claim 8, characterized in that:
the time threshold setting unit comprises:
a trial threshold setting module, configured to set the time threshold to a first time threshold and to a second time threshold respectively, and to perform the persistence processing of the data; wherein the value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold;
a persistence trial module, configured to obtain a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and to judge whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy; if so, the trial threshold setting module assigns the value of the second time threshold to the first time threshold and sets the value of the second time threshold to half of the new first time threshold, the persistence processing of the data is performed and the data persistence results are checked, and the setting of the first and second time thresholds and the persistence processing of the data are repeated in this way until it is judged that the first data persistence result and the second data persistence result do not both satisfy the preset persistence detection policy, whereupon this loop is exited and the time threshold is determined to be the first time threshold at that moment; if not, the time threshold is determined to be T2.
10. The system of any one of claims 6 to 9, characterized in that:
the persistence unit comprises a persistence processing submodule, configured to aggregate the data in the cache according to different dimensions, merge the data having the same dimensions into a single data record, and store the aggregated data in a database or a file.
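By way of illustration only, the halving procedure that claims 4 and 9 describe for choosing the time threshold between T3 and T2 might be sketched as follows. The names `run_trial` (runs the persistence processing with a given threshold and returns its result, such as the degree of aggregation and the number of rows stored) and `meets_policy` (applies the preset persistence detection policy) are hypothetical callbacks. The explicit lower-bound check against T3 is an assumption added here so that the result respects the range stated in claims 3 and 8; the claims do not spell out where that check occurs.

```python
def choose_time_threshold(t2, t3, run_trial, meets_policy):
    """Halve the candidate threshold while both trial results pass the policy."""
    first = t2          # first time threshold starts at T2
    second = first / 2  # second time threshold is half of the first
    # Assumption: stop before the halved candidate drops below T3.
    while second >= t3:
        first_ok = meets_policy(run_trial(first))
        second_ok = meets_policy(run_trial(second))
        if not (first_ok and second_ok):
            break  # not both satisfied: keep the current first threshold
        first = second
        second = first / 2
    return first
```

If the very first pair of trials does not pass, the function returns T2, which matches the final branch of the claim. Re-running the trial for the first threshold on every pass mirrors the wording of the claim; an implementation could reuse the previous result instead.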
CN201410645385.7A 2014-11-12 2014-11-12 A kind of real-time data processing method and system Active CN104317958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410645385.7A CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410645385.7A CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Publications (2)

Publication Number Publication Date
CN104317958A (en) 2015-01-28
CN104317958B (en) 2018-01-16

Family

ID=52373190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410645385.7A Active CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Country Status (1)

Country Link
CN (1) CN104317958B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894119A (en) * 2008-10-20 2010-11-24 亚马逊技术股份有限公司 Mass data storage system for monitoring
CN102291243A (en) * 2011-09-09 2011-12-21 中兴通讯股份有限公司 Service processing server, system and method
CN102760101A (en) * 2012-05-22 2012-10-31 中国科学院计算技术研究所 SSD-based (Solid State Disk) cache management method and system
CN103020175A (en) * 2012-11-28 2013-04-03 深圳市华为技术软件有限公司 Method and device for acquiring aggregated data

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294444B (en) * 2015-05-27 2020-02-18 阿里巴巴集团控股有限公司 Data processing method and equipment
CN106294444A (en) * 2015-05-27 2017-01-04 阿里巴巴集团控股有限公司 A kind of data processing method and equipment
WO2017092444A1 (en) * 2015-12-02 2017-06-08 中兴通讯股份有限公司 Log data mining method and system based on hadoop
WO2017107793A1 (en) * 2015-12-22 2017-06-29 阿里巴巴集团控股有限公司 Data processing method and device
US11055272B2 (en) 2015-12-22 2021-07-06 Alibaba Group Holding Limited Data processing method and apparatus
CN108063746A (en) * 2016-11-08 2018-05-22 北京国双科技有限公司 Processing method, client, server and the system of data
CN108063746B (en) * 2016-11-08 2020-05-15 北京国双科技有限公司 Data processing method, client, server and system
CN108268523A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 Database aggregation processing method and device
CN108664322A (en) * 2017-03-29 2018-10-16 广东神马搜索科技有限公司 Data processing method and system
CN109117432A (en) * 2017-06-22 2019-01-01 北京京东尚科信息技术有限公司 A kind of method and device obtaining data
CN107589907A (en) * 2017-08-10 2018-01-16 上海壹账通金融科技有限公司 Data processing method, electronic equipment and computer-readable recording medium
CN107589907B (en) * 2017-08-10 2019-12-13 深圳壹账通智能科技有限公司 Data processing method, electronic device and computer readable storage medium
CN110955654A (en) * 2018-09-26 2020-04-03 北京国双科技有限公司 Multi-dimensional index calculation method and device
CN110955654B (en) * 2018-09-26 2023-10-31 北京国双科技有限公司 Multi-dimensional index calculation method and device
CN109508244B (en) * 2018-10-18 2021-03-12 北京新唐思创教育科技有限公司 Data processing method and computer readable medium
CN109508244A (en) * 2018-10-18 2019-03-22 北京新唐思创教育科技有限公司 Data processing method and computer-readable medium
CN110297602A (en) * 2019-06-14 2019-10-01 北京奇艺世纪科技有限公司 A kind of processing method and processing device of real time data
CN110297602B (en) * 2019-06-14 2023-03-07 北京奇艺世纪科技有限公司 Real-time data processing method and device
CN111026746A (en) * 2019-10-16 2020-04-17 中国平安财产保险股份有限公司 Method and device for multi-channel data calling, computer equipment and storage medium
CN111026746B (en) * 2019-10-16 2023-07-07 中国平安财产保险股份有限公司 Method, device, computer equipment and storage medium for calling multi-channel data
CN110837511A (en) * 2019-11-15 2020-02-25 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN110837511B (en) * 2019-11-15 2022-08-23 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN113312434A (en) * 2021-07-29 2021-08-27 北京快立方科技有限公司 Pre-polymerization treatment method for massive structured data

Also Published As

Publication number Publication date
CN104317958B (en) 2018-01-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Patentee before: Beijing Guoshuang Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and system for processing data in real time

Effective date of registration: 20190531

Granted publication date: 20180116

Pledgee: Shenzhen Black Horse World Investment Consulting Co., Ltd.

Pledgor: Beijing Guoshuang Technology Co.,Ltd.

Registration number: 2019990000503