CN104317958B - Real-time data processing method and system - Google Patents

Real-time data processing method and system

Info

Publication number
CN104317958B
Authority
CN
China
Prior art keywords
data
time threshold
persistence
time
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410645385.7A
Other languages
Chinese (zh)
Other versions
CN104317958A (en
Inventor
郭涛
王鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201410645385.7A
Publication of CN104317958A
Application granted
Publication of CN104317958B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a real-time data processing method and system. The method comprises: receiving data in real time and placing the data in a queue; reading the data in the queue in a loop and placing the read data in a cache; and judging whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data. With a preset time threshold, the method and system exit the queue-processing loop at fixed time intervals while processing queued data. This ensures that persistence occurs at regular intervals and that data with identical dimensions is aggregated into as few records as possible. It balances the competing demands of persistence pressure, persistence period, and aggregation effect, and avoids the adverse consequences that arise when the data volume is too large or when the enqueue and dequeue rates differ.

Description

Real-time data processing method and system
Technical Field
The present invention relates to the technical field of data persistence, and in particular to a real-time data processing method and system.
Background
In a real-time data processing system, large volumes of data carrying dimensions enter a queue according to a fixed time window. When the queue is processed, the data that share the same dimensions within the same time window should be aggregated quickly into as few records as possible. When there are very many dimensions, the number of records after aggregation is still large. Because persisting the aggregated data puts pressure on the server, the amount of data written in each persistence operation should also be kept as small as possible. In other words, before all of the data in a time window has been placed in the cache, the data already in the cache needs to be aggregated and persisted first.
The existing real-time data processing method works as follows. Data is taken from the queue one record at a time and placed in a cache to await aggregation and persistence. If the queue is empty, the data in the cache is aggregated and persisted; if the queue is not empty, the data taken from the queue is simply placed in the cache. When data is taken from the queue faster than it enters the queue, the prior art may aggregate and persist after nearly every record taken, so persistence becomes extremely frequent. When data is taken from the queue more slowly than it enters the queue, data may pile up in the queue. If the queue can never be drained, the data in the cache is never aggregated and persisted, and the processing results are no longer real-time. If the queue does eventually become empty after a backlog, the aggregated data in the cache is usually large, and storing that much data at once causes a sudden spike in pressure on the server. Neither outcome is desirable. What is wanted is that the data sharing the same dimensions within the same time window aggregates into as few records as possible, that it is persisted as soon as possible, and that each write stores as little data as possible to reduce pressure on the server.
Summary of the Invention
In view of this, one technical problem to be solved by the present invention is to provide a real-time data processing method that, by means of a preset time threshold, persists data at fixed time intervals while processing queued data.
A real-time data processing method comprises: receiving data in real time and placing the data in a queue; reading the data in the queue in a loop and placing the read data in a cache; and judging whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data.
According to one embodiment of the present invention, further, judging whether the time spent reading data from the queue exceeds the preset time threshold and, if so, aggregating the data in the cache and persisting the aggregated data comprises: recording the time at which the loop that reads data from the queue starts as a first time T1; each time a record is taken from the queue and placed in the cache, judging whether the difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, exiting the loop that reads data from the queue, aggregating the data in the cache, and persisting the aggregated data; and if the time threshold is not exceeded, continuing the loop to read the next record from the queue.
According to one embodiment of the present invention, further, the method also comprises: setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum permitted interval T3 between persistence operations, wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, setting the time threshold according to the minimum update period T2 of the aggregation result and the minimum permitted persistence interval T3 comprises: setting the time threshold to a first time threshold and to a second time threshold in turn, and performing persistence processing of the data with each, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; obtaining a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and judging whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy, a data persistence result comprising the degree of aggregation and the number of rows stored; if so, assigning the value of the second time threshold to the first time threshold, setting the second time threshold to half of the new first time threshold, performing persistence processing of the data and examining the data persistence results, and repeating these steps until the first data persistence result and the second data persistence result no longer both satisfy the preset persistence detection policy, at which point the loop is exited and the time threshold is determined to be the first time threshold at that moment; if not, determining that the time threshold is T2.
According to one embodiment of the present invention, further, the data in the cache is aggregated by dimension: data with identical dimensions is aggregated into a single record, and the aggregated data is stored in a database or a file.
Another technical problem to be solved by the present invention is to provide a real-time data processing system that, by means of a preset time threshold, persists data at fixed time intervals while processing queued data.
A real-time data processing system comprises: a data receiving unit, configured to receive data in real time and place the data in a queue; a data caching unit, configured to read the data in the queue in a loop and place the read data in a cache; and a persistence unit, configured to judge whether the time spent reading data from the queue exceeds a preset time threshold and, if so, to aggregate the data in the cache and persist the aggregated data.
According to one embodiment of the present invention, further, the data caching unit comprises a data read recording submodule, configured to record the time at which the loop that reads data from the queue starts as a first time T1. The persistence unit comprises a data aggregation submodule, configured to judge, each time the data caching unit takes a record from the queue and places it in the cache, whether the difference between the current time and T1 exceeds the time threshold; if the time threshold is exceeded, the loop that reads data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted; if the time threshold is not exceeded, the data caching unit continues the loop to read the next record from the queue.
According to one embodiment of the present invention, further, the system comprises a time threshold setting unit, configured to set the time threshold according to a set minimum update period T2 of the aggregation result and a minimum permitted interval T3 between persistence operations, wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
According to one embodiment of the present invention, further, the time threshold setting unit comprises: a test threshold setting module, configured to set the time threshold to a first time threshold and to a second time threshold in turn and to perform persistence processing of the data with each, wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold; and a persistence test module, configured to obtain a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and to judge whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy. If so, the test threshold setting module assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold; persistence processing of the data is performed and the data persistence results are examined; and these steps are repeated until the first data persistence result and the second data persistence result no longer both satisfy the preset persistence detection policy, at which point the loop is exited and the time threshold is determined to be the first time threshold at that moment. If not, the time threshold is determined to be T2.
According to one embodiment of the present invention, further, the persistence unit comprises a persistence processing submodule, configured to aggregate the data in the cache by dimension, aggregating data with identical dimensions into a single record, and to store the aggregated data in a database or a file.
With a preset time threshold, the real-time data processing method and system of the present invention exit the queue-processing loop at fixed time intervals while processing queued data. This ensures that persistence occurs at regular intervals and that data with identical dimensions is aggregated into as few records as possible. It balances the competing demands of persistence pressure, persistence period, and aggregation effect, and avoids the adverse consequences that arise when the data volume is too large or when the enqueue and dequeue rates differ.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of one embodiment of the real-time data processing method according to the present invention;
Fig. 2 is a schematic diagram of one embodiment of the real-time data processing system according to the present invention.
Detailed Description
The present invention is described more fully below with reference to the accompanying drawings, which illustrate exemplary embodiments of the invention. The technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with those drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of one embodiment of the real-time data processing method according to the present invention. As shown in Fig. 1:
Step 101: receive data in real time and place the data in a queue.
Step 102: read the data in the queue in a loop and place the read data in a cache.
Step 103: judge whether the time spent reading data from the queue exceeds a preset time threshold.
Step 104: if so, aggregate the data in the cache and persist the aggregated data.
By presetting a time threshold, the real-time data processing method of the present invention exits the queue-processing loop at fixed time intervals while processing queued data. The data cached during each fixed interval is therefore aggregated and persisted once, and when aggregation and storage are performed it no longer matters whether the queue has been fully processed. As long as a reasonable time threshold is set, persistence occurs at regular intervals and data with identical dimensions is still aggregated into as few records as possible.
In one embodiment, a single thread or multiple threads may be started to carry out data persistence. To ensure that data is processed in the order in which it was received, the received data is first placed in a queue. The time at which the single thread or the multiple threads begin looping over the data in the queue is recorded as a first time T1; the data in the queue is then read and placed in the cache.
Each time a record is taken from the queue and placed in the cache, it is judged whether the difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the loop that is currently reading data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the loop continues and the next record is read from the queue.
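The timed read loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `fill_cache`, the injectable `clock` parameter, and the use of a `queue.Queue` timeout so that an empty queue cannot stall the loop past the threshold are illustrative choices that the text does not specify.

```python
import queue
import time


def fill_cache(q, cache, time_threshold, clock=time.monotonic):
    """Move records from q into cache until the time threshold is exceeded.

    Hypothetical helper: the name, the signature and the timeout handling
    are assumptions made for illustration.
    """
    t1 = clock()  # first time T1: the moment the read loop starts
    elapsed = 0.0
    while elapsed <= time_threshold:
        try:
            # Assumption: wait at most the remaining time for a record, so
            # an empty queue cannot hold the loop open past the threshold.
            record = q.get(timeout=time_threshold - elapsed)
        except queue.Empty:
            break
        cache.append(record)
        # After each record is cached, compare the current time with T1.
        elapsed = clock() - t1
    return cache
```

The caller would then aggregate and persist the returned cache and start the next cycle with a fresh T1, whether or not the queue has been emptied.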
In one embodiment, the method includes setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum permitted interval T3 between persistence operations. Given that the minimum update period required of the aggregation result is T2 (that is, the data must be updated at least once every T2) and that the minimum permitted interval between persistence operations is T3, the period of aggregation and persistence must be less than or equal to T2 and greater than or equal to T3.
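The constraint T3 ≤ threshold ≤ T2 can be expressed as a small validation step. The sketch below assumes all three values are in the same time unit; the function name `check_time_threshold` is hypothetical and does not come from the source.

```python
def check_time_threshold(threshold: float, t2: float, t3: float) -> float:
    """Return threshold if T3 <= threshold <= T2, otherwise raise ValueError.

    Hypothetical helper; t2 is the minimum update period of the aggregation
    result and t3 is the minimum permitted interval between persistence runs.
    """
    if t3 > t2:
        # No value can satisfy both bounds at once.
        raise ValueError("T3 must not exceed T2")
    if not (t3 <= threshold <= t2):
        raise ValueError("time threshold must lie between T3 and T2")
    return threshold
```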
In one embodiment, a separate process or task may be started to determine automatically the time threshold that should be set. Once the process or task has started, the time threshold is set to a first time threshold and to a second time threshold in turn, and persistence processing of the data is performed with each. The value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The data persistence results for the first time threshold and for the second time threshold are obtained, and it is judged whether the two data persistence results both satisfy a preset persistence detection policy. A data persistence result comprises the degree of aggregation and the number of rows stored.
If so, the value of the second time threshold is assigned to the first time threshold, and the second time threshold is set to half of the new first time threshold. Persistence processing of the data is performed and the data persistence results are examined. These steps are repeated until the two data persistence results no longer both satisfy the preset persistence detection policy, at which point the loop is exited and the time threshold is determined to be the first time threshold at that moment. If not, the time threshold is determined to be T2.
In one embodiment, two trials are run with the threshold set to T2 and to half of T2, and the final data persistence results of the two cases are examined. The number of rows stored and the degree of aggregation (data with identical dimensions being aggregated into one record) are compared between the two. If persistence pressure is especially high, what matters more is whether the number of persisted rows falls by a multiple when the threshold is half of T2. If the aggregation effect matters more, it is necessary to analyze whether the degree of aggregation is still acceptable when the threshold is half of T2.
The larger the threshold, the better the aggregation and the greater the storage pressure of each write; the smaller the threshold, the poorer the aggregation effect and the smaller the storage pressure. If both the aggregation effect and the number of persisted rows differ little between a threshold of T2 and a threshold of half of T2, the threshold is set to one quarter of T2 and the result is compared with that for half of T2, and so on. The principle for determining the time threshold, that is, the persistence detection policy, may be: reduce the pressure that a single persistence operation puts on the server as far as possible while keeping the aggregation effect good; and, because the processing is real-time, results should of course be produced as quickly as possible.
The program or process for setting the time threshold is started and its parameters are set, including T1, T2, and the decision policy for persistence results. With the testing method above, the desired threshold can be found automatically after a large number of tests.
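The halving procedure for choosing the threshold can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: `PersistResult`, `run_trial` and `acceptable` are hypothetical names, the persistence detection policy is passed in as a callable because the text leaves its exact form open, and stopping once the halved candidate would fall below T3 is an added assumption based on the stated bound that the threshold must be at least T3.

```python
from typing import Callable, NamedTuple


class PersistResult(NamedTuple):
    """Hypothetical record of one persistence trial."""
    aggregation_degree: float  # how well same-dimension data was merged
    rows_stored: int           # number of rows written


def choose_time_threshold(
    t2: float,
    t3: float,
    run_trial: Callable[[float], PersistResult],
    acceptable: Callable[[PersistResult], bool],
) -> float:
    """Halve the candidate threshold while both trial results pass the policy."""
    first = t2           # first time threshold starts at T2
    second = first / 2   # second time threshold is half of the first
    while second >= t3:  # assumption: never test a threshold below T3
        result_first = run_trial(first)
        result_second = run_trial(second)
        if not (acceptable(result_first) and acceptable(result_second)):
            break        # the two results no longer both satisfy the policy
        first, second = second, second / 2
    # If the very first comparison fails, this is still T2.
    return first
```

For example, with a trial function whose aggregation degree falls as the threshold shrinks and a policy that requires a degree of at least 0.5, starting from T2 = 8 the search accepts the pair (8, 4), rejects the pair (4, 2), and settles on 4.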
In one embodiment, the data in the cache is aggregated by dimension: data with identical dimensions is aggregated into a single record, and the aggregated data is stored in a database or a file. Multiple dimensions can be set according to the specific application scenario to form a dimension system that can access and filter facts, including a complete dimension system coding, keywords, and the expression of relationships. For example, a time dimension includes levels such as year, quarter, month, and day, and a region dimension includes levels such as country, province, and city. The data in the cache is aggregated by dimension and the aggregated data is persisted; the storage itself can use existing techniques.
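Aggregating cached records by dimension can be illustrated with a short sketch. The record layout (a tuple of dimension values paired with a numeric measure) and the choice of summation as the aggregation function are assumptions made for illustration; the text states only that records with identical dimensions are merged into one.

```python
from collections import defaultdict


def aggregate_by_dimension(records):
    """Merge records that share the same dimension tuple into one row.

    Hypothetical helper: each record is (dimensions, value), where
    dimensions is a hashable tuple such as (month, city). Summing the
    values is an assumed aggregation function.
    """
    totals = defaultdict(int)
    for dimensions, value in records:
        totals[dimensions] += value
    return dict(totals)


cached = [
    (("2014-11", "Beijing"), 1),
    (("2014-11", "Beijing"), 2),
    (("2014-11", "Shanghai"), 5),
]
rows = aggregate_by_dimension(cached)  # three cached records become two rows
```

The resulting rows are what would be written to the database or file in a single persistence operation.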
The real-time data processing method of the present invention ensures that the data cached during each fixed time interval is aggregated and persisted once, and when aggregation and storage are performed it no longer matters whether the queue has been fully processed. A reasonable time threshold for data persistence can be set through automatic testing, so that persistence occurs at regular intervals and data with identical dimensions is still aggregated into as few records as possible.
As shown in Fig. 2, the present invention provides a real-time data processing system comprising a data receiving unit 21, a data caching unit 22, and a persistence unit 23. The data receiving unit 21 receives data in real time and places the data in a queue. The data caching unit 22 reads the data in the queue in a loop and places the read data in a cache. The persistence unit 23 judges whether the time spent reading data from the queue exceeds a preset time threshold and, if so, aggregates the data in the cache and persists the aggregated data.
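One way to picture the division of work among units 21, 22 and 23 is the sketch below. The class names, the method names, the list used as a stand-in for the database or file, the summation used for aggregation, and the short sleep while the queue is empty are all assumptions made for illustration; the source describes the units only at the functional level.

```python
import queue
import time


class DataReceivingUnit:
    """Unit 21: receives records and places them in the queue."""

    def __init__(self, q):
        self.q = q

    def receive(self, record):
        self.q.put(record)


class DataCachingUnit:
    """Unit 22: moves records from the queue into the cache."""

    def __init__(self, q):
        self.q = q
        self.cache = []

    def read_one(self):
        try:
            self.cache.append(self.q.get_nowait())
            return True
        except queue.Empty:
            return False


class PersistenceUnit:
    """Unit 23: checks the time threshold, then aggregates and persists."""

    def __init__(self, time_threshold, store, clock=time.monotonic):
        self.time_threshold = time_threshold
        self.store = store  # hypothetical sink standing in for a database or file
        self.clock = clock

    def exceeded(self, t1):
        return self.clock() - t1 > self.time_threshold

    def aggregate_and_persist(self, cache):
        totals = {}
        for dimensions, value in cache:
            # Summation is an assumed aggregation function.
            totals[dimensions] = totals.get(dimensions, 0) + value
        self.store.append(totals)
        cache.clear()


def run_cycle(caching, persistence):
    """One read-aggregate-persist cycle starting at a fresh T1."""
    t1 = persistence.clock()
    while True:
        if not caching.read_one():
            time.sleep(0.001)  # assumption: brief back-off while the queue is empty
        if persistence.exceeded(t1):
            break
    persistence.aggregate_and_persist(caching.cache)
```

Calling `run_cycle` repeatedly yields one aggregated write per time threshold, regardless of how many records remain in the queue.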
In one embodiment, the data read recording submodule of the data caching unit 22 records the time at which the loop that reads data from the queue starts as a first time T1. Each time the data caching unit takes a record from the queue and places it in the cache, the data aggregation submodule of the persistence unit 23 judges whether the difference between the current time and T1 exceeds the time threshold. If the time threshold is exceeded, the loop that is currently reading data from the queue is exited, the data in the cache is aggregated, and the aggregated data is persisted. If the time threshold is not exceeded, the data caching unit 22 continues the loop and reads the next record from the queue.
In one embodiment, a time threshold setting unit 24 sets the time threshold according to the set minimum update period T2 of the aggregation result and the minimum permitted interval T3 between persistence operations. The time threshold is less than or equal to T2 and greater than or equal to T3.
In one embodiment, the time threshold setting unit 24 comprises a test threshold setting module 241 and a persistence test module 242. The test threshold setting module 241 sets the time threshold to a first time threshold and to a second time threshold in turn and performs persistence processing of the data with each. The value of the first time threshold is T2, and the value of the second time threshold is half of the first time threshold.
The persistence test module 242 obtains the data persistence results for the first time threshold and for the second time threshold, and judges whether the two data persistence results both satisfy a preset persistence detection policy. If so, the test threshold setting module 241 assigns the value of the second time threshold to the first time threshold and sets the second time threshold to half of the new first time threshold. Persistence processing of the data is performed and the data persistence results are examined. These steps are repeated until the two data persistence results no longer both satisfy the preset persistence detection policy, at which point the loop is exited and the time threshold is determined to be the first time threshold at that moment. If not, the time threshold is determined to be T2.
In one embodiment, the persistence processing submodule of the persistence unit 23 aggregates the data in the cache by dimension, aggregating data with identical dimensions into a single record, and stores the aggregated data in a database or a file.
With a preset time threshold, the real-time data processing method and system of the present invention exit the queue-processing loop at fixed time intervals while processing queued data. This ensures that persistence occurs at regular intervals and that data with identical dimensions is aggregated into as few records as possible. It balances the competing demands of persistence pressure, persistence period, and aggregation effect, and avoids the adverse consequences that arise when the data volume is too large or when the enqueue and dequeue rates differ.
The method and system of the present invention may be implemented in many ways, for example in software, hardware, firmware, or any combination of software, hardware, and firmware. The order of the method steps given above is for illustration only; the steps of the method of the present invention are not limited to the order described above unless specifically stated otherwise. In addition, in some embodiments the present invention may be embodied as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the method according to the present invention. The present invention therefore also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention is given for the sake of example and explanation, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to a person of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical application of the present invention, and to enable a person of ordinary skill in the art to understand the invention and so design various embodiments, with various modifications, suited to particular uses.

Claims (8)

  1. A kind of 1. real-time data processing method, it is characterised in that including:
    The data are simultaneously put into queue by real-time reception data;
    Circulation reads the data in the queue and the data of reading is put into caching;
    Whether the time for judging to read data from the queue exceedes default time threshold, if it is, by the caching In data polymerize, and by the data persistence after polymerization;
    Wherein, it is described judgement from the queue read data time whether exceed default time threshold, if, then will Data in the caching are polymerize and are included the data persistence after polymerization:
    The current time for starting the cycle over data in the reading queue is arranged to very first time T1 and recorded;
    When taking out a data from the queue every time and being put into the caching, judgement is put into corresponding during the caching work as Whether the time difference between preceding time and T1 exceedes the time threshold, if it exceeds the time threshold, then jump out and read institute The circulation of data in queue is stated, the data in caching are polymerize, and by the data persistence after polymerization;If not less than institute Time threshold is stated, then continues cycling through the operation for carrying out that lower a data is read from the queue.
  2. The method according to claim 1, characterized in that the method further comprises:
    setting the time threshold according to a set minimum update period T2 of the aggregation result and a minimum time interval T3 at which persistence is allowed; wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
  3. The method according to claim 2, characterized in that setting the time threshold according to the set minimum update period T2 of the aggregation result and the minimum time interval T3 at which persistence is allowed comprises:
    setting the time threshold to a first time threshold and to a second time threshold respectively, and performing persistence processing of the data; wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold;
    obtaining a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and judging whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy; the data persistence result includes: the degree of aggregation and the number of stored rows;
    if so, assigning the value of the second time threshold to the first time threshold, setting the value of the second time threshold to half of the new first time threshold, performing persistence processing of the data and checking the data persistence results, and repeating in this way the setting of the first time threshold and the second time threshold and the persistence processing of the data until it is judged that the first data persistence result and the second data persistence result do not both satisfy the preset persistence detection policy, then jumping out of this loop and determining the time threshold to be the first time threshold at that moment;
    if not, determining the time threshold to be T2.
  4. The method according to claim 1, characterized in that aggregating the data in the cache and persisting the aggregated data comprises:
    aggregating the data in the cache according to different dimensions, so that data having the same dimensions are aggregated into one piece of data, and storing the aggregated data in a database or a file.
  5. A real-time data processing system, characterized by comprising:
    a data receiving unit, configured to receive data in real time and put the data into a queue;
    a data caching unit, configured to cyclically read the data in the queue and put the read data into a cache;
    a persistence unit, configured to judge whether the time spent reading data from the queue exceeds a preset time threshold and, if so, to aggregate the data in the cache and persist the aggregated data;
    wherein the data caching unit includes a data reading and recording submodule, configured to set the current time at which the loop of reading data from the queue starts as a first time T1 and record it;
    the persistence unit includes a data aggregation submodule, configured to judge, each time the data caching unit takes one piece of data out of the queue and puts it into the cache, whether the difference between the current time at which the data is put into the cache and T1 exceeds the time threshold; if the time threshold is exceeded, to jump out of the loop of reading data from the queue, aggregate the data in the cache, and persist the aggregated data; if the time threshold is not exceeded, the data caching unit continues the loop to read the next piece of data from the queue.
  6. The system according to claim 5, characterized by further comprising:
    a time threshold setting unit, configured to set the time threshold according to a set minimum update period T2 of the aggregation result and a minimum time interval T3 at which persistence is allowed; wherein the time threshold is less than or equal to T2 and greater than or equal to T3.
  7. The system according to claim 6, characterized in that:
    the time threshold setting unit includes:
    a test threshold setting module, configured to set the time threshold to a first time threshold and to a second time threshold respectively, and to perform persistence processing of the data; wherein the value of the first time threshold is T2 and the value of the second time threshold is half of the first time threshold;
    a persistence test module, configured to obtain a first data persistence result for the case where the time threshold is the first time threshold and a second data persistence result for the case where the time threshold is the second time threshold, and to judge whether the first data persistence result and the second data persistence result both satisfy a preset persistence detection policy; if so, the test threshold setting module assigns the value of the second time threshold to the first time threshold, sets the value of the second time threshold to half of the new first time threshold, performs persistence processing of the data and checks the data persistence results, and repeats in this way the setting of the first time threshold and the second time threshold and the persistence processing of the data until it is judged that the first data persistence result and the second data persistence result do not both satisfy the preset persistence detection policy, then jumps out of this loop and determines the time threshold to be the first time threshold at that moment; if not, the time threshold is determined to be T2.
  8. The system according to any one of claims 5 to 7, characterized in that:
    the persistence unit includes a persistence processing submodule, configured to aggregate the data in the cache according to different dimensions, so that data having the same dimensions are aggregated into one piece of data, and to store the aggregated data in a database or a file.
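
The steps recited in claims 1, 3 and 4 can be illustrated with a minimal Python sketch. It is not the patentee's implementation. The function names, the `(dimensions, value)` record shape, summation as the aggregation operation, the injectable `clock`, and the `meets_policy` callback are all assumptions made for illustration. The claims also do not say what happens when the queue runs empty; the sketch assumes an idle queue ends the batch.

```python
import queue
import time
from collections import defaultdict


def read_batch(q, time_threshold, clock=time.monotonic):
    """Read from the queue into a cache until the time threshold is exceeded."""
    t1 = clock()  # T1: time at which the read loop starts
    cache = []
    while True:
        try:
            record = q.get(timeout=time_threshold)
        except queue.Empty:
            break  # assumption: an idle queue also ends the batch
        cache.append(record)
        # Check after each record is cached, as claim 1 describes.
        if clock() - t1 > time_threshold:
            break
    return cache


def aggregate(records):
    """Merge records that share the same dimensions into one record.

    Assumes each record is a (dimensions, value) pair and that
    aggregation means summing the values.
    """
    totals = defaultdict(int)
    for dims, value in records:
        totals[dims] += value
    return dict(totals)


def process_once(q, time_threshold, persist, clock=time.monotonic):
    """One cycle: read a batch, aggregate it, hand the result to persist()."""
    persist(aggregate(read_batch(q, time_threshold, clock)))


def choose_time_threshold(t2, meets_policy):
    """Halve the threshold, starting at T2, while both candidates pass.

    meets_policy(threshold) is a hypothetical callback standing in for
    "run persistence with this threshold and check the result against
    the preset detection policy". It must eventually return False
    (for example, below T3), otherwise the loop never ends.
    """
    first = t2
    second = first / 2
    while meets_policy(first) and meets_policy(second):
        first = second
        second = first / 2
    return first
```

In use, `persist` would write to a database or a file; here any callable that accepts the aggregated dictionary will do. If the very first pair of candidates fails the policy, `choose_time_threshold` returns T2, matching the "if not" branch of claim 3.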
CN201410645385.7A 2014-11-12 2014-11-12 A kind of real-time data processing method and system Active CN104317958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410645385.7A CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410645385.7A CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Publications (2)

Publication Number Publication Date
CN104317958A CN104317958A (en) 2015-01-28
CN104317958B true CN104317958B (en) 2018-01-16

Family

ID=52373190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410645385.7A Active CN104317958B (en) 2014-11-12 2014-11-12 A kind of real-time data processing method and system

Country Status (1)

Country Link
CN (1) CN104317958B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294444B (en) * 2015-05-27 2020-02-18 阿里巴巴集团控股有限公司 Data processing method and equipment
CN106815274B (en) * 2015-12-02 2022-02-18 中兴通讯股份有限公司 Hadoop-based log data mining method and system
CN106911589B (en) 2015-12-22 2020-04-24 阿里巴巴集团控股有限公司 Data processing method and equipment
CN108063746B (en) * 2016-11-08 2020-05-15 北京国双科技有限公司 Data processing method, client, server and system
CN108268523B (en) * 2016-12-30 2021-06-22 北京国双科技有限公司 Database aggregation processing method and device
CN108664322A (en) * 2017-03-29 2018-10-16 广东神马搜索科技有限公司 Data processing method and system
CN109117432A (en) * 2017-06-22 2019-01-01 北京京东尚科信息技术有限公司 A kind of method and device obtaining data
CN107589907B (en) * 2017-08-10 2019-12-13 深圳壹账通智能科技有限公司 Data processing method, electronic device and computer readable storage medium
CN110955654B (en) * 2018-09-26 2023-10-31 北京国双科技有限公司 Multi-dimensional index calculation method and device
CN109508244B (en) * 2018-10-18 2021-03-12 北京新唐思创教育科技有限公司 Data processing method and computer readable medium
CN110297602B (en) * 2019-06-14 2023-03-07 北京奇艺世纪科技有限公司 Real-time data processing method and device
CN111026746B (en) * 2019-10-16 2023-07-07 中国平安财产保险股份有限公司 Method, device, computer equipment and storage medium for calling multi-channel data
CN110837511B (en) * 2019-11-15 2022-08-23 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN113312434A (en) * 2021-07-29 2021-08-27 北京快立方科技有限公司 Pre-polymerization treatment method for massive structured data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894119A (en) * 2008-10-20 2010-11-24 亚马逊技术股份有限公司 Mass data storage system for monitoring
CN102291243A (en) * 2011-09-09 2011-12-21 中兴通讯股份有限公司 Service processing server, system and method
CN102760101A (en) * 2012-05-22 2012-10-31 中国科学院计算技术研究所 SSD-based (Solid State Disk) cache management method and system
CN103020175A (en) * 2012-11-28 2013-04-03 深圳市华为技术软件有限公司 Method and device for acquiring aggregated data

Also Published As

Publication number Publication date
CN104317958A (en) 2015-01-28

Similar Documents

Publication Publication Date Title
CN104317958B (en) A kind of real-time data processing method and system
CN107864071B (en) Active safety-oriented dynamic data acquisition method, device and system
US7970755B2 (en) Test execution of user SQL in database server code
CN108089814B (en) Data storage method and device
CN106528418B (en) A kind of test method and device
CN105610654B (en) Server, and method and system for policy online testing
JP6857598B2 (en) Coverage test support device and coverage test support method
CN107122126B (en) Data migration method, device and system
JP7335430B2 (en) AUTOMATIC MODELING METHOD AND DEVICE FOR TARGET DETECTION MODEL
CN110601900A (en) Network fault early warning method and device
CN106155646B (en) Method and device for limiting external application program to call service
CN106131641A (en) A kind of barrage control method, system and Android intelligent television
CN109882834A (en) The operation data monitoring method and device of boiler plant
CN104063307A (en) Software testing method and system
CN107241650A (en) A kind of method of quick positioning playing Caton phenomenon reason
CN107526551A (en) A kind of I/O request processing method, device and the equipment of CPU multinuclears
CN105094742B (en) A kind of method and apparatus for writing data
CN109189673B (en) Software test scheme, and method and device for determining test cases
CN102546235A (en) Performance diagnosis method and system of web-oriented application under cloud computing environment
CN106681837A (en) Data sheet based data eliminating method and device
CN116028930B (en) Defense detection method and system for energy data in Internet of things
CN106484539A (en) A kind of determination method of processor cache characteristic
CN106933750A (en) For data in multi-level buffer and the verification method and device of state
CN113642667B (en) Picture enhancement strategy determination method and device, electronic equipment and storage medium
CN109117091A (en) A kind of SSD equipment mount point acquisition methods and relevant apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Patentee before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and system for processing data in real time

Effective date of registration: 20190531

Granted publication date: 20180116

Pledgee: Shenzhen Black Horse World Investment Consulting Co.,Ltd.

Pledgor: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Registration number: 2019990000503

PP01 Preservation of patent right

Effective date of registration: 20240604

Granted publication date: 20180116