CN104424186B - Method and device for implementing persistence in a stream computing application
- Publication number: CN104424186B
- Application number: CN201310362269.XA
- Authority: CN (China)
- Prior art keywords: persistence, message, starting, offset, batch
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Abstract
This application discloses a method and device for implementing persistence in a stream computing application. After the current batch of messages has been consumed successfully, whether a persistence operation is needed is determined from the first start offset and a preset persistence interval. When a persistence operation is needed, persistence is performed from the message position indicated by the second start offset, and after persistence succeeds, both the first start offset and the second start offset are updated to the start offset of the next batch of messages. Because the persistence operation is performed only once per persistence interval, the time between disk writes increases, which greatly improves real-time computing efficiency. During fault recovery, at most the batches within one persistence interval need to be consumed again. This avoids the performance bottleneck caused by frequent disk writes in existing synchronous persistence and raises the message throughput of real-time computing by an order of magnitude, while the delay introduced by fault recovery is reduced to the order of seconds and does not affect real-time behavior.
Description
Technical field
This application relates to stream computing technology, and in particular to a method and device for implementing persistence in a stream computing application.
Background art
In stream computing, a data stream is generally referred to as messages, and the series of computation and processing steps applied to the data stream is referred to as consumption.
Stream computing products are mainly used for real-time computing. Real-time computation is usually carried out in memory, and the results must be saved and presented in some way. At present, results are saved mainly in one of two ways: in a cache, or persisted to disk, for example in a database (not an in-memory database). Because the cache approach involves no physical disk input/output (I/O), it offers unmatched message throughput. However, because the results are not persisted, the cache approach has almost no fault tolerance: once the application is interrupted, the server goes down, or the cache is cleared, the results held in the cache cannot be recovered. Persisting to disk achieves the highest level of fault tolerance, but it involves a large number of disk writes, which slows down stream computation; execution efficiency is about an order of magnitude lower than with the cache approach.
Fig. 1 is a schematic diagram of data flow in an existing basic fault-tolerant stream computing application. As shown in Fig. 1, the message middleware cluster sends the message stream one message at a time. To make fault tolerance easier, the stream computing cluster of a stream computing product, as in Fig. 1, usually consumes the message stream in units of batches: a number of messages are bundled into one batch, and each batch has a unique identifier (ID). A batch is marked as successfully consumed only after every message in it has been successfully consumed. If even one message in a batch fails to be consumed, the message middleware resends the whole batch and the stream computing cluster consumes it again.
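The all-or-nothing batch rule described above can be illustrated with a minimal Python sketch. The names `Batch`, `consume_batch`, and `consume_with_resend` are hypothetical, and the `max_attempts` limit is an assumption added only so that the example terminates; the source does not specify a retry limit.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Batch:
    batch_id: int        # unique ID of the batch
    messages: List[str]  # messages bundled into the batch


def consume_batch(batch: Batch, consume: Callable[[str], bool]) -> bool:
    """A batch counts as consumed only if every message in it is consumed."""
    # all() stops at the first failing message, so the rest of the batch is skipped.
    return all(consume(message) for message in batch.messages)


def consume_with_resend(batch: Batch, consume: Callable[[str], bool],
                        max_attempts: int = 3) -> int:
    """Resend the whole batch until it succeeds; return the attempts used.

    max_attempts is an assumption of this sketch, not part of the source.
    """
    for attempt in range(1, max_attempts + 1):
        if consume_batch(batch, consume):
            return attempt
    raise RuntimeError(f"batch {batch.batch_id} failed {max_attempts} times")
```

Note that a single failing message causes the whole batch, including messages that had already succeeded, to be consumed again.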
Storing the fully processed message stream to disk is called persistence, and this step is essential for fault recovery. If the application is interrupted or the server goes down, only persistence guarantees that the results of real-time computation are not lost. During fault recovery, when the real-time computing application restarts, it must reload the intermediate data and result data of the real-time computation from disk and restore all state to a correct point in time before the failure. The ZooKeeper cluster shown in Fig. 1 stores the offset, in the message queue, of the messages sent by the message middleware cluster. When the message middleware cluster sends a batch of messages to the stream computing cluster, the start offset of that batch in the message queue is recorded in ZooKeeper. If the stream computing cluster consumes the batch successfully, the message middleware cluster sends the next batch, and the offset recorded in ZooKeeper is updated to the start offset of the next batch in the message queue. If consumption of the batch fails, the stream computing cluster re-reads the offset of the batch from the ZooKeeper cluster and requests the batch again from the message middleware cluster, which implements retransmission of messages on failure.
Summary of the invention
To solve the above technical problem, this application provides a method and device for implementing persistence in a stream computing application, which ensure that data can be safely recovered after a failure and improve real-time computing efficiency.
To achieve the purpose of this application, the application provides a method for implementing persistence in a stream computing application, including:
After the current batch of messages has been consumed successfully, determining whether a persistence operation is needed according to a preset persistence interval and a first start offset, which stores the start position in the message queue of the batch currently being consumed;
When a persistence operation is needed, performing persistence from the message position indicated by a second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation;
After the persistence operation succeeds, updating both the first start offset and the second start offset to the start offset of the next batch of messages.
When the stream computing application starts normally, or starts after fault recovery, the method further includes: requesting messages according to the second start offset, and at the same time changing the value of the first start offset to the value of the second start offset.
When the value of the second start offset is empty, or no second start offset has been saved, the current batch of messages is located at the start position of the message queue of the message middleware; in this case the method also includes setting the value of the first start offset to empty.
When the persistence operation fails, the method further includes: consuming the messages of the current batch again, as indicated by the first start offset.
Determining whether a persistence operation is needed includes: dividing the ID of the current batch by the persistence interval and, when the remainder is zero, determining that a persistence operation is needed. Batch IDs are integers that start at 1 and increase in steps of 1.
When the stream computing application starts normally, or starts after fault recovery, batch IDs continue to increase in steps of 1 from the ID of the last batch that was successfully persisted before the application stopped.
The application also discloses a device for implementing persistence in a stream computing application, which includes at least a storage module, a judgment module, and a processing module, where:
The storage module stores the persistence interval, the first start offset, which stores the start position in the message queue of the batch currently being consumed, and the second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation;
The judgment module, after the current batch of messages has been consumed successfully, determines from the first start offset stored in the storage module and the preset persistence interval whether a persistence operation is needed, and if so sends a persistence notification to the processing module;
The processing module receives the persistence notification from the judgment module and performs the persistence operation from the message position indicated by the second start offset stored in the storage module; after the persistence operation succeeds, it updates the first start offset and the second start offset to the start offset of the next batch of messages.
The processing module is further configured to: when the stream computing application starts normally, or starts after fault recovery, request messages from the message middleware according to the second start offset stored in the storage module, and at the same time change the value of the first start offset stored in the storage module to the value of the second start offset.
The processing module is further configured to: when the persistence operation fails, consume the messages of the current batch from the message middleware again, as indicated by the first start offset stored in the storage module.
The judgment module is specifically configured to: divide the ID of the current batch indicated by the first start offset by the persistence interval and, when the remainder is zero, determine that a persistence operation is needed and send a persistence notification to the processing module. Batch IDs are integers that start at 1 and increase in steps of 1.
In the scheme provided by this application, after the current batch of messages has been consumed successfully, whether a persistence operation is needed is determined according to the preset persistence interval and the first start offset, which stores the start position in the message queue of the batch currently being consumed. When a persistence operation is needed, persistence is performed from the message position indicated by the second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation, and after persistence succeeds, both offsets are updated to the start offset of the next batch of messages. The persistence operation in this application is not performed after every successfully consumed batch, but once per preset interval, namely the persistence interval. This increases the time between disk writes and greatly improves real-time computing efficiency. As a result, during fault recovery at most the batches within one persistence interval need to be consumed again. Compared with existing synchronous persistence, this avoids the performance bottleneck caused by frequent disk writes and raises the message throughput of real-time computing by an order of magnitude, to the same order of magnitude as the cache approach. Meanwhile, the delay introduced by fault recovery is reduced to the order of seconds and does not affect real-time behavior.
Other features and advantages of the application will be set out in the description that follows, and will in part become apparent from the description or be understood by implementing the application. The objectives and other advantages of the application can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the technical solution of the application and form part of the description. Together with the embodiments of the application, they serve to explain the technical solution and do not limit it.
Fig. 1 is a schematic diagram of data flow in an existing basic fault-tolerant stream computing application;
Fig. 2 is a flow chart of the method for implementing persistence in a stream computing application according to this application;
Fig. 3 is a schematic flow chart of an embodiment of implementing persistence in a stream computing application according to this application;
Fig. 4 is a schematic structural diagram of the device for implementing persistence in a stream computing application according to this application.
Detailed description
To make the purpose, technical solution, and advantages of the application clearer, embodiments of the application are described in detail below with reference to the drawings. It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
In a typical configuration of this application, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The steps illustrated in the flow charts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
At present, there are roughly the following fault-tolerant stream computing schemes:
The first writes the real-time results to a cache and achieves disaster tolerance by deploying two identical computing clusters. Its advantage is that, with no disk I/O, execution is fast and concurrent access performance is high. Its drawbacks are equally obvious: the deployment cost doubles, and if both computing clusters fail at the same time, the results are still lost.
The second starts one real-time computing application to perform the normal business computation and write the real-time results to a cache, and at the same time starts another, independent real-time computing application to persist the received original messages to disk. Because the application that persists to disk is independent of the application that performs the business computation, high execution efficiency can be achieved. When a failure such as a cluster outage occurs, the messages backed up on disk must be consumed again during fault recovery. No data is lost, but when the message volume is large, the recovery process significantly delays the real-time computing application, which then loses its real-time value.
The third improves on the second scheme by using synchronous persistence: each time the real-time computing application consumes a batch of messages, it persists the real-time results (not the original messages, but the messages after consumption and processing) to disk, and the message middleware cluster sends the next batch only after both the consumption and the persistence of the current batch have succeeded. This is currently the common stream computing scheme for fault recovery. Specifically:
Synchronous persistence means that a disk write is performed for every batch. As shown in Fig. 1, the offset stored in the ZooKeeper cluster is updated to the start position in the message queue of the next batch each time a batch is successfully consumed. Without synchronous persistence, if the application is interrupted or the cluster goes down, some data will be lost after fault recovery and application restart. For example, suppose that after successfully consuming the messages of batch T, the real-time application does not persist, that is, does not save the results of this batch to disk, and goes on to consume the messages of batch (T+1). The offset stored in the ZooKeeper cluster has by then been updated to the start position in the message queue of batch (T+1). If the server goes down at this point, then after fault recovery and restart the real-time application obtains the offset from the ZooKeeper cluster and requests messages from the message middleware cluster, and what it requests is clearly batch (T+1). The results of batch T were never saved to disk (synchronous persistence was not performed), so batch T is inevitably lost.
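The batch T example can be reproduced with a small Python simulation. This is only a sketch under simplifying assumptions: messages are identified by their queue positions, batches have a fixed size, and the function name `crash_without_persistence` is hypothetical.

```python
from typing import List, Tuple


def crash_without_persistence(batch_size: int,
                              batches_consumed: int) -> Tuple[int, List[int]]:
    """Return (offset used after restart, queue positions whose results are lost).

    Models the case described above: the stored offset advances after every
    consumed batch, but results are never written to disk.
    """
    stored_offset = 0                 # offset kept in the coordination store
    results_in_memory: List[int] = []
    for _ in range(batches_consumed):
        batch = list(range(stored_offset, stored_offset + batch_size))
        results_in_memory.extend(batch)   # results stay in memory only
        stored_offset += batch_size       # offset now points at the next batch
    lost = list(results_in_memory)        # the crash wipes memory; nothing reached disk
    return stored_offset, lost
```

After a restart, the application resumes from the returned offset, so none of the positions in the lost list are requested again.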
In the synchronous persistence scheme, every batch requires a disk write. The interval between batches is about 400 milliseconds to 2 seconds, and the real-time application spends less than about 1 second computing each batch in memory, so disk writes are very frequent. Extensive experience shows that the time spent writing to disk accounts for 50% or more of the total time spent consuming a batch, which makes disk writes the main bottleneck for real-time computing efficiency.
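To get a rough feel for these proportions, the share of time spent on disk writes can be estimated as follows. The function `disk_write_share` is a hypothetical helper, and the model is deliberately simplified: it assumes that one write after several batches costs the same as one per-batch write, which is optimistic because more results are written at once.

```python
def disk_write_share(compute_s: float, write_s: float, interval: int = 1) -> float:
    """Fraction of total time spent on disk writes with one write per `interval` batches.

    compute_s: in-memory computation time per batch, in seconds.
    write_s:   time of one disk write, in seconds (assumed constant here).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return write_s / (interval * compute_s + write_s)
```

With 0.5 s of computation and 0.5 s of writing per batch, `disk_write_share(0.5, 0.5)` gives the 50% share mentioned above, while `disk_write_share(0.5, 0.5, 50)` gives roughly 2% under the stated assumption.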
Fig. 2 is a flow chart of the method for implementing persistence in a stream computing application according to this application. As shown in Fig. 2, the method includes the following steps:
Step 200: After the current batch of messages has been consumed successfully, determine whether a persistence operation is needed according to the preset persistence interval and the first start offset, which stores the start position in the message queue of the batch currently being consumed.
In this step, the consumption of the current batch of messages is prior art; its implementation is outside the scope of protection of this application and is not described further here.
In this step, the persistence interval N is a preset integer greater than 1, for example 50. In general, N can be set to an integer between 10 and 100.
In this step, determining whether a persistence operation is needed includes: dividing the ID of the current batch by the persistence interval N and, when the remainder is zero, determining that a persistence operation is needed. Batch IDs are integers that start at 1 and increase in steps of 1.
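The remainder test of step 200 is small enough to show directly. The function name `needs_persistence` is hypothetical; the constraints on N and on batch IDs are taken from the text above.

```python
def needs_persistence(batch_id: int, interval: int) -> bool:
    """True when the batch ID is a multiple of the persistence interval N."""
    if interval <= 1:
        raise ValueError("the persistence interval N must be an integer greater than 1")
    if batch_id < 1:
        raise ValueError("batch IDs start at 1")
    return batch_id % interval == 0
```

For N = 5, the batches with IDs 5, 10, 15, and so on trigger a persistence operation, and all other batches are only consumed.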
As this step shows, the persistence operation in this application is not performed after every successfully consumed batch, but once every N batches, where N is the persistence interval. During fault recovery, at most N batches need to be consumed again. Compared with existing synchronous persistence, this avoids the performance bottleneck caused by frequent disk writes and raises the message throughput of real-time computing by an order of magnitude, to the same order of magnitude as schemes that do no persistence, such as the cache approach, while reducing the delay introduced by fault recovery to the order of seconds.
Step 201: When a persistence operation is needed, perform persistence from the message position indicated by the second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation.
Persistence processing means that the stream computing application writes the computation results held in the data buffer to disk, starting from the message position indicated by the second start offset. The specific implementation is conventional for those skilled in the art. The difference is that the results to be persisted here are indicated by the second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation. In other words, persistence starts from the batch after the last successful persistence and covers the results, held in the data buffer, of the N batches of one persistence interval. That is, the persistence operation in this application is not performed after every successfully consumed batch, but once every N batches.
Step 202: After persistence succeeds, update the first start offset and the second start offset to the start offset of the next batch of messages. This update ensures that the next persistence starts from the successfully consumed batches that have not yet been persisted.
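Steps 200 to 202 can be summarized in a short Python sketch. The names `Offsets` and `after_batch_consumed` and the `persist` callback are hypothetical. The handling of batches that need no persistence (only the first start offset advances) and of a failed persistence (both offsets stay unchanged so the current batch is consumed again) follows the behaviour the document describes for those cases.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Offsets:
    first: Optional[int] = None   # start of the batch currently being consumed
    second: Optional[int] = None  # start of the batch after the last successful persistence


def after_batch_consumed(offsets: Offsets, batch_id: int, interval: int,
                         next_batch_offset: int,
                         persist: Callable[[], bool]) -> bool:
    """Run steps 200 to 202 for a batch that was consumed successfully.

    Returns True if the application may move on to the next batch, or False
    if persistence failed and the current batch must be consumed again
    starting from offsets.first.
    """
    if batch_id % interval != 0:
        offsets.first = next_batch_offset   # no persistence due: only the first offset moves
        return True
    if persist():                           # step 201: write buffered results to disk
        offsets.first = next_batch_offset   # step 202: both offsets move together
        offsets.second = next_batch_offset
        return True
    return False                            # persistence failed: offsets stay unchanged
```

Keeping the second start offset fixed between persistence operations is what lets a restart resume from the last point whose results are known to be on disk.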
When the stream computing application starts normally, or starts after fault recovery, the method further includes: requesting messages according to the second start offset, and at the same time changing the value of the first start offset to the value of the second start offset. If the value of the second start offset is empty, or no second start offset has been saved, messages are requested from the start position of the message queue of the message middleware, and the value of the first start offset is set to empty.
In addition, when the stream computing application starts normally, or starts after fault recovery, batch IDs continue to increase from the ID of the last batch successfully persisted before the application stopped, rather than starting again from 1. This ensures that batch IDs remain unique within the same real-time application.
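The start-up rule can be sketched as follows. A plain dictionary stands in for the offset store, the key names are hypothetical, and representing the start of the message queue by offset 0 is an assumption of this sketch.

```python
from typing import Optional, Tuple


def start_up(store: dict, last_persisted_batch_id: int) -> Tuple[int, int]:
    """Return (offset to request messages from, ID of the next batch).

    store is a plain dict standing in for the offset store; the keys
    "first_offset" and "second_offset" are names chosen for this sketch.
    """
    second: Optional[int] = store.get("second_offset")  # None if never saved
    store["first_offset"] = second        # the first offset takes the second's value (or empty)
    request_from = 0 if second is None else second      # 0 = start of the queue (assumption)
    return request_from, last_persisted_batch_id + 1    # batch IDs continue, never restart at 1
```

On a first start with an empty store and no persisted batch, `start_up({}, 0)` requests messages from the start of the queue and assigns batch ID 1.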
If the persistence operation fails, the method further includes: consuming the messages of the current batch again, as indicated by the first start offset.
The method of this application is described in detail below with reference to an embodiment. Fig. 3 is a schematic flow chart of an embodiment of implementing persistence in a stream computing application according to this application. In this embodiment, Storm is used as the stream computing framework, the stream computing application is developed in Java, and the fault-tolerant stream computing application shown in Fig. 1 is taken as the example. As shown in Fig. 3, the flow includes:
Step 300 to step 301: The stream computing application starts, either normally or after fault recovery. It reads the second start offset from ZooKeeper, requests messages from the message middleware at the message position indicated by the second start offset, and at the same time changes the value of the first start offset in ZooKeeper to the value of the second start offset just read.
In this step, if no second start offset is saved in ZooKeeper or its value is empty, messages are requested from the start position of the message queue of the message middleware, and the value of the first start offset in ZooKeeper is set to empty.
Step 302: Request messages from the message middleware.
Step 303: The computing units of the stream computing cluster consume the current batch of messages received. If consumption fails, go to step 308; otherwise, go to step 304.
Step 304: After the current batch has been consumed successfully, check whether the remainder of dividing the current batch ID by the preset persistence interval N equals 0. If it does not, no persistence is needed at this point; go to step 309. Otherwise, go to step 305.
Step 305: When a persistence operation is needed, perform persistence and save the real-time results to disk.
Step 306: Check whether the persistence operation succeeded. If it succeeded, go to step 307; if it failed, go to step 310. Determining whether persistence succeeded is conventional for those skilled in the art: database software usually provides a persistence interface, and success is determined from the return code after the interface is called.
Step 307: The persistence operation succeeded. Obtain the start offset of the next batch from the message middleware, save it as both the first start offset and the second start offset, and return to step 302.
Step 308: If consumption of the current messages fails, re-read the first start offset from ZooKeeper and return to step 302. If the value of the first start offset is empty at this point, return to step 302 and request messages from the start position of the message queue of the message middleware.
Step 309: When no persistence operation is needed, obtain the start offset of the next batch from the message middleware, save it as the first start offset, and return to step 302 to continue consuming messages.
Step 310: If the persistence operation fails, re-read the first start offset from ZooKeeper and return to step 302.
The flow shown in Fig. 3 stops when the stream computing application receives a termination command.
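The following Python simulation sketches the flow of Fig. 3 for the successful path plus a process crash and restart. It is not the Storm and Java implementation of the embodiment. A dictionary stands in for ZooKeeper, a list stands in for the disk, batches have a fixed size, doubling each message is a placeholder for real consumption, and storing the last persisted batch ID in the same dictionary is an assumption. Failed consumption and failed persistence (steps 308 and 310) are not modelled.

```python
from typing import List, Optional


class Crash(Exception):
    """Simulated process failure."""


def run(queue: List[int], batch_size: int, interval: int, store: dict,
        disk: List[int], crash_at_batch: Optional[int] = None) -> None:
    # Steps 300-301: start from the second offset; the first offset takes its value.
    offset = store.get("second") or 0
    store["first"] = store.get("second")
    batch_id = store.get("last_persisted_id", 0) + 1
    buffer: List[int] = []                 # results computed but not yet on disk
    while offset < len(queue):
        batch = queue[offset:offset + batch_size]   # step 302: request a batch
        if batch_id == crash_at_batch:
            raise Crash()                  # the buffer is lost with the process
        buffer.extend(m * 2 for m in batch)         # step 303: placeholder consumption
        next_offset = offset + len(batch)
        if batch_id % interval == 0:       # step 304: remainder test
            disk.extend(buffer)            # step 305: persist buffered results
            buffer.clear()
            store["second"] = next_offset  # step 307: second offset moves on success
            store["last_persisted_id"] = batch_id
        store["first"] = next_offset       # steps 307 and 309: first offset always moves
        offset = next_offset
        batch_id += 1
```

In this sketch, a crash loses only the buffered results of batches consumed since the last persistence. The restart re-requests exactly those batches from the second start offset, so each result reaches the disk once.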
Fig. 4 is a schematic structural diagram of the device for implementing persistence in a stream computing application according to this application. As shown in Fig. 4, the device includes at least a storage module, a judgment module, and a processing module, where:
The storage module stores the persistence interval, the first start offset, which stores the start position in the message queue of the batch currently being consumed, and the second start offset, which stores the start position in the message queue of the batch that follows the most recent persistence operation;
The judgment module, after the current batch of messages has been consumed successfully, determines from the first start offset stored in the storage module and the preset persistence interval whether a persistence operation is needed, and if so sends a persistence notification to the processing module. Specifically, it divides the ID of the current batch indicated by the first start offset by the persistence interval and, when the remainder is zero, determines that a persistence operation is needed. Batch IDs are integers that start at 1 and increase in steps of 1.
The processing module receives the persistence notification from the judgment module and performs persistence from the message position indicated by the second start offset stored in the storage module; after persistence succeeds, it updates the first start offset and the second start offset to the start offset of the next batch of messages.
The processing module is further configured to: when the stream computing application starts normally, or starts after fault recovery, request messages from the message middleware according to the second start offset stored in the storage module, and at the same time change the value of the first start offset stored in the storage module to the value of the second start offset.
The processing module is further configured to: when the persistence operation fails, consume the messages of the current batch from the message middleware again, as indicated by the first start offset stored in the storage module.
Taking the architecture shown in Fig. 1 as an example, the storage module of the device may be placed in ZooKeeper, and the judgment module and the processing module may be placed in the stream computing cluster. In practice, other software such as HBase or MySQL may be used in place of ZooKeeper, or the storage module may be implemented in the message middleware, and so on.
Those skilled in the art should be understood that each part for the device that above-mentioned the embodiment of the present application is provided,
And each step in method, they can be concentrated on single computing device, or are distributed in multiple computing device institutes group
Into network on.Alternatively, they can be realized with the program code that computing device can perform.It is thus possible to they are deposited
Storage performed in the storage device by computing device, either they are fabricated to respectively each integrated circuit modules or by it
In multiple modules or step be fabricated to single integrated circuit module to realize.So, the application is not restricted to any specific
Hardware and software combine.
Although the embodiments disclosed in this application are as described above, the content is only an
embodiment adopted to facilitate understanding of this application and is not intended to limit it. Any person
skilled in the art to which this application belongs may make modifications and changes in the form and
details of implementation without departing from the spirit and scope disclosed herein, but the scope of
patent protection of this application shall still be as defined by the appended claims.
Claims (10)
1. A method for implementing persistence in a stream computing application, characterized by comprising:
after the current batch of messages is consumed successfully, judging whether a persistence operation needs to
be performed, according to a first starting offset, which stores the starting position in the message queue of
the batch of messages currently being consumed, and a preset persistence interval;
when a persistence operation needs to be performed, performing persistence processing at the message position
indicated by a second starting offset, which stores the starting position in the message queue of the batch of
messages following the most recent persistence operation;
after the persistence operation succeeds, updating the first starting offset and the second starting offset,
respectively, to the starting offset of the next batch of messages.
2. The method according to claim 1, characterized in that, when the stream computing application starts
normally, or restarts after fault recovery, the method further comprises:
requesting messages according to the second starting offset, and at the same time changing the value of the
first starting offset to the value of the second starting offset.
3. The method according to claim 2, characterized in that, when the value of the second starting offset is
empty or no second starting offset has been saved, the current batch of messages is located at the starting
position of the message queue of the message middleware;
and the method further comprises: setting the value of the first starting offset to empty.
4. The method according to claim 1, characterized in that, when the persistence operation fails, the method
further comprises: consuming the messages in the current batch again, as indicated by the first starting offset.
5. The method according to claim 2 or 4, characterized in that judging whether a persistence operation needs to
be performed comprises: dividing the ID of the current batch by the persistence interval and, when the
remainder is zero, determining that a persistence operation needs to be performed;
wherein the batch ID is an integer that starts at 1 and increases in steps of 1.
6. The method according to claim 5, characterized in that, when the stream computing application starts
normally, or restarts after fault recovery, the batch ID continues from the ID of the last batch that was
successfully persisted before the stream computing application stopped, increasing in steps of 1.
7. A device for implementing persistence in a stream computing application, characterized by comprising at
least a storage module, a judging module, and a processing module, wherein:
the storage module stores the persistence interval, a first starting offset that stores the starting position
in the message queue of the batch of messages currently being consumed, and a second starting offset that
stores the starting position in the message queue of the batch of messages following the most recent
persistence operation;
the judging module is configured to, after the current batch of messages is consumed successfully, send a
persistence notification to the processing module upon determining, according to the first starting offset
saved in the storage module and the preset persistence interval, that a persistence operation needs to be
performed;
the processing module is configured to receive the persistence notification from the judging module and
perform the persistence operation at the message position indicated by the second starting offset saved in the
storage module; and, after the persistence operation succeeds, to update the first starting offset and the
second starting offset to the starting offset of the next batch of messages.
8. The device according to claim 7, characterized in that the processing module is further configured to:
when the stream computing application starts normally, or restarts after fault recovery, request messages from
the message middleware according to the second starting offset saved in the storage module, and at the same
time change the value of the first starting offset saved in the storage module to the value of the second
starting offset.
9. The device according to claim 7, characterized in that the processing module is further configured to, when
the persistence operation fails, consume the messages of the current batch from the message middleware again,
as indicated by the first starting offset saved in the storage module.
10. The device according to any one of claims 7 to 9, characterized in that the judging module is specifically
configured to: divide the ID of the current batch indicated by the first starting offset by the persistence
interval and, when the remainder is zero, determine that a persistence operation needs to be performed and send
a persistence notification to the processing module; wherein the batch ID is an integer that starts at 1 and
increases in steps of 1.
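For illustration only, the remainder test recited in claims 5 and 10 can be sketched as below. The function names and the replay count are assumptions of this sketch; the count simply spells out how many batches lie after the last batch ID that is a multiple of the interval.

```python
def needs_persistence(batch_id: int, interval: int) -> bool:
    """True when the batch ID is a multiple of the persistence interval."""
    if batch_id < 1 or interval < 1:
        raise ValueError("batch_id and interval must be positive integers")
    return batch_id % interval == 0


def batches_to_replay(failed_batch_id: int, interval: int) -> int:
    """Batches consumed since the last persisted batch, including the failing one."""
    last_persisted = ((failed_batch_id - 1) // interval) * interval
    return failed_batch_id - last_persisted


persisted = [batch_id for batch_id in range(1, 13) if needs_persistence(batch_id, 5)]
print(persisted)                 # [5, 10]
print(batches_to_replay(12, 5))  # 2
print(batches_to_replay(10, 5))  # 5
```

With an interval of 5, a fault during batch 12 leaves batches 11 and 12 to be consumed again, and the replay never exceeds one interval's worth of batches.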
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310362269.XA CN104424186B (en) | 2013-08-19 | 2013-08-19 | The method and device of persistence is realized in a kind of stream calculation application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104424186A CN104424186A (en) | 2015-03-18 |
CN104424186B true CN104424186B (en) | 2018-04-03 |
Family
ID=52973190
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310362269.XA Active CN104424186B (en) | 2013-08-19 | 2013-08-19 | The method and device of persistence is realized in a kind of stream calculation application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104424186B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106598473B (en) * | 2015-10-15 | 2020-09-04 | 南京中兴新软件有限责任公司 | Message persistence method and device |
CN107783728B (en) * | 2016-08-31 | 2021-07-23 | 百度在线网络技术(北京)有限公司 | Data storage method, device and equipment |
CN106789741B (en) * | 2016-12-26 | 2020-02-18 | 北京奇虎科技有限公司 | Consumption method and device of message queue |
CN107273228B (en) * | 2017-07-13 | 2020-09-04 | 焦点科技股份有限公司 | Message transmission method based on star topology architecture |
CN107295106B (en) * | 2017-07-31 | 2020-08-14 | 杭州多麦电子商务股份有限公司 | Message data service cluster |
CN108418879B (en) * | 2018-02-26 | 2021-03-02 | 新疆熙菱信息技术股份有限公司 | High-reliability massive heterogeneous data transmission method and system |
CN108509299B (en) * | 2018-03-29 | 2022-08-12 | 广西电网有限责任公司 | Message processing method, device and computer readable storage medium |
CN108984770A (en) * | 2018-07-23 | 2018-12-11 | 北京百度网讯科技有限公司 | Method and apparatus for handling data |
CN111931025B (en) * | 2020-07-20 | 2023-08-15 | 武汉美和易思数字科技有限公司 | Data continuous grabbing method and system based on Actor model |
CN112000489A (en) * | 2020-07-29 | 2020-11-27 | 新华三大数据技术有限公司 | Kafka data processing method and server |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1975684A (en) * | 2006-12-13 | 2007-06-06 | 天津理工大学 | Distributing real-time data bank fault recovering method capable of supporting serving and recovering simultaneously |
CN101510838A (en) * | 2009-02-26 | 2009-08-19 | 北京北纬点易信息技术有限公司 | Method for implementing perdurable data queue |
US8145859B2 (en) * | 2009-03-02 | 2012-03-27 | Oracle International Corporation | Method and system for spilling from a queue to a persistent store |
Non-Patent Citations (2)
Title |
---|
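Integrated real-time electric energy information acquisition and management analysis system (一体化电能实时信息采集和管理分析系统); Chen Kaiping (陈凯平); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》); 20100515; main text, p. 49 * |
Real-time big data analysis using Storm (使用storm实现实时大数据分析); 真实的归宿; http://blog.csdn.net/hguisu/article/details/8454368; 20121231; pp. 4-10 * |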
Also Published As
Publication number | Publication date |
---|---|
CN104424186A (en) | 2015-03-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104424186B (en) | The method and device of persistence is realized in a kind of stream calculation application | |
CN107544862B (en) | Stored data reconstruction method and device based on erasure codes and storage node | |
CN110493148B (en) | Block processing, block consensus and block synchronization method and device | |
CN103942252B (en) | A kind of method and system for recovering data | |
CN103309767A (en) | Method and device for processing client log | |
US11271748B2 (en) | Consensus methods and systems in consortium blockchain | |
US20230098190A1 (en) | Data processing method, apparatus, device and medium based on distributed storage | |
CN102843396A (en) | Data writing and reading method and device in distributed caching system | |
EP3680787B1 (en) | Method for synchronization between primary database and standby database, database system and device | |
CN109491609B (en) | Cache data processing method, device and equipment and readable storage medium | |
CN111383031A (en) | Intelligent contract execution method and system in block chain and electronic equipment | |
CN106899654A (en) | A kind of sequence value generation method, apparatus and system | |
CN106815094B (en) | Method and equipment for realizing transaction submission in master-slave synchronization mode | |
CN109144787A (en) | A kind of data reconstruction method, device, equipment and readable storage medium storing program for executing | |
CN108293003A (en) | Distribution figure handles the fault-tolerant of network | |
US20130103910A1 (en) | Cache management for increasing performance of high-availability multi-core systems | |
CN108206839A (en) | One kind is based on majority's date storage method, apparatus and system | |
CN108984779A (en) | Distributed file system snapshot rollback metadata processing method, device and equipment | |
CN109189615A (en) | A kind of delay machine treating method and apparatus | |
CN110798366B (en) | Task logic processing method, device and equipment | |
CN113448647B (en) | Resource synchronization method, implementation equipment and electronic equipment | |
CN111541747B (en) | Data check point setting method and device | |
WO2020238653A1 (en) | Encoding method in distributed system environment, decoding method in distributed system environment, and corresponding apparatuses | |
CN112015325B (en) | Method for generating decoding matrix, decoding method and corresponding device | |
CN109344630B (en) | Block generation method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20211105; Address after: Room 233, building 14, No. 788, Guangzhou Avenue South, Haizhu District, Guangzhou City, Guangdong Province; Patentee after: Alibaba South China Technology Co.,Ltd.; Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands; Patentee before: ALIBABA GROUP HOLDING Ltd. |