CN115001998A - Disaster recovery method and device for message service - Google Patents

Publication number
CN115001998A
Authority
CN
China
Prior art keywords
message
main channel
channel
disaster recovery
consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210446627.4A
Other languages
Chinese (zh)
Other versions
CN115001998B (en)
Inventor
闻秋实
任彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shell Time Network Technology Co ltd
Original Assignee
Beijing Shell Time Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shell Time Network Technology Co ltd filed Critical Beijing Shell Time Network Technology Co ltd
Priority to CN202210446627.4A priority Critical patent/CN115001998B/en
Publication of CN115001998A publication Critical patent/CN115001998A/en
Application granted granted Critical
Publication of CN115001998B publication Critical patent/CN115001998B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements

Abstract

The application discloses a disaster recovery method and device for a message service. In the normal state, messages of the main channel are consumed normally, and each successfully consumed message is asynchronously stored in a designated container, while messages of the backup channel undergo empty consumption (they are marked as consumed without being delivered). When the message processing performance of the main channel cannot meet the normal consumption requirement, the system enters the disaster recovery state. In the disaster recovery state, the backup channel backtracks to the point a duration T1 before entry into the disaster recovery state and consumes messages normally from that point on. Before a message is consumed, it is determined whether the message is already stored in the designated container: if not, the message is consumed normally; if so, it is not consumed again and processing moves on to the next message. When monitoring finds that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement, the system returns to the normal state. The method and device ensure that messages are still consumed normally when the message service end has a problem, improving the robustness of the message service system.

Description

Disaster recovery method and device for message service
Technical Field
The present application relates to the field of disaster recovery technologies, and in particular to a disaster recovery method and apparatus for a message service, a computer-readable storage medium, and an electronic device.
Background
Current message services follow a typical producer and consumer model: the producer generates a message and puts it into the message service platform, and the consumer actively fetches or passively receives that message from the message service platform. Once the consumer has obtained the message, by either active fetching or passive receiving, the message is considered consumed.
In the usual implementation, a message channel is established; the producer generates a message and puts it into the message channel, and the consumer obtains the message through the message channel, thereby consuming it.
While a message service is in use, an upgrade of the message service end (the entity providing the message service, such as a message channel or a message service platform) or other factors can cause messages sent by the producer to be blocked or lost. The consumer then cannot obtain messages in time, which seriously disrupts the business flow of an internet company and causes the company various losses.
Disclosure of Invention
In view of the foregoing prior art, embodiments of the present application disclose a disaster recovery method and apparatus for a message service, a computer-readable storage medium, and an electronic device, which ensure normal consumption of messages when a problem occurs at the message service end and improve the robustness of the message service system.
A disaster recovery method for a message service is provided, in which a main channel and a backup channel are set up and a producer sends each message to both the main channel and the backup channel; the method further comprises the following steps:
in the normal state, messages of the main channel are consumed normally, and each successfully consumed message is asynchronously stored in a designated container after it is consumed, while messages of the backup channel undergo empty consumption; the message processing performance of the main channel is monitored in the normal state, and the disaster recovery state is entered when that performance cannot meet the normal consumption requirement; wherein the designated container can store any given message only once;
in the disaster recovery state, the backup channel backtracks to the point a duration T1 before entry into the disaster recovery state and consumes messages normally from that point on; messages of the main channel are consumed normally; before any message in the main channel or the backup channel is consumed, it is determined whether that message is already stored in the designated container; if not, the message is consumed normally; if so, the message is not consumed again and processing moves on to the next message; the message processing performance of the main channel is monitored in the disaster recovery state, and the normal state is resumed when that performance meets the normal consumption requirement;
wherein T1 is a preset duration.
Preferably, asynchronously storing the successfully consumed message in the designated container comprises:
performing an asynchronous Redis SETNX operation on the successfully consumed message.
Preferably, determining whether any message is already stored in the designated container comprises:
performing a synchronous Redis SETNX operation on the message; if the operation succeeds, determining that the message was not previously stored, and if the operation fails, determining that the message is already stored.
Preferably, monitoring the message processing performance of the main channel in the normal state comprises: monitoring the message consumption counts of the main channel and the backup channel;
determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement comprises: if the message consumption count of the main channel lags that of the backup channel by N within a set period T2, determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement; wherein T2 is a preset duration and N is a preset positive integer.
Preferably, monitoring the message processing performance of the main channel in the disaster recovery state comprises: counting the total number of messages consumed in the main channel and the backup channel, and counting the number of times the determination finds the message already stored, this number being a first count;
determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement comprises: when the total number of messages consumed within a set period T3 equals the first count, determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement; wherein T3 is a preset duration.
Preferably, T1 > T2.
A disaster recovery system for a message service comprises a main channel processing unit, a backup channel processing unit, and a monitoring unit;
the main channel processing unit is configured to consume messages of the main channel normally in the normal state and to asynchronously store each successfully consumed message in a designated container after it is consumed; it is further configured, in the disaster recovery state, to consume messages of the main channel normally and, before consuming any message, to determine whether that message is already stored in the designated container; if not, the message is consumed normally; if so, the message is not consumed again and processing moves on to the next message; wherein the designated container can store any given message only once;
the backup channel processing unit is configured to perform empty consumption on messages of the backup channel in the normal state; it is further configured, in the disaster recovery state, to backtrack to the point a duration T1 before entry into the disaster recovery state, to consume messages normally from that point on, and, before consuming any message, to determine whether that message is already stored in the designated container; if not, the message is consumed normally; if so, the message is not consumed again and processing moves on to the next message;
the monitoring unit is configured to monitor the message processing performance of the main channel in the normal state and, when that performance cannot meet the normal consumption requirement, to determine entry into the disaster recovery state and notify the main channel processing unit and the backup channel processing unit; it is further configured to monitor the message processing performance of the main channel in the disaster recovery state and, when that performance meets the normal consumption requirement, to determine entry into the normal state and notify the main channel processing unit and the backup channel processing unit;
wherein T1 is a preset duration.
Preferably, in the main channel processing unit, asynchronously storing the successfully consumed message in the designated container comprises:
performing an asynchronous Redis SETNX operation on the successfully consumed message.
Preferably, in the main channel processing unit and the backup channel processing unit, determining whether any message is already stored in the designated container comprises:
performing a synchronous Redis SETNX operation on the message; if the operation succeeds, determining that the message was not previously stored, and if the operation fails, determining that the message is already stored.
Preferably, in the monitoring unit,
monitoring the message processing performance of the main channel in the normal state comprises: monitoring the message consumption counts of the main channel and the backup channel;
determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement comprises: if the message consumption count of the main channel lags that of the backup channel by N within a set period T2, determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement; wherein T2 is a preset duration and N is a preset positive integer.
Preferably, in the monitoring unit,
monitoring the message processing performance of the main channel in the disaster recovery state comprises: counting the total number of messages consumed in the main channel and the backup channel, and counting the number of times the determination finds the message already stored, this number being a first count;
determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement comprises: when the total number of messages consumed within a set period T3 equals the first count, determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement; wherein T3 is a preset duration.
Preferably, T1 > T2.
A computer readable storage medium having stored thereon computer instructions, wherein said instructions when executed by a processor implement any of the above-mentioned message service disaster recovery methods.
A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the disaster recovery method of the messaging service as described in any of the above.
To sum up, the embodiments of the application disclose a disaster recovery method for a message service in which a main message channel and a backup message channel are set up and the producer sends each message to both channels. In the normal state, messages of the main channel are consumed normally, and each message is asynchronously stored in a designated container after it is successfully consumed, recording that the message has been consumed; messages of the backup channel undergo empty consumption. When the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement, the disaster recovery state is entered and disaster recovery processing is performed. In the disaster recovery state, the backup channel switches to normal consumption, backtracks to the point a duration T1 before entry into the disaster recovery state, and consumes messages normally from that point on. Before consuming a message, it determines whether the message is already stored in the designated container. If so, the message has already been consumed; it is not consumed again and processing moves on to the next message. If not, the message has not yet been consumed, and it is consumed normally. Because the backup channel checks whether the designated container holds each message, messages that the main channel failed to consume within the backtracking period are consumed normally through the backup channel, which preserves system performance.
In the disaster recovery state, the main channel performs the same processing as the backup channel. When the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement, the main channel has returned to normal, so the disaster recovery state ends and the normal state is resumed. In the disaster recovery state, on the one hand, the backup channel consumes messages normally, so a normal message service is still provided through the backup channel when the main channel has a performance problem; on the other hand, monitoring the results of the check, made in both channels, of whether the designated container holds a given message makes it possible to sense promptly that the main channel has recovered, so that the processing mode used before disaster recovery can be restored. With the processing of the application, the system still provides a normal message service when the main channel has a performance problem, and messages that the main channel failed to consume during the backtracking period are consumed normally, which greatly improves the robustness of the message service system.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the processing of the entire message service system in the normal state;
FIG. 2 is a schematic diagram of the processing of the entire message service system in the disaster recovery state;
FIG. 3 is a schematic diagram of the basic structure of the message service disaster recovery device in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not explicitly listed or inherent to such process, method, article, or apparatus.
The technical solution of the present invention will be described in detail with specific examples. Several of the following embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.
The basic idea of the application is as follows. A main message channel and a backup message channel are set up, so that when the main channel has a performance problem, the backup channel is used to keep messages being consumed normally. In addition, each successfully consumed message is placed in the designated container to mark that it has been consumed. In the disaster recovery state, the backup channel backtracks over a period of time and reprocesses the messages from that period; by checking whether each message is already stored in the designated container, it finds the messages that were not consumed during the backtracking period and consumes them promptly. The designated container can store any given message only once, which is what allows already consumed messages to be identified in the disaster recovery state.
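The store-once property of the designated container can be illustrated with a minimal in-memory sketch. The class name `DesignatedContainer` and the method `set_if_absent` are hypothetical names chosen for illustration; the description itself names Redis SETNX as one possible realization, and this stand-in only mimics the set-if-absent behavior so that it runs without a Redis server.

```python
import threading


class DesignatedContainer:
    """In-memory stand-in for the designated container (hypothetical name).

    It mimics set-if-absent semantics: a message ID can be stored only once.
    """

    def __init__(self):
        self._stored = set()
        self._lock = threading.Lock()

    def set_if_absent(self, message_id: str) -> bool:
        """Store message_id. Return True if newly stored, False if already present."""
        with self._lock:
            if message_id in self._stored:
                return False
            self._stored.add(message_id)
            return True


container = DesignatedContainer()
first = container.set_if_absent("msg-1")   # not stored before, so the store succeeds
second = container.set_if_absent("msg-1")  # already stored, so the store fails
```

A failed second store is exactly the signal the disaster recovery state relies on to recognize a message that has already been consumed.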
Specific implementations of the present application are described in detail below.
In the message service system of the application, two message channels are set up: a main channel and a backup channel. The producer generates a message and sends it to both the main channel and the backup channel. The processing of the two channels has two states: a normal state and a disaster recovery state. When the main channel performs normally, both channels work in the normal state and the message service is provided by the main channel; when the main channel has a performance problem, both channels switch to the disaster recovery state and the message service is provided mainly by the backup channel. The processing of the main channel and the backup channel in each state is described in detail below.
FIG. 1 is a schematic diagram of the processing of the entire message service system in the normal state. As shown in FIG. 1, the processing in the normal state includes:
1) Messages of the main channel are consumed normally, that is, they are delivered through interaction with the consumer. Messages of the backup channel undergo empty consumption, that is, each message is marked as consumed as soon as it enters the backup channel.
2) After each successful consumption of a message A, the main channel asynchronously stores message A in the designated container (which can store any given message only once), thereby marking message A as consumed. The asynchronous store can be implemented with an asynchronous Redis SETNX operation, in which case Redis is the designated container. SETNX succeeds only once for a given message: if SETNX has already been performed on message A, any later SETNX on message A reports failure. The store is performed asynchronously (for example, as an asynchronous Redis SETNX operation) so that it affects the normal consumption processing of the main channel as little as possible.
3) The message processing performance of the main channel is monitored in the normal state, and the disaster recovery state is entered when that performance cannot meet the normal consumption requirement. Any existing monitoring approach may be used, with existing parameters that reflect message processing performance determining whether the normal consumption requirement is met. Preferably, the application also provides its own way of detecting insufficient main-channel performance: the message consumption counts of the main channel and the backup channel are monitored, and if the consumption count of the main channel lags that of the backup channel by N within a set period T2, the processing capability of the main channel has fallen well below its normal level, and it is determined that the main channel in the normal state cannot meet the normal consumption requirement. T2 is a preset duration and N is a preset positive integer; both can be configured as needed. This monitoring detects a decline in main-channel performance promptly and, once the decline reaches the configured limit, starts disaster recovery processing and enters the disaster recovery state.
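The lag check in step 3 can be sketched as a sliding-window comparison of the two consumption counts. This is only one possible reading: the class `LagMonitor`, its method names, the use of explicit timestamps, and the interpretation of "lags by N" as "lags by at least N" are all assumptions made for illustration, not details given in the description.

```python
from collections import deque


class LagMonitor:
    """Hypothetical sketch of the normal-state monitor (names are illustrative)."""

    def __init__(self, t2_seconds: float, n: int):
        self.t2 = t2_seconds   # window length T2
        self.n = n             # lag threshold N
        self._events = deque() # (timestamp, channel) pairs, appended in time order

    def record(self, channel: str, now: float) -> None:
        """Record one consumption ("main") or empty consumption ("backup")."""
        self._events.append((now, channel))

    def should_enter_disaster_recovery(self, now: float) -> bool:
        # Drop events that fall outside the window [now - T2, now].
        while self._events and self._events[0][0] < now - self.t2:
            self._events.popleft()
        main = sum(1 for _, ch in self._events if ch == "main")
        backup = sum(1 for _, ch in self._events if ch == "backup")
        # Assumption: "lags by N" is read as "lags by at least N".
        return backup - main >= self.n


monitor = LagMonitor(t2_seconds=10.0, n=3)
for t in range(5):
    monitor.record("backup", now=float(t))    # backup empty-consumes 5 messages
    if t < 2:
        monitor.record("main", now=float(t))  # main channel stalls after 2 messages
lagging = monitor.should_enter_disaster_recovery(now=5.0)
```

Because the backup channel only marks messages as consumed, its count tracks the rate at which the producer sends messages, which makes it a convenient baseline for the main channel.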
FIG. 2 is a schematic diagram of the processing of the entire message service system in the disaster recovery state. As shown in FIG. 2, the disaster recovery processing in the disaster recovery state includes:
1) The backup channel switches to consuming messages normally, that is, it delivers messages through interaction with the consumer; the main channel continues to consume messages normally. Normal consumption further includes saving the corresponding message to the designated container after the message is successfully consumed, which may specifically be done by performing a synchronous Redis SETNX operation.
2) The backup channel backtracks to the point a duration T1 before entry into the disaster recovery state and consumes messages normally from that point on. Detecting a performance problem in the main channel generally takes some time, and during that time the main channel, although nominally in the normal state, is already failing to process messages promptly. Backtracking gives the messages generated during the period T1 before entry into the disaster recovery state a chance to be rescued. T1 may be greater than the statistical period of one monitoring pass over the main channel, for example greater than or equal to the aforementioned T2.
3) In both the main channel and the backup channel, before a message B is consumed, it is determined whether message B is already stored in the designated container. If not, message B is consumed normally; if so, message B is not consumed again and processing moves on to the next message. The determination can be implemented by performing a synchronous Redis SETNX operation on message B. When the SETNX operation succeeds, message B was not previously stored in the designated container, that is, message B has not been consumed, and the successful SETNX itself stores message B in the designated container, marking it as consumed. When the SETNX operation fails, message B is already stored in the designated container, that is, message B has already been consumed. During disaster recovery processing the SETNX operation is performed synchronously and before the message is consumed, so a failed SETNX shows that SETNX has already been performed on the message, that is, the message has been consumed and marked. On the one hand, this finds the messages that the main channel did not process in time during the backtracking period while preventing already consumed messages from being consumed again; on the other hand, for messages generated afterward, it ensures that no message is consumed twice even though both the main channel and the backup channel are consuming normally.
4) The message processing performance of the main channel is monitored in the disaster recovery state, and the normal state is resumed when that performance meets the normal consumption requirement. Any existing monitoring approach may be used, with existing parameters that reflect message processing performance determining whether the normal consumption requirement is met. Preferably, the application also provides a way of detecting recovery of the main channel in the disaster recovery state by means of the SETNX operation: the total number of messages consumed in the main channel and the backup channel is counted, and the number of times the check finds the message already stored in the designated container is counted and recorded as a first count; when the total number of messages consumed within a set period T3 equals the first count, it is determined that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement. The check of whether a message is stored in the designated container can be made by performing a synchronous Redis SETNX operation on the message. If the operation succeeds, the message was not previously stored in the container, and the successful SETNX marks it as consumed; if the operation fails, the message is already stored in the container, that is, it has been consumed and marked as consumed.
Accordingly, the first count is the number of failed synchronous Redis SETNX operations.
In more detail, a synchronous Redis SETNX operation fails because the corresponding message (for example, message C) has already been consumed. A failure therefore means that two SETNX operations were performed on message C within T3, one by the main channel and one by the backup channel, which in turn means that the main channel attempted to consume message C. If the total number of consumed messages equals the number of SETNX failures, the main channel attempted to consume every consumed message, and any of the successful consumptions may have come from the main channel. It can be concluded that the main channel is synchronously performing SETNX on every message, that the consumer is receiving the messages accordingly, and that the main channel's processing within T3 keeps up with the rate at which messages are generated; in other words, the message processing performance of the main channel has recovered. T3 is a preset duration.
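The disaster recovery steps above (backtracking by T1, check-and-mark before consumption, and the recovery test that compares total consumption with the number of failed checks) can be sketched together as follows. All names here (`Message`, `RecoveryCounter`, `backtrack`, `consume_with_dedupe`) are hypothetical, a plain Python `set` stands in for the designated container, and the guard that treats an empty window as "not recovered" is an added assumption rather than something the description states.

```python
from dataclasses import dataclass


@dataclass
class Message:
    message_id: str
    timestamp: float


class RecoveryCounter:
    """Counts consumptions and failed set-if-absent checks within one T3 window."""

    def __init__(self):
        self.total_consumed = 0
        self.setnx_failures = 0  # the "first count" in the description

    def main_channel_recovered(self) -> bool:
        # Assumption: an empty window is not treated as evidence of recovery.
        return self.total_consumed > 0 and self.total_consumed == self.setnx_failures


def backtrack(backlog, entered_at: float, t1: float):
    """Return the backup-channel messages from T1 before entry onward."""
    return [m for m in backlog if m.timestamp >= entered_at - t1]


def consume_with_dedupe(message, stored, deliver, counter) -> bool:
    """Check-and-mark before consuming; `stored` stands in for the container."""
    if message.message_id in stored:
        counter.setnx_failures += 1   # already consumed: skip it
        return False
    stored.add(message.message_id)    # the successful store marks it consumed
    deliver(message)
    counter.total_consumed += 1
    return True


# Backup channel replays the last T1 = 5 time units after entry at t = 9.
backlog = [Message(f"m{i}", float(i)) for i in range(10)]
stored = {f"m{i}" for i in range(6)}  # main channel consumed m0..m5 before stalling
delivered = []
counter = RecoveryCounter()
for msg in backtrack(backlog, entered_at=9.0, t1=5.0):
    consume_with_dedupe(msg, stored, delivered.append, counter)

# Later window: both channels process every new message, so each one is
# consumed once and rejected once.
window = RecoveryCounter()
for msg in [Message("m10", 10.0), Message("m11", 11.0)]:
    consume_with_dedupe(msg, stored, delivered.append, window)  # main channel
    consume_with_dedupe(msg, stored, delivered.append, window)  # backup channel
recovered = window.main_channel_recovered()
```

In the first loop the replay skips m4 and m5 and delivers m6 through m9, so consumptions outnumber failed checks and the main channel is not yet considered recovered. In the second window every consumption is matched by one failed check from the other channel, which is the equality the description uses as its recovery signal.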
After the system returns to the normal state through the above processing, the normal-state processing described earlier is carried out, mainly as follows: the backup channel switches back to empty consumption of the messages it receives, and the main channel again stores successfully consumed messages in the designated container asynchronously (for example, the SETNX operation switches back to asynchronous processing), so that consumption performance is affected as little as possible; and the normal-state performance monitoring of the main channel is restored.
In order to verify the effectiveness of the disaster recovery method, a disaster recovery drill can be carried out according to the method. Specifically, specially identified messages can be used to simulate service traffic, and specific values of T1, T2 (T2 is generally less than or equal to T1), T3, and N are set; the main channel is then blocked, and it is checked whether disaster recovery starts normally after time T1; the channel is then released, and it is checked whether disaster recovery ends normally; finally, after disaster recovery has ended normally, the simulated messages are tallied to determine whether the service flowed normally, and the effect of the disaster recovery scheme is evaluated. Drills have shown that the method detects problems promptly, starts disaster recovery processing normally, detects promptly that the main channel has returned to normal, and ends the disaster recovery state normally; because the service keeps flowing normally throughout, the scheme achieves its expected effect.
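The final tallying step of such a drill might look like the sketch below. The function `summarize_drill`, the `drill-` prefix used as the special identifier, and the criterion that every simulated message must be delivered exactly once are assumptions made for illustration; the description does not specify how the simulated messages are identified or evaluated.

```python
from collections import Counter


def summarize_drill(sent_ids, delivered_ids, marker="drill-"):
    """Tally specially identified drill messages (hypothetical helper).

    A drill message counts as flowing normally if it was delivered exactly once.
    """
    sent = [i for i in sent_ids if i.startswith(marker)]
    counts = Counter(i for i in delivered_ids if i.startswith(marker))
    missing = [i for i in sent if counts[i] == 0]
    duplicated = [i for i in sent if counts[i] > 1]
    return {
        "missing": missing,
        "duplicated": duplicated,
        "ok": not missing and not duplicated,
    }


sent = ["drill-1", "drill-2", "drill-3"]
delivered = ["drill-1", "drill-3", "drill-3", "order-9"]  # order-9 is real traffic
report = summarize_drill(sent, delivered)
clean = summarize_drill(sent, ["drill-2", "drill-1", "drill-3"])
```

Here `report` flags `drill-2` as lost and `drill-3` as consumed twice, while `clean` shows the outcome a successful drill would be expected to produce.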
The above is a specific implementation of the message service disaster recovery method in the present application. The application also provides a message disaster recovery device which can be used to implement the disaster recovery method. Fig. 3 is a schematic diagram of the basic structure of the disaster recovery device. As shown in Fig. 3, the disaster recovery device includes a main channel processing unit, a backup channel processing unit and a monitoring unit.
The main channel processing unit is used for normally consuming the messages of the main channel in the normal state, where normal consumption includes asynchronously storing each successfully consumed message in the corresponding designated container. It is also used for normally consuming the messages of the main channel in the disaster recovery state: before consuming a message, it determines whether that message is already stored in the designated container; if not, it consumes the message normally; if so, it skips the message and moves on to the next message for consumption processing.
The backup channel processing unit is used for performing empty consumption on the messages of the backup channel in the normal state. In the disaster recovery state, it is also used for backtracking to the point a time T1 before the disaster recovery state was entered and normally consuming the messages from that point on: before consuming a message, it determines whether that message is already stored in the designated container; if not, it consumes the message normally; if so, it skips the message and moves on to the next message for consumption processing.
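The check-before-consume step shared by both processing units can be sketched as follows. To keep the example self-contained, an in-memory class stands in for redis; `InMemoryContainer`, `set_if_absent` and `consume_in_disaster_state` are hypothetical names. With a real redis client, a set-if-not-exists call (for instance `set(name, value, nx=True)` in redis-py) would play the role of `set_if_absent`.

```python
import threading


class InMemoryContainer:
    """Stand-in for the designated container; set_if_absent mimics SETNX."""

    def __init__(self):
        self._keys = set()
        self._lock = threading.Lock()

    def set_if_absent(self, key: str) -> bool:
        # Returns True only for the first caller that stores this key.
        with self._lock:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True


def consume_in_disaster_state(messages, container, handler):
    """Consume each (msg_id, payload) unless it is already stored; return skip count."""
    skipped = 0
    for msg_id, payload in messages:
        if container.set_if_absent(msg_id):
            handler(payload)   # not stored yet: consume normally
        else:
            skipped += 1       # already stored: move on to the next message
    return skipped
```

Note that in this sketch the key is written before the handler runs, so a handler failure would leave the message marked as stored; how such failures are handled is outside what the text specifies.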
The monitoring unit is used for monitoring the message processing performance of the main channel in the normal state and, when that performance cannot meet the normal consumption requirement, determining to enter the disaster recovery state and notifying the main channel processing unit and the backup channel processing unit. It is also used for monitoring the message processing performance of the main channel in the disaster recovery state and, when that performance meets the normal consumption requirement, determining to enter the normal state and notifying the main channel processing unit and the backup channel processing unit. Here, T1 is a preset time.
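The state switching performed by the monitoring unit can be pictured as a two-state machine that notifies the processing units on each transition. The sketch below is illustrative only; `State`, `MonitoringUnit`, `evaluate` and the callback-style listeners are assumed names, and the boolean input abstracts away how performance is actually measured.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    DISASTER_RECOVERY = "disaster_recovery"


class MonitoringUnit:
    """Hypothetical monitor: switches state and notifies listeners on each change."""

    def __init__(self, listeners):
        self.state = State.NORMAL
        self._listeners = list(listeners)  # e.g. main and backup processing units

    def evaluate(self, main_meets_demand: bool) -> State:
        if self.state is State.NORMAL and not main_meets_demand:
            self._switch(State.DISASTER_RECOVERY)
        elif self.state is State.DISASTER_RECOVERY and main_meets_demand:
            self._switch(State.NORMAL)
        return self.state

    def _switch(self, new_state: State) -> None:
        self.state = new_state
        for notify in self._listeners:
            notify(new_state)
```

Listeners are only called on an actual transition, so repeated evaluations in the same state do not re-notify the processing units.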
In addition, preferably, in the monitoring unit, monitoring the message processing performance of the main channel in the normal state may specifically include: monitoring the numbers of messages consumed by the main channel and by the backup channel. Determining that the message processing performance of the main channel cannot meet the normal consumption requirement in the normal state may specifically include: if, within the set time period T2, the message consumption of the main channel lags behind the message consumption of the backup channel by N, determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement. Here, T2 is a preset time, and N is a preset positive integer.
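One possible reading of the lag criterion is sketched below: consumption events of each channel are kept for the last T2 seconds, and the main channel is considered lagging when the backup channel has consumed at least N more messages in that window. The class `LagDetector`, its method names, and the "at least N" interpretation are assumptions for illustration.

```python
from collections import deque


class LagDetector:
    """Hypothetical detector for the normal state, using a T2-second window."""

    def __init__(self, t2_seconds: float, n: int):
        self.t2 = t2_seconds
        self.n = n
        self._main = deque()    # consumption timestamps of the main channel
        self._backup = deque()  # empty-consumption timestamps of the backup channel

    def record(self, channel: str, now: float) -> None:
        (self._main if channel == "main" else self._backup).append(now)

    def main_is_lagging(self, now: float) -> bool:
        for events in (self._main, self._backup):
            # Keep only events from the most recent T2 seconds.
            while events and events[0] <= now - self.t2:
                events.popleft()
        # Assumption: "lags by N" is read as a difference of at least N messages.
        return len(self._backup) - len(self._main) >= self.n
```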
In the monitoring unit, monitoring the message processing performance of the main channel in the disaster recovery state may specifically include: counting the total number of messages consumed in the main channel and the backup channel, and counting a first count, namely the number of times the judgment determined that a message was already stored. Determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement may specifically include: when the total number of messages consumed within the set time period T3 is equal to the first count, determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement. Here, T3 is a preset time.
Further, preferably, T1> T2.
With the disaster recovery method and device for a message service described above, during normal consumption a SETNX operation is performed asynchronously in redis for each message, which prevents a message that has already been consumed from being consumed again when disaster recovery is triggered and the messages are backtracked. Meanwhile, the message consumption in the main and backup message channels over a recent period of time is monitored in real time and abnormal conditions are counted, so that anomalies are detected in time and the disaster recovery state is entered.
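The asynchronous recording used in the normal state can be illustrated with a small thread-pool sketch. Here a plain dictionary stands in for redis, and `dict.setdefault` mimics the set-if-absent behavior of SETNX; `AsyncRecorder` and its methods are hypothetical names, and a production system would use a real redis client instead.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncRecorder:
    """Hypothetical normal-state consumer: handle first, record off the hot path."""

    def __init__(self, store: dict, workers: int = 1):
        self._store = store
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def consume(self, msg_id, payload, handler):
        handler(payload)  # normal consumption happens synchronously
        # Recording is submitted to a worker so it does not delay the next message.
        return self._pool.submit(self._store.setdefault, msg_id, 1)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

The returned future lets a caller wait for the record when needed, while the usual path simply continues with the next message.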
Embodiments of the present application further provide a computer-readable storage medium storing instructions which, when executed by a processor, perform the steps of the disaster recovery method for a message service described above. In practical applications, the computer-readable medium may be included in each device/apparatus/system of the above embodiments, or may exist separately without being assembled into the device/apparatus/system.
According to embodiments disclosed herein, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example and without limitation: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing, without limiting the scope of the present disclosure. In the embodiments disclosed herein, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Embodiments of the present application further provide a computer program product, which includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the method for disaster recovery of the message service as described above can be implemented.
The flowchart and block diagrams in the figures of the present application illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments disclosed herein. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined in various ways, even if such combinations are not explicitly recited in the present application. In particular, the features recited in the various embodiments and/or claims of the present application may be combined and/or coupled in various ways without departing from the spirit and teachings of the present application, and all such combinations fall within the scope of the present disclosure.
The principles and embodiments of the present invention are explained herein using specific examples, which are provided only to help in understanding the method and the core idea of the present invention and are not intended to limit the present application. It will be appreciated by those skilled in the art that changes may be made to the embodiments without departing from the principles, spirit and scope of the invention, and that all such modifications, equivalents and improvements falling within the scope of the invention are intended to be protected by the claims.

Claims (10)

1. A disaster recovery method of message service is characterized in that a main channel and a backup channel are set, and a producer sends a message to the main channel and the backup channel; the method further comprises the following steps:
in a normal state, the messages of the main channel are consumed normally, each successfully consumed message is asynchronously stored in a designated container after it has been consumed successfully, and the messages of the backup channel are subjected to empty consumption; the message processing performance of the main channel in the normal state is monitored, and a disaster recovery state is entered when the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement; wherein the designated container can store only one copy of the same message;
in the disaster recovery state, the backup channel backtracks to the point a time T1 before the disaster recovery state was entered and normally consumes the messages from that point on; the messages of the main channel are consumed normally; before any message in the main channel or the backup channel is consumed, it is determined whether that message is stored in the designated container; if not, the message is consumed normally; if so, the message is not consumed and processing moves on to the next message; the message processing performance of the main channel in the disaster recovery state is monitored, and the normal state is returned to when the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement;
wherein T1 is a preset time.
2. The method of claim 1, wherein asynchronously storing the messages that are successfully consumed in the corresponding designated container comprises:
executing an asynchronous redis SETNX operation on the successfully consumed message.
3. The method of claim 1, wherein said determining whether any of the messages has been saved in the designated container comprises:
executing a synchronous redis SETNX operation for the message; if the operation succeeds, determining that the message has not been saved, and if the operation fails, determining that the message has been saved.
4. The method according to any one of claims 1 to 3, wherein the monitoring of the message processing performance of the main channel in a normal state comprises: monitoring the message consumption numbers of the main channel and the backup channel;
the method for determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption demand comprises the following steps: if the message consumption of the main channel lags behind the message consumption of the backup channel by N within a set time period T2, determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement; wherein, T2 is a preset time, and N is a preset positive integer.
5. The method according to any one of claims 1 to 3, wherein the monitoring of the message processing performance of the main channel in the disaster recovery state comprises: counting the total number of messages consumed in the main channel and the backup channel, and counting a first count, namely the number of times the judgment determined that a message was already stored;
the manner of determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement comprises: when the total number of messages consumed within a set time period T3 is equal to the first count, determining that the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement; wherein T3 is a preset time.
6. The method of claim 4, wherein T1> T2.
7. A disaster recovery system of message service, characterized by comprising a main channel processing unit, a backup channel processing unit and a monitoring unit;
the main channel processing unit is used for normally consuming the messages of the main channel in a normal state and asynchronously storing each successfully consumed message in a designated container after it has been consumed successfully; it is also used for normally consuming the messages of the main channel in a disaster recovery state, determining, before consuming any message, whether that message is stored in the designated container, consuming the message normally if it is not stored, and, if it is stored, not consuming the message and moving on to the next message for consumption processing; wherein the designated container can store only one copy of the same message;
the backup channel processing unit is used for performing empty consumption on the messages of the backup channel in the normal state; it is also used for backtracking, in the disaster recovery state, to the point a time T1 before the disaster recovery state was entered and normally consuming the messages from that point on, determining, before consuming any message, whether that message is stored in the designated container, consuming the message normally if it is not stored, and moving on to the next message for consumption processing if it is stored;
the monitoring unit is used for monitoring the message processing performance of the main channel in the normal state and, when the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement, determining to enter the disaster recovery state and notifying the main channel processing unit and the backup channel processing unit; it is also used for monitoring the message processing performance of the main channel in the disaster recovery state and, when the message processing performance of the main channel in the disaster recovery state meets the normal consumption requirement, determining to enter the normal state and notifying the main channel processing unit and the backup channel processing unit;
wherein, T1 is a preset time.
8. System according to claim 7, characterized in that in the monitoring unit,
the monitoring of the message processing performance of the main channel in the normal state includes: monitoring the message consumption numbers of the main channel and the backup channel;
the method for determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption demand comprises the following steps: if the message consumption of the main channel lags behind that of the backup channel by N within a set time period T2, determining that the message processing performance of the main channel in the normal state cannot meet the normal consumption requirement; wherein, T2 is a preset time, and N is a preset positive integer.
9. A computer-readable storage medium having stored thereon computer instructions, which when executed by a processor, implement a method for disaster recovery of a messaging service according to any of claims 1 to 6.
10. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the disaster recovery method of a messaging service according to any of claims 1 to 6.
CN202210446627.4A 2022-04-26 2022-04-26 Disaster recovery method and device for message service Active CN115001998B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210446627.4A CN115001998B (en) 2022-04-26 2022-04-26 Disaster recovery method and device for message service

Publications (2)

Publication Number Publication Date
CN115001998A true CN115001998A (en) 2022-09-02
CN115001998B CN115001998B (en) 2024-02-23

Family

ID=83024439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210446627.4A Active CN115001998B (en) 2022-04-26 2022-04-26 Disaster recovery method and device for message service

Country Status (1)

Country Link
CN (1) CN115001998B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130297808A1 (en) * 2011-01-06 2013-11-07 Huawei Technologies Co., Ltd Backup Method and Backup Device for TCP Connection
CN110581782A (en) * 2019-09-17 2019-12-17 中国联合网络通信集团有限公司 Disaster tolerance data processing method, device and system
CN111381987A (en) * 2020-03-13 2020-07-07 北京金山云网络技术有限公司 Message processing method and device, electronic equipment and medium
CN112667414A (en) * 2020-12-23 2021-04-16 平安普惠企业管理有限公司 Message queue-based message consumption method and device, computer equipment and medium
US20210218699A1 (en) * 2020-01-14 2021-07-15 Capital One Services, Llc Techniques to provide streaming data resiliency utilizing a distributed message queue system
CN113722117A (en) * 2020-11-10 2021-11-30 北京沃东天骏信息技术有限公司 Message queue processing method, thread pool parameter adjusting method, device and equipment
CN113742107A (en) * 2021-09-03 2021-12-03 广州新丝路信息科技有限公司 Processing method for avoiding message loss in message queue and related equipment
CN113900842A (en) * 2021-12-10 2022-01-07 飞狐信息技术(天津)有限公司 Message consumption method and device, electronic equipment and computer storage medium
CN114020529A (en) * 2021-10-29 2022-02-08 恒安嘉新(北京)科技股份公司 Backup method and device of flow table data, network equipment and storage medium
CN114116262A (en) * 2021-12-02 2022-03-01 北京宇信科技集团股份有限公司 Processing method, device, medium and equipment for distributed asynchronous data communication
CN114363407A (en) * 2021-12-24 2022-04-15 上海软素科技有限公司 Message service method and device, readable storage medium and electronic equipment

Also Published As

Publication number Publication date
CN115001998B (en) 2024-02-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant