CN106209447B - Fault handling method and device for distributed caching - Google Patents

Fault handling method and device for distributed caching

Info

Publication number
CN106209447B
Authority
CN
China
Prior art keywords
caching
master cache
master
failure
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610529541.2A
Other languages
Chinese (zh)
Other versions
CN106209447A (en)
Inventor
雷亚武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN CHUANGMENG TIANDI TECHNOLOGY CO LTD
Original Assignee
SHENZHEN CHUANGMENG TIANDI TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN CHUANGMENG TIANDI TECHNOLOGY CO LTD
Priority to CN201610529541.2A
Publication of CN106209447A
Application granted
Publication of CN106209447B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The disclosure provides a fault handling method and device for a distributed cache. The method includes: monitoring the master cache instances running in the distributed cache to identify a failed master cache instance; replacing the failed master cache instance with one of its slave cache instances and updating the master-slave state between that slave cache instance and the failed master cache instance accordingly; and listening for the update of the master-slave state and, according to that update, modifying the service address used for cache data access in the proxy configuration. The method and device switch over a failed master cache instance automatically, which makes the cache instances highly available and improves the risk resistance of Redis.

Description

Fault handling method and device for distributed caching
Technical field
The present disclosure relates to the technical field of distributed caching, and in particular to a fault handling method and device for a distributed cache.
Background
As internet traffic keeps growing, a single cache server facing large-scale data access is often overloaded, which leads to excessive response latency. Existing solutions mostly use distributed caching to store and access data at scale. Distributed caching uses a consistent hashing algorithm to spread data relatively evenly across multiple cache servers. Redis, a key-value storage system, is widely used as a distributed cache storage system because of its efficient data synchronization and simple operation commands.
At present, Redis is generally deployed with a master cache server and a slave cache server. The Redis cache instance on the master cache server handles the read and write operations, while the Redis cache instance on the slave cache server only backs up the data read and written by the Redis cache instance on the master cache server. This places considerable load on the master cache server.
When the master cache server goes down because of a failure, the master and slave cache servers cannot be switched automatically, and manual intervention is the only option. In special circumstances, for example when maintenance personnel are not on site, the business stalls. The Redis cache instances are therefore not highly available, which greatly reduces the risk resistance of Redis.
Summary of the invention
To solve the problems in the related art that Redis cache instances cannot be made highly available and that the Redis cache has low risk resistance, the present disclosure provides a fault handling method and device for a distributed cache.
A fault handling method for a distributed cache, characterized in that the method includes:
monitoring the master cache instances running in the distributed cache to identify a failed master cache instance;
replacing the failed master cache instance with a slave cache instance of that master cache instance, and updating the master-slave state between the slave cache instance and the failed master cache instance accordingly; and
listening for the update of the master-slave state, and modifying, according to the update of the master-slave state, the service address used for cache data access in the proxy configuration.
A fault handling device for a distributed cache, characterized in that the device includes:
a failure monitoring module, configured to monitor the master cache instances running in the distributed cache and identify a failed master cache instance;
a fault processing module, configured to replace the failed master cache instance with a slave cache instance of that master cache instance, and to update the master-slave state between the slave cache instance and the failed master cache instance accordingly; and
a proxy configuration modification module, configured to listen for the update of the master-slave state and to modify, according to that update, the service address used for cache data access in the proxy configuration.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
While the distributed cache is running, the running master cache instances are monitored and a failed master cache instance is identified. A slave cache instance replaces the failed master cache instance, and the master-slave state between the two is updated accordingly. The update of the master-slave state is listened for, and the service address used for cache data access in the proxy configuration is modified according to that update, so that subsequent data access goes through the modified service address. As a result, a failure of a master cache instance does not affect reads and writes of cached data: when a master cache instance fails, a slave cache instance automatically takes its place. This makes the cache instances highly available and greatly improves the risk resistance of Redis.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings are incorporated in and constitute a part of this specification. They show embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
Fig. 1 is a flow chart of a fault handling method for a distributed cache according to an exemplary embodiment;
Fig. 2 is a flow chart of the step, in the embodiment corresponding to Fig. 1, of monitoring the master cache instances running in the distributed cache and identifying a failed master cache instance;
Fig. 3 is a flow chart of the step, in the embodiment corresponding to Fig. 1, of triggering the fault handling operation so that a slave cache instance of the master cache instance replaces the failed master cache instance, and updating the master-slave state between the slave cache instance and the failed master cache instance accordingly;
Fig. 4 is a flow chart of the step, in the embodiment corresponding to Fig. 3, of selecting a slave cache instance according to the slave cache instance information;
Fig. 5 is a flow chart of the step, in the embodiment corresponding to Fig. 4, of selecting a slave cache instance from the set of normal slave cache instances;
Fig. 6 is a diagram of a specific application scenario of fault handling for a distributed cache;
Fig. 7 is a block diagram of a fault handling device for a distributed cache according to an exemplary embodiment;
Fig. 8 is a block diagram of the failure monitoring module in the embodiment corresponding to Fig. 7;
Fig. 9 is a block diagram of the fault processing module in the embodiment corresponding to Fig. 7;
Fig. 10 is a block diagram of the slave cache selection submodule in the embodiment corresponding to Fig. 9;
Fig. 11 is a block diagram of the selection unit in the embodiment corresponding to Fig. 10.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.
Fig. 1 is a flow chart of a fault handling method for a distributed cache according to an exemplary embodiment. As shown in Fig. 1, the fault handling method may include the following steps.
In step S110, the master cache instances running in the distributed cache are monitored, and a failed master cache instance is identified.
The distributed cache is deployed on machines, and the master cache instances of the distributed cache are implemented as instances running on those machines. In an exemplary embodiment, a machine is a cache server, and one or more cache servers form the server architecture that implements data caching.
A master cache instance is a Redis process running on a cache server. Read and write operations on cached data are performed through the master cache instance.
While a master cache instance is running, a failure may cause it to stop, which in turn affects the corresponding read and write operations on cached data. For this reason, a master-slave monitoring sentinel deployed on a machine monitors all master cache instances on the cache servers and performs maintenance automatically when a master cache instance fails, so that reads and writes of cached data proceed smoothly.
The master-slave monitoring sentinel monitors each master cache instance according to its own status data. The status data contains the information of each master cache instance and the information of its corresponding slave cache instances.
In step S120, a slave cache instance of the master cache instance replaces the failed master cache instance, and the master-slave state between the slave cache instance and the failed master cache instance is updated accordingly.
The cache servers also run slave cache instances corresponding to the master cache instances. A slave cache instance backs up the data handled by its master cache instance.
When a master cache instance fails, the master-slave monitoring sentinel makes a slave cache instance of the failed master cache instance replace it. That is, the slave cache instance is changed into the new master cache instance, the failed master cache instance is changed into a slave cache instance, and the status data of the master-slave monitoring sentinel is updated accordingly.
For example, suppose the master cache instance on a cache server is cache instance A and its slave cache instance is cache instance a. When cache instance A is found to have failed, the master-slave monitoring sentinel makes cache instance a replace cache instance A: cache instance a is changed from a slave cache instance into the master cache instance, and cache instance A is changed from the master cache instance into a slave cache instance.
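The role swap in the A/a example can be sketched in a few lines of Python. This is only an illustration under assumptions: the `SentinelState` class, its `slaves_of` mapping and the `promote` method are hypothetical names, and the status data is modeled as an in-memory dictionary rather than anything the patent specifies.

```python
from dataclasses import dataclass, field


@dataclass
class SentinelState:
    """Hypothetical status data: master instance name -> its slave instance names."""
    slaves_of: dict = field(default_factory=dict)

    def promote(self, failed_master: str, new_master: str) -> None:
        """Make new_master the master and demote failed_master to its slave."""
        slaves = self.slaves_of.get(failed_master, [])
        if new_master not in slaves:
            raise ValueError(f"{new_master} is not a slave of {failed_master}")
        # Remaining slaves follow the new master; the failed master joins them.
        remaining = [s for s in slaves if s != new_master]
        del self.slaves_of[failed_master]
        self.slaves_of[new_master] = remaining + [failed_master]


state = SentinelState({"A": ["a"]})
state.promote("A", "a")
print(state.slaves_of)  # {'a': ['A']}
```

Validating the candidate before mutating the mapping keeps the status data unchanged if the swap is rejected.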
In step S130, the update of the master-slave state is listened for, and the service address used for cache data access in the proxy configuration is modified according to that update.
It should first be noted that the cache egress proxy, which stores the proxy configuration, is deployed on a machine that serves as the interface for data interaction with the cache servers.
The proxy configuration is a configuration file in the cache egress proxy. The service address in the proxy configuration points to the cached data in a cache server, and read and write operations on cached data are carried out according to that service address.
Further, a proxy monitoring client is also deployed on this machine. The proxy monitoring client listens to the master-slave monitoring sentinel and updates the proxy configuration.
Specifically, after the master-slave monitoring sentinel changes the master-slave state of a master cache instance and a slave cache instance, the proxy monitoring client detects the change operation and modifies the service address used for cache data access in the proxy configuration accordingly. Read and write operations on cached data are then carried out by the master cache instance that the new service address points to.
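A minimal sketch of this listener step follows. It assumes the proxy configuration can be reduced to a single `host:port` string and that the client already knows each instance's address; `ProxyConfig`, `ProxyMonitorClient`, `on_master_changed` and the addresses are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class ProxyConfig:
    """Hypothetical proxy configuration holding one service address (host:port)."""
    service_address: str


class ProxyMonitorClient:
    """Reacts to master-slave changes by rewriting the proxy configuration."""

    def __init__(self, config: ProxyConfig, addresses: dict):
        self.config = config
        self.addresses = addresses  # instance name -> "host:port" (assumed known)

    def on_master_changed(self, failed_master: str, new_master: str) -> None:
        # Point cache data access at the instance that is now the master.
        self.config.service_address = self.addresses[new_master]


config = ProxyConfig(service_address="10.0.0.1:6379")
client = ProxyMonitorClient(config, {"A": "10.0.0.1:6379", "a": "10.0.0.2:6379"})
client.on_master_changed(failed_master="A", new_master="a")
print(config.service_address)  # 10.0.0.2:6379
```

In a real deployment the callback would be driven by whatever change notification the sentinel exposes, and the new address would be written back to the proxy's configuration file.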
With the method described above, when a master cache instance fails, the corresponding slave cache instance automatically replaces it. This makes the cache instances highly available and greatly improves the risk resistance of Redis.
Fig. 2 describes step S110 in detail according to an exemplary embodiment. Step S110 may include the following steps.
In step S111, an information request is sent to the master cache instance at a preset time interval.
The information request is a request signal that the master-slave monitoring sentinel sends to the master cache instance in order to obtain the configuration information of the master cache instance. After receiving the information request, the master cache instance responds with a reply to the master-slave monitoring sentinel. The reply contains the configuration information of the master cache instance itself and the information of its corresponding slave cache instances.
The time interval is set in advance. For example, if the time interval is set to 10 seconds, the master-slave monitoring sentinel sends an information request to the master cache instance every 10 seconds.
In step S112, the reply of the master cache instance to the information request is received, and a failed master cache instance is identified according to the reply.
After a normally running master cache instance receives an information request, it replies to it. The reply contains the configuration information of the master cache instance itself and the information of its corresponding slave cache instances. The master-slave monitoring sentinel updates its own status data according to the slave cache instance information.
The master-slave monitoring sentinel determines whether a master cache instance has failed according to how the master cache instance replies to the information requests.
Preferably, each master cache instance is monitored by multiple master-slave monitoring sentinels, each of which sends information requests to the master cache instance.
When one master-slave monitoring sentinel receives no reply from a master cache instance, it queries the other master-slave monitoring sentinels. Only if the number of master-slave monitoring sentinels that received no reply from that master cache instance reaches a preset number is the master cache instance considered to have failed. Multiple master cache instances are monitored in the same way.
For example, three master-slave monitoring sentinels monitor a master cache instance: sentinel 1, sentinel 2 and sentinel 3. The master cache instance is considered to have failed when 2 or more sentinels receive no reply from it. Sentinels 1, 2 and 3 each send information requests to master cache instance A. When sentinel 2 receives no reply from master cache instance A, it queries sentinel 1 and sentinel 3. It turns out that only sentinel 3 received a reply from master cache instance A, while sentinel 1 and sentinel 2 did not. Master cache instance A is therefore considered to have failed.
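The three-sentinel example reduces to counting missing replies against the preset number. The sketch below assumes each sentinel's observation is available as a boolean; the function name `master_has_failed` and the sentinel labels are illustrative only.

```python
def master_has_failed(got_reply: dict, quorum: int) -> bool:
    """Return True when at least `quorum` sentinels received no reply."""
    missing = sum(1 for received in got_reply.values() if not received)
    return missing >= quorum


# Sentinels 1 and 2 received no reply from master cache instance A; sentinel 3 did.
got_reply = {"sentinel-1": False, "sentinel-2": False, "sentinel-3": True}
print(master_has_failed(got_reply, quorum=2))  # True
```

Requiring agreement from several sentinels keeps a single sentinel's network problem from triggering an unnecessary switch.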
Alternatively, the master cache instance may be considered to have failed if it does not reply to the master-slave monitoring sentinel within a preset period of time. Other failure criteria may also be used.
For example, a master-slave monitoring sentinel monitors master cache instance A and sends it an information request every 10 seconds. If no reply from master cache instance A has been received for more than 30 seconds, master cache instance A is considered to have failed.
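The timeout criterion can be sketched as a comparison between the time of the last reply and the current time. The function name `reply_timed_out` and the use of plain seconds as timestamps are assumptions made for illustration; the 30-second threshold is taken from the example above.

```python
def reply_timed_out(last_reply_at: float, now: float, timeout: float = 30.0) -> bool:
    """Return True when more than `timeout` seconds have passed since the last reply."""
    return (now - last_reply_at) > timeout


# Requests go out every 10 seconds; the last reply arrived at t = 0.
print(reply_timed_out(last_reply_at=0.0, now=30.0))  # False: exactly 30 s is not "more than 30 s"
print(reply_timed_out(last_reply_at=0.0, now=40.0))  # True
```

With a 10-second request interval, a 30-second timeout tolerates a few lost replies before the instance is declared failed.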
With the method described above, information requests are sent to each master cache instance and failed master cache instances are identified automatically from their replies, which makes automatic master-slave switching possible.
Fig. 3 describes step S120 in detail according to an exemplary embodiment. Step S120 may include the following steps.
In step S121, the slave cache instance information preset for the failed master cache instance is obtained.
The slave cache instance information of a master cache instance is obtained from the replies of the master cache instance to the information requests.
The reply of a master cache instance to an information request contains the configuration information of the master cache instance itself and the information of its corresponding slave cache instances.
In step S122, a slave cache instance is selected according to the slave cache instance information.
A master cache instance may have one corresponding slave cache instance or several.
When a master cache instance has only one corresponding slave cache instance, that slave cache instance is used directly for the master-slave change. When a master cache instance has multiple corresponding slave cache instances, one of them must be selected for the master-slave change.
In step S123, the selected slave cache instance replaces the failed master cache instance, and the master-slave state between the selected slave cache instance and the failed master cache instance is updated accordingly.
The selected slave cache instance becomes the new master cache instance, and the failed master cache instance becomes a slave cache instance. Once its fault has been cleared and it has resumed running, the failed master cache instance backs up data as a slave cache instance of the new master cache instance.
For example, suppose the master cache instance is cache instance A and its slave cache instance is cache instance a. After cache instance A fails and the master-slave change is made, cache instance a becomes the new master cache instance and cache instance A becomes a slave cache instance of cache instance a. Once cache instance A has been repaired and has resumed running, it acts as the slave cache instance of cache instance a and backs up the data handled by cache instance a.
With the method described above, the slave cache instance information corresponding to a master cache instance is obtained automatically from the replies of the master cache instance. When the master cache instance fails, a slave cache instance is selected and swaps roles with the failed master cache instance. This makes the cache instances highly available and greatly improves the risk resistance of Redis.
Fig. 4 describes step S122 in detail according to an exemplary embodiment. Step S122 may include the following steps.
In step S1221, the slave cache instances corresponding to the failed master cache instance are determined according to the slave cache instance information.
The slave cache instances corresponding to the failed master cache instance are obtained from the slave cache instance information of that master cache instance.
When the slave cache instance information shows only one corresponding slave cache instance, that slave cache instance is used to replace the failed master cache instance. When it shows multiple corresponding slave cache instances, one is selected from among them to replace the failed master cache instance.
In step S1222, an information request is sent to the slave cache instances.
The master-slave monitoring sentinel sends information requests to the slave cache instances of the failed master cache instance. The information request is used to obtain the reply of each slave cache instance to the master-slave monitoring sentinel, and thereby to learn the working state of the slave cache instance.
In step S1223, the replies of the slave cache instances to the information request are received, abnormal slave cache instances are excluded according to the replies, and a set of normal slave cache instances is formed.
An abnormal slave cache instance may be one that has failed. The master-slave monitoring sentinels send information requests to the multiple slave cache instances of the failed master cache instance, and the slave cache instances reply to the sentinels. The reply contains the configuration information of the slave cache instance itself and information about its backed-up data. If the number of master-slave monitoring sentinels that receive a reply from a slave cache instance does not reach a preset number, that slave cache instance is considered to have failed.
An abnormal slave cache instance may also be one whose data backup is abnormal. When the master-slave monitoring sentinel receives the reply of a slave cache instance and finds that the time since the last update of its backed-up data exceeds a preset time range, the data backup of that slave cache instance is considered abnormal.
For example, if the preset time range is 50 seconds and the last update of the data backed up by slave cache instance 1 was 51 seconds ago, the data backup of slave cache instance 1 is considered abnormal.
After the abnormal slave cache instances are excluded, the remaining slave cache instances of the failed master cache instance, which can be used for the master-slave change, form the set of normal slave cache instances.
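Both exclusion criteria can be applied as a single filter over the slave cache instances. The sketch below is an assumption-laden illustration: `SlaveReport`, its fields and `normal_slaves` are hypothetical names, and the per-slave observations are supplied directly instead of being collected from real replies.

```python
from dataclasses import dataclass


@dataclass
class SlaveReport:
    """Hypothetical summary of what the sentinels learned about one slave."""
    name: str
    sentinels_with_reply: int    # how many sentinels received a reply
    seconds_since_backup: float  # time since the backed-up data was last updated


def normal_slaves(reports: list, min_replies: int, max_backup_age: float) -> list:
    """Keep slaves that replied to enough sentinels and whose backup is recent enough."""
    return [
        r.name
        for r in reports
        if r.sentinels_with_reply >= min_replies
        and r.seconds_since_backup <= max_backup_age
    ]


reports = [
    SlaveReport("slave-1", sentinels_with_reply=3, seconds_since_backup=51.0),  # stale backup
    SlaveReport("slave-2", sentinels_with_reply=1, seconds_since_backup=5.0),   # too few replies
    SlaveReport("slave-3", sentinels_with_reply=3, seconds_since_backup=12.0),
]
print(normal_slaves(reports, min_replies=2, max_backup_age=50.0))  # ['slave-3']
```

Slave 1 mirrors the 51-second example: its backup is one second past the 50-second range, so it is excluded even though it is reachable.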
In step S1224, a slave cache instance is selected from the set of normal slave cache instances.
All slave cache instances in the set of normal slave cache instances are running normally. A slave cache instance may be selected according to the priority assigned to each slave cache instance and used to replace the failed master cache instance. It may also be selected according to the numeric value of the slave cache instance ID, or in some other way.
With the method described above, when a master cache instance fails, abnormal slave cache instances are automatically excluded from the slave cache instances of the failed master cache instance. This prevents an abnormal slave cache instance from being changed into the master cache instance and makes fault handling more efficient when a master cache instance fails.
Fig. 5 describes step S1224 in detail according to an exemplary embodiment. Step S1224 may include the following steps.
In step S12241, the priority of each slave cache instance in the set of normal slave cache instances is obtained.
The priorities of the slave cache instances in the set of normal slave cache instances are user-defined. They may be a custom ranking of the slave cache instances, an ordering by the time at which each slave cache instance last updated its backed-up data, or some other priority setting. Several kinds of priority may also be ranked against one another. No limitation is imposed here.
In step S12242, a slave cache instance is selected from the set of normal slave cache instances according to priority.
According to the priority setting, the highest-ranked slave cache instance is selected as the slave cache instance for the master-slave change.
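One possible way to express this selection is shown below. The text leaves the priority scheme open, so the choices here are assumptions: a smaller `priority` number ranks higher, ties are broken by the smaller instance ID, and the dictionary keys and `select_slave` name are hypothetical.

```python
def select_slave(candidates: list) -> str:
    """Pick the highest-ranked slave: lowest priority number, then lowest ID."""
    if not candidates:
        raise ValueError("no normal slave cache instance available")
    best = min(candidates, key=lambda c: (c["priority"], c["id"]))
    return best["name"]


candidates = [
    {"name": "slave-2", "id": 2, "priority": 1},
    {"name": "slave-3", "id": 3, "priority": 0},
    {"name": "slave-4", "id": 4, "priority": 0},
]
print(select_slave(candidates))  # slave-3
```

Using a tuple as the sort key is one way to rank several kinds of priority against one another: each additional criterion becomes another element of the tuple.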
With the method described above, when a master cache instance fails, one slave cache instance is selected automatically from the slave cache instances of the failed master cache instance according to the priority setting. This avoids the situation in which no choice can be made among multiple slave cache instances and makes master-slave switching more efficient when a master cache instance fails.
The fault handling method described above is explained below with reference to a specific application scenario.
Specifically, as shown in Fig. 6, the master-slave monitoring sentinel 200 monitors the master cache instances. When it detects that a master cache instance has failed, it makes the corresponding slave cache instance replace the failed master cache instance. When the proxy monitoring client 300 detects the master-slave replacement operation of the master-slave monitoring sentinel 200, it updates the proxy configuration file in the cache egress proxy 100 and modifies the service address used for cache data access in the proxy configuration. A failure of a master cache instance therefore does not affect reads and writes of cached data. When a master cache instance fails, the selected slave cache instance automatically replaces it, which makes the cache instances highly available and improves the risk resistance of Redis.
The following are device embodiments of the present disclosure, which can be used to carry out the fault handling method embodiments described above. For details not disclosed in the device embodiments, refer to the fault handling method embodiments of the present disclosure.
Fig. 7 is a block diagram of a fault handling device for a distributed cache according to an exemplary embodiment. As shown in Fig. 7, the device includes, but is not limited to, a failure monitoring module 110, a fault processing module 120 and a proxy configuration modification module 130.
The failure monitoring module 110 is configured to monitor the master cache instances running in the distributed cache and identify a failed master cache instance.
The fault processing module 120 is configured to make a slave cache instance of the master cache instance replace the failed master cache instance and to update the master-slave state between the slave cache instance and the failed master cache instance accordingly.
The proxy configuration modification module 130 is configured to listen for the update of the master-slave state and to modify, according to that update, the service address used for cache data access in the proxy configuration.
For the functions of the modules in the above device and how they are realized, see the corresponding steps in the fault handling method described above. They are not repeated here.
Optionally, as shown in Fig. 8, the failure monitoring module 110 includes but is not limited to: an information request sending submodule 111 and a fault identification submodule 112.
The information request sending submodule 111 is configured to send an information request to the master cache instance at a preset time interval.
The fault identification submodule 112 is configured to receive the reply of the master cache instance to the information request and to identify a failed master cache instance according to the reply.
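The interplay of the two submodules can be sketched as a probing loop. This is an illustration only: `send_info_request` is a hypothetical callable that returns the reply text, or `None` when no reply arrives, and the rule that a missing reply marks the instance as failed is an assumption of the sketch.

```python
import time


def find_failed_masters(masters, send_info_request, rounds=1, interval_s=0.0):
    """Probe each master once per interval and collect those whose latest
    probe went unanswered.

    send_info_request(address) is a hypothetical callable that returns the
    reply text, or None when no reply arrives in time.
    """
    failed = set()
    for round_index in range(rounds):
        for address in masters:
            reply = send_info_request(address)
            if reply is None:
                failed.add(address)
            else:
                # A later successful reply clears an earlier missed one.
                failed.discard(address)
        if round_index < rounds - 1:
            # Wait for the preset time interval before the next round.
            time.sleep(interval_s)
    return failed


# Simulated replies: the second master does not answer.
replies = {"10.0.0.1:6379": "role:master", "10.0.0.2:6379": None}
failed = find_failed_masters(list(replies), replies.get)
print(failed)
```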
Optionally, as shown in Fig. 9, the fault processing module 120 includes but is not limited to: a slave cache acquisition submodule 121, a slave cache selection submodule 122, and a master-slave update submodule 123.
The slave cache acquisition submodule 121 is configured to obtain the slave cache instance information set for the failed master cache instance.
The slave cache selection submodule 122 is configured to select a slave cache instance according to the slave cache instance information.
The master-slave update submodule 123 is configured to replace the failed master cache instance with the selected slave cache instance, and to correspondingly update the master-slave relationship between the selected slave cache instance and the failed master cache instance.
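The master-slave update can be sketched as a transformation of a role table. The table layout, the function name `promote_slave`, and the choice to re-register the failed master as a slave of the promoted instance are assumptions made for illustration; the patent only states that the relationship between the two instances is updated.

```python
def promote_slave(roles, failed_master, chosen_slave):
    """Return a new role table after the chosen slave replaces the failed master.

    roles maps an instance address to {"role": "master"} or
    {"role": "slave", "master": <address>}; this layout is an assumption.
    """
    updated = {addr: dict(info) for addr, info in roles.items()}
    updated[chosen_slave] = {"role": "master"}
    # Assumption: the failed master is recorded as a slave of the promoted
    # instance, so it would rejoin in the subordinate role if it recovers.
    updated[failed_master] = {"role": "slave", "master": chosen_slave}
    # Remaining slaves of the failed master now follow the new master.
    for addr, info in updated.items():
        if info.get("master") == failed_master and addr != chosen_slave:
            info["master"] = chosen_slave
    return updated


roles = {
    "m": {"role": "master"},
    "s1": {"role": "slave", "master": "m"},
    "s2": {"role": "slave", "master": "m"},
}
new_roles = promote_slave(roles, "m", "s1")
print(new_roles)
```

The function returns a copy instead of mutating its input, so a caller could compare the old and new tables to decide whether the proxy configuration needs to change.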
Optionally, as shown in Fig. 10, the slave cache selection submodule 122 includes but is not limited to: a slave cache acquiring unit 1221, a request sending unit 1222, an abnormality exclusion unit 1223, and a selection unit 1224.
The slave cache acquiring unit 1221 is configured to determine the slave cache instances of the failed master cache instance according to the slave cache instance information.
The request sending unit 1222 is configured to send an information request to the slave cache instances.
The abnormality exclusion unit 1223 is configured to receive the replies of the slave cache instances to the information request and, according to the replies, exclude the abnormal slave cache instances, forming a set of normal slave cache instances.
The selection unit 1224 is configured to select a slave cache instance from the set of normal slave cache instances.
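The exclusion of abnormal slave cache instances can be sketched as a filter over the replies. The sketch assumes that a healthy slave answers with text containing `role:slave` and that a missing reply or any other role counts as abnormal; `send_info_request` is a hypothetical callable and the addresses are made up.

```python
def normal_slaves(slaves, send_info_request):
    """Keep the slaves that answer the information request with a usable reply.

    send_info_request(address) is a hypothetical callable that returns the
    reply text, or None when no reply arrives.
    """
    normal = []
    for address in slaves:
        reply = send_info_request(address)
        # Assumption: a missing reply, or one that does not report the
        # slave role, marks the instance as abnormal.
        if reply is not None and "role:slave" in reply:
            normal.append(address)
    return normal


replies = {
    "10.0.0.2:6379": "role:slave",
    "10.0.0.3:6379": None,
    "10.0.0.4:6379": "role:master",
}
normal = normal_slaves(list(replies), replies.get)
print(normal)
```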
Optionally, as shown in Fig. 11, the selection unit 1224 includes but is not limited to: a priority acquiring subunit 12241 and a slave cache selection subunit 12242.
The priority acquiring subunit 12241 is configured to obtain the priority corresponding to each slave cache instance in the set of normal slave cache instances.
The slave cache selection subunit 12242 is configured to select a slave cache instance from the set of normal slave cache instances according to the priority.
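Priority-based selection can be sketched in a few lines. The patent does not say how priorities are encoded, so this sketch assumes that a lower number means a higher priority and that ties are broken by address; the function name `select_by_priority` and the sample values are hypothetical.

```python
def select_by_priority(candidates, priorities):
    """Pick the slave with the best priority from the normal set.

    Assumes a lower number means a higher priority. Instances without a
    configured priority rank last, and ties are broken by address so the
    result is deterministic.
    """
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda addr: (priorities.get(addr, float("inf")), addr),
    )


priorities = {"10.0.0.2:6379": 100, "10.0.0.3:6379": 10}
chosen = select_by_priority(["10.0.0.2:6379", "10.0.0.3:6379"], priorities)
print(chosen)
```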
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (8)

1. A fault handling method for a distributed cache, characterized in that the method comprises:
monitoring the master cache instances running in the distributed cache to obtain a failed master cache instance;
replacing the failed master cache instance with a slave cache instance of the master cache instance, and correspondingly updating the master-slave relationship between the slave cache instance and the failed master cache instance;
monitoring the update of the master-slave relationship, and modifying, according to the update of the master-slave relationship, the service address used for cached data access in the proxy configuration;
wherein the step of monitoring the master cache instances running in the distributed cache to obtain a failed master cache instance comprises:
at a preset time interval, a plurality of master-slave monitoring sentinels each sending an information request to the master cache instance and receiving the reply of the master cache instance to the information request;
when a master-slave monitoring sentinel does not receive the reply of the master cache instance, inquiring whether the other master-slave monitoring sentinels have received the reply;
if the number of master-slave monitoring sentinels that have not received the reply reaches a preset number, confirming that the master cache instance has failed.
2. The method according to claim 1, characterized in that the step of replacing the failed master cache instance with a slave cache instance of the master cache instance, and correspondingly updating the master-slave relationship between the slave cache instance and the failed master cache instance, comprises:
obtaining the slave cache instance information preset for the failed master cache instance;
selecting a slave cache instance according to the slave cache instance information;
replacing the failed master cache instance with the selected slave cache instance, and correspondingly updating the master-slave relationship between the selected slave cache instance and the failed master cache instance.
3. The method according to claim 2, characterized in that the step of selecting a slave cache instance according to the slave cache instance information comprises:
determining, according to the slave cache instance information, the slave cache instances corresponding to the failed master cache instance;
sending an information request to the slave cache instances;
receiving the replies of the slave cache instances to the information request, and excluding, according to the replies, the abnormal slave cache instances from the slave cache instances to form a set of normal slave cache instances;
selecting a slave cache instance from the set of normal slave cache instances.
4. The method according to claim 3, characterized in that the step of selecting a slave cache instance from the set of normal slave cache instances comprises:
obtaining the priority corresponding to each slave cache instance in the set of normal slave cache instances;
selecting a slave cache instance from the set of normal slave cache instances according to the priority.
5. A fault handling apparatus for a distributed cache, characterized in that the apparatus comprises:
a failure monitoring module, configured to monitor the master cache instances running in the distributed cache and obtain a failed master cache instance;
a fault processing module, configured to replace the failed master cache instance with a slave cache instance of the master cache instance, and to correspondingly update the master-slave relationship between the slave cache instance and the failed master cache instance;
a proxy configuration modification module, configured to monitor the update of the master-slave relationship and to modify, according to the update of the master-slave relationship, the service address used for cached data access in the proxy configuration;
wherein the failure monitoring module comprises:
a request sending submodule, configured to cause a plurality of master-slave monitoring sentinels, at a preset time interval, each to send an information request to the master cache instance and to receive the reply of the master cache instance to the information request;
a reply inquiry subunit, configured to inquire, when a master-slave monitoring sentinel does not receive the reply of the master cache instance, whether the other master-slave monitoring sentinels have received the reply;
a fault identification submodule, configured to confirm that the master cache instance has failed when the number of master-slave monitoring sentinels that have not received the reply reaches a preset number.
6. The apparatus according to claim 5, characterized in that the fault processing module comprises:
a slave cache acquisition submodule, configured to obtain the slave cache instance information preset for the failed master cache instance;
a slave cache selection submodule, configured to select a slave cache instance according to the slave cache instance information;
a master-slave update submodule, configured to replace the failed master cache instance with the selected slave cache instance, and to correspondingly update the master-slave relationship between the selected slave cache instance and the failed master cache instance.
7. The apparatus according to claim 6, characterized in that the slave cache selection submodule comprises:
a slave cache acquiring unit, configured to determine, according to the slave cache instance information, the slave cache instances corresponding to the failed master cache instance;
a request sending unit, configured to send an information request to the slave cache instances;
an abnormality exclusion unit, configured to receive the replies of the slave cache instances to the information request, and to exclude, according to the replies, the abnormal slave cache instances from the slave cache instances to form a set of normal slave cache instances;
a selection unit, configured to select a slave cache instance from the set of normal slave cache instances.
8. The apparatus according to claim 7, characterized in that the selection unit comprises:
a priority acquiring subunit, configured to obtain the priority corresponding to each slave cache instance in the set of normal slave cache instances;
a slave cache selection subunit, configured to select a slave cache instance from the set of normal slave cache instances according to the priority.
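As an illustration of the failure confirmation recited in claims 1 and 5, the following sketch counts how many master-slave monitoring sentinels missed the reply and compares that count with the preset number. The function name `master_failed`, the sentinel names, and the boolean bookkeeping are assumptions made for the example and are not part of the claims.

```python
def master_failed(replies_by_sentinel, preset_quantity):
    """Confirm failure when enough sentinels missed the master's reply.

    replies_by_sentinel maps a sentinel name to True if that sentinel
    received the reply and False otherwise.
    """
    missed = sum(1 for received in replies_by_sentinel.values() if not received)
    return missed >= preset_quantity


# Two of three sentinels did not receive a reply from the master.
observations = {"sentinel-a": False, "sentinel-b": False, "sentinel-c": True}
confirmed_with_two = master_failed(observations, 2)
confirmed_with_three = master_failed(observations, 3)
print(confirmed_with_two, confirmed_with_three)
```

Requiring agreement from several sentinels means that a network problem affecting a single sentinel does not by itself trigger a master-slave replacement.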
CN201610529541.2A 2016-07-07 2016-07-07 The fault handling method and device of distributed caching Active CN106209447B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610529541.2A CN106209447B (en) 2016-07-07 2016-07-07 The fault handling method and device of distributed caching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610529541.2A CN106209447B (en) 2016-07-07 2016-07-07 The fault handling method and device of distributed caching

Publications (2)

Publication Number Publication Date
CN106209447A CN106209447A (en) 2016-12-07
CN106209447B true CN106209447B (en) 2019-11-15

Family

ID=57465636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610529541.2A Active CN106209447B (en) 2016-07-07 2016-07-07 The fault handling method and device of distributed caching

Country Status (1)

Country Link
CN (1) CN106209447B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108234170B (en) * 2016-12-15 2021-06-22 北京神州泰岳软件股份有限公司 Monitoring method and device for server cluster
CN109714430A (en) * 2019-01-16 2019-05-03 深圳壹账通智能科技有限公司 Distributed caching method, device, computer system and storage medium
CN109831521B (en) * 2019-03-11 2021-08-31 深圳市珍爱捷云信息技术有限公司 Cache instance management method and device, computer equipment and storage medium
CN110489255B (en) * 2019-07-19 2023-01-06 苏州浪潮智能科技有限公司 Method and system for optimizing read error processing flow in solid state disk
CN112463514A (en) * 2019-09-06 2021-03-09 北京京东尚科信息技术有限公司 Monitoring method and device for distributed cache cluster
CN110674192A (en) * 2019-10-09 2020-01-10 浪潮云信息技术有限公司 Redis high-availability VIP (very important person) drifting method, terminal and storage medium
CN110737732A (en) * 2019-10-25 2020-01-31 广西交通科学研究院有限公司 electromechanical equipment fault early warning method
CN111831489A (en) * 2020-06-23 2020-10-27 新浪网技术(中国)有限公司 Sentinel mechanism-based MySQL fault switching method and device
CN111884847B (en) * 2020-07-20 2022-06-28 北京百度网讯科技有限公司 Method and device for processing fault
CN112100005B (en) * 2020-08-20 2022-11-25 紫光云(南京)数字技术有限公司 Redis copy set implementation method and device
CN112702209A (en) * 2020-12-28 2021-04-23 紫光云技术有限公司 Method for monitoring sentinel with mysql high-availability architecture
CN112866035A (en) * 2021-02-24 2021-05-28 紫光云技术有限公司 Method for switching specified slave node into master node of redis service on cloud platform
CN114138732A (en) * 2021-09-29 2022-03-04 聚好看科技股份有限公司 Data processing method and device
CN114785713B (en) * 2022-03-31 2024-02-23 度小满科技(北京)有限公司 Method and proxy middleware for realizing high availability of Redis cluster
CN115150470B (en) * 2022-09-06 2022-11-25 百融至信(北京)科技有限公司 Cache data processing method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101562543A (en) * 2009-05-25 2009-10-21 阿里巴巴集团控股有限公司 Cache data processing method and processing system and device thereof
CN102103544A (en) * 2009-12-16 2011-06-22 腾讯科技(深圳)有限公司 Method and device for realizing distributed cache
CN103207841A (en) * 2013-03-06 2013-07-17 青岛海信传媒网络技术有限公司 Method and device for data reading and writing on basis of key-value buffer
CN104778102A (en) * 2015-03-27 2015-07-15 深圳市创梦天地科技有限公司 Master-slave switching method and system

Also Published As

Publication number Publication date
CN106209447A (en) 2016-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant