CN108628717A - Database system and monitoring method - Google Patents

Publication number
CN108628717A
Authority
CN
China
Prior art keywords
node
monitored
operating status
monitoring
redis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810174967.XA
Other languages
Chinese (zh)
Inventor
狄仁杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chen Sen Century Polytron Technologies Inc
Original Assignee
Beijing Chen Sen Century Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chen Sen Century Polytron Technologies Inc
Priority to CN201810174967.XA
Publication of CN108628717A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/142 Reconfiguring to eliminate the error
    • G06F11/143 Reconfiguring to eliminate the error with loss of software functionality
    • G06F11/1438 Restarting or rejuvenating
    • G06F11/1479 Generic software techniques for error detection or fault masking

Abstract

This application discloses a database system comprising: a monitored node group, configured to provide service externally through monitored nodes that have a master-slave backup relationship, and to accept operating-status monitoring and switchover control from a monitoring node; and a monitoring node, configured to obtain the operating status of the monitored nodes in the monitored node group and, based on that operating status, to control switchover of the monitored nodes. This provides a more reliable, highly available Redis database system and solves the problem of slow reads caused by secondary positioning.

Description

Database system and monitoring method
Technical field
This application relates to the field of computer technology, and in particular to a database system. It also relates to a monitoring method and a monitoring device for a database system.
Background
As Internet business models and services develop, Internet companies pay increasing attention to website performance. To optimize performance and improve response speed, websites generally cache high-concurrency hot data in memory instead of reading it directly from the back-end database. In an e-commerce promotion, for example, the campaign landing pages are typically promoted at a particular point in time, and the QPS (queries per second) of those pages is often extremely high; most requests for identical information are therefore served from the cache to relieve pressure on the database and support highly concurrent access. QPS is a metric used on the Internet to measure a server, namely the number of requests the server can handle per second. Redis is a commonly used website caching tool. It is an open-source, log-type key-value database written in ANSI C that supports networking and can run in memory or with persistence. Websites often deploy Redis as a cluster (Redis Cluster) to ensure high availability, so that the service is provided continuously and remains usable.
The existing official Redis Cluster scheme is decentralized: nodes exchange messages with one another to maintain a consistent cluster state. Fig. 3 is a schematic diagram of the prior-art official Redis cluster architecture, in which each circle represents a node running Redis. The scheme distributes data using hash slots (slots assigned by a hash algorithm). A Redis cluster is allocated 16384 slots by default (numbered 0 to 16383), and these slots are assigned to different Redis nodes according to the configuration. All Redis nodes are interconnected and internally use a binary protocol to optimize transmission speed and bandwidth. A client only needs to connect to any available node in the cluster. When data is stored, the node whose hash slot range should hold the key is determined from the key by the following algorithm: CRC16(key) % 16384. CRC16 in this formula is an error-detecting check code used in data communication to detect transmission errors: a polynomial computation is performed on the data and the result is appended to the data packet, and the receiving device performs the same kind of computation to verify the correctness and integrity of the transmitted data.
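The slot formula above can be illustrated with a minimal Python sketch. The patent text does not say which CRC16 variant is meant; the sketch assumes the XMODEM variant (polynomial 0x1021, initial value 0) that the Redis Cluster specification describes, computed here with the standard-library function binascii.crc_hqx. The names crc16 and key_slot are illustrative only, and hash tags (the {...} key syntax) are deliberately ignored.

```python
import binascii

SLOT_COUNT = 16384  # slots are numbered 0..16383


def crc16(data: bytes) -> int:
    """CRC16, XMODEM variant (polynomial 0x1021, initial value 0).

    Assumption: this is the variant used for Redis Cluster key hashing.
    """
    return binascii.crc_hqx(data, 0)


def key_slot(key: str) -> int:
    """Return the hash slot for a key: CRC16(key) % 16384."""
    return crc16(key.encode("utf-8")) % SLOT_COUNT


if __name__ == "__main__":
    for key in ("foo", "user:1", "cart:42"):
        print(key, "->", key_slot(key))
```

Because the slot depends only on the key, a client connected to a node that does not own that slot has to be redirected, which is the secondary positioning cost discussed in this background section.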
The existing official Redis Cluster scheme has the following problem. Redis Cluster is treated as a whole, and a client operates by connecting to any one of its nodes. When the key requested by the client is not stored on the node it is connected to, that node returns redirection information pointing to the correct node, and a second lookup (secondary positioning) is needed to complete the operation, which makes reads slow.
Summary of the invention
This application provides a database system to solve the problem that, when the key a client requests is not stored on the node it is connected to, secondary positioning is needed to complete the operation and reads become slow.
This application further provides a monitoring method for a database system.
This application also provides a monitoring device for a database system.
The database system provided by this application comprises a monitored node group and a monitoring node, wherein:
the monitored node group is configured to provide service externally through monitored nodes that have a master-slave backup relationship, and to accept operating-status monitoring and switchover control from the monitoring node;
the monitoring node is configured to obtain the operating status of the monitored nodes in the monitored node group and, based on that operating status, to control switchover of the monitored nodes.
Optionally, the monitored node group consists of two monitored nodes that run Redis databases and have a master-slave backup relationship: one is the master node and the other is the slave node. Heartbeat detection and data synchronization are maintained between the master node and the slave node.
Optionally, the monitoring node includes a detection module configured to send a detection command to the monitored node group periodically at a set interval, receive the command execution result, and determine the operating status of the monitored node.
Optionally, the detection command is a write-data command or a read-data command.
Optionally, the detection module includes a status judgment submodule configured to perform the following processing:
if the write-data command returns successful execution, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal; or
if the read-data command returns successful execution and the value read is consistent with the fixed data previously written for status judgment, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal.
Optionally, the monitoring node includes a slave node failure processing module configured to handle abnormalities of the slave node, including:
if the operating status of the master node is normal and the operating status of the slave node is abnormal, terminating the slave node process and restarting it.
Optionally, terminating and restarting the slave node process includes the following processing:
executing the kill command to terminate the slave node process;
clearing the persisted data of the slave node;
starting the slave node process and specifying the master node it belongs to;
synchronizing the full data set of the master node to the slave node.
Optionally, the monitoring node includes a VIP management module configured to set and manage the virtual IP of the monitored node group, wherein the monitored node group provides service externally through the virtual IP.
Optionally, the monitoring node includes a switching module configured to perform master-slave switchover when the status of the master node is abnormal, including the following processing:
if the operating status of the master node is abnormal and the operating status of the slave node is normal, promoting the slave node to be the new master node, deleting the virtual IP from the former master node, and floating the virtual IP to the new master node.
Optionally, the switching module includes a node revival submodule configured to revive the abnormal former master node.
Optionally, reviving the abnormal former master node includes the following processing:
judging whether the Redis process of the former master node exists;
if so, executing kill to terminate the Redis process of the former master node and clearing the former master node's data; then restarting the Redis process on the former master node, setting it as the new slave node, and requesting data synchronization from the new master node.
Optionally, the number of monitored node groups is greater than 1.
Optionally, the monitored node is further configured to receive commands and return execution results, including:
receiving a detection command from the monitoring node and returning the command execution result; and/or
receiving a switching command from the monitoring node and performing the corresponding operation.
This application also provides a monitoring method for a database system, comprising:
sending a detection command to a monitored node group;
receiving the command execution result and determining, according to it, the operating status of the monitored nodes in the monitored node group;
controlling switchover of the monitored nodes based on the operating status.
Optionally, the monitored node group consists of two monitored nodes that run Redis databases and have a master-slave backup relationship: one is the master node and the other is the slave node. Heartbeat detection and data synchronization are maintained between the master node and the slave node.
Optionally, sending a detection command to the monitored node group includes the following processing:
sending the detection command to the monitored node group periodically at a set interval.
Optionally, the master node and the slave node are each configured with a management IP for connecting to the monitoring node; they receive the detection command and return the command execution result through the management IP.
Optionally, the detection command is a write-data command or a read-data command.
Optionally, determining the operating status of the monitored nodes in the monitored node group according to the command execution result includes:
if the write-data command returns successful execution, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal; or
if the read-data command returns successful execution and the value read is consistent with the fixed data previously written for status judgment, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal.
Optionally, controlling switchover of the monitored nodes based on the operating status includes the following processing:
if the operating status of the master node is normal and the operating status of the slave node is abnormal, terminating the slave node process and restarting it.
Optionally, terminating and restarting the slave node process includes the following processing:
executing the kill command to terminate the slave node process;
clearing the persisted data of the slave node;
starting the slave node process and specifying the master node it belongs to;
synchronizing the full data set of the master node to the slave node.
Optionally, the monitored node group provides service externally through a virtual IP.
Optionally, controlling switchover of the monitored nodes based on the operating status includes the following processing:
if the operating status of the master node is judged to be abnormal and the operating status of the slave node is normal, promoting the slave node to be the new master node, deleting the virtual IP from the former master node, and floating the virtual IP to the new master node.
Optionally, controlling switchover of the monitored nodes based on the operating status includes the following processing:
judging whether the Redis process of the former master node exists;
if so, executing kill to terminate the Redis process of the former master node and clearing the former master node's data; then restarting the Redis process on the former master node, setting it as the new slave node, and requesting full synchronization of the new master node's data to the new slave node.
Optionally, after the step of determining the operating status of the monitored nodes in the monitored node group according to the command execution result, the method includes the following processing:
if the monitored node is the master node, judging whether the VIP is present on the master node; if not, setting the VIP on the master node.
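The VIP check in the last optional step can be sketched as an idempotent operation. This is only a simulation under stated assumptions: the Node class, its addresses set (standing in for the IP addresses bound to a node's network card) and the function ensure_vip are hypothetical names introduced for illustration, and no real network configuration is performed.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a monitored Redis node."""
    name: str
    role: str                                          # "master" or "slave"
    addresses: Set[str] = field(default_factory=set)   # simulated bound IPs


def ensure_vip(node: Node, vip: str) -> bool:
    """Set the VIP on the master if it is missing.

    Returns True only when the VIP had to be added, so repeated calls
    on a correctly configured master change nothing.
    """
    if node.role != "master":
        return False            # the check applies to the master node only
    if vip in node.addresses:
        return False            # VIP already present, nothing to do
    node.addresses.add(vip)     # simulated "set the VIP on the master"
    return True
```

Running the check after every status determination keeps the VIP on the master even if it was lost for a reason other than a switchover.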
Optionally, the number of monitored node groups is greater than 1.
This application also provides a monitoring device for a database system, comprising:
a detection unit configured to send a detection command to a monitored node group;
a status determination unit configured to receive the command execution result and determine, according to it, the operating status of the monitored nodes in the monitored node group;
a switching unit configured to control switchover of the monitored nodes based on the operating status.
Compared with the prior art, this application has the following advantages:
The database system provided by this application comprises a monitored node group, configured to provide service externally through monitored nodes that have a master-slave backup relationship and to accept operating-status monitoring and switchover control from a monitoring node; and a monitoring node, configured to obtain the operating status of the monitored nodes in the monitored node group and, based on that operating status, to control switchover of the monitored nodes. Because the data is stored on each node without sharding, no secondary positioning occurs. When the master Redis node fails, the slave Redis node is automatically promoted to be the new master Redis node and the failed former master Redis node is revived, so that the whole cluster continues to provide service. This solves the problem of slow reads caused by secondary positioning and provides a more reliable, highly available Redis database system.
Brief description of the drawings
Fig. 1 is an architecture diagram of a database system provided by an embodiment of this application;
Fig. 2 is a structural diagram of the Redis high-availability system provided by an embodiment of this application;
Fig. 3 is a schematic diagram of the prior-art official Redis cluster architecture;
Fig. 4 is a processing flowchart of a monitoring method for a database system provided by an embodiment of this application;
Fig. 5 is a structural diagram of a Redis high-availability system in which the monitoring method for a database system provided by an embodiment of this application is deployed;
Fig. 6 is a schematic diagram of the master-slave switchover process of the Redis high-availability system provided by an embodiment of this application;
Fig. 7 is a schematic diagram of a monitoring device for a database system provided by an embodiment of this application.
Detailed description
Many specific details are set forth in the following description to provide a thorough understanding of this application. However, this application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; this application is therefore not limited by the specific implementations disclosed below.
This application provides a database system, and also relates to a monitoring method and device for a database system. Each is described in detail in the following embodiments.
One embodiment of this application provides a database system.
The database system provided by this embodiment is described below with reference to Fig. 1, Fig. 2 and Fig. 6. Fig. 1 is an architecture diagram of a database system provided by an embodiment of this application; Fig. 2 is a structural diagram of the Redis high-availability system provided by an embodiment of this application; Fig. 6 is a schematic diagram of the master-slave switchover process of the Redis high-availability system provided by an embodiment of this application.
The database system shown in Fig. 1 includes a monitored node group 101 and a monitoring node 102.
The monitored node group 101 is configured to provide service externally through monitored nodes that have a master-slave backup relationship, and to accept operating-status monitoring and switchover control from the monitoring node.
Websites often use Redis to cache high-concurrency hot data, instead of reading it directly from the back-end database, in order to improve response speed, so building a highly available Redis system that provides service continuously is very important. Redis is a key-value database that runs in memory and can also persist data.
The database system provided by this embodiment is a Redis high-availability system whose architecture is shown in Fig. 2. With this architecture, data does not need to be sharded when written, so there is no secondary positioning problem when reading.
Specifically, the monitored node group 101 consists of two monitored nodes that run Redis databases and have a master-slave backup relationship: one is the master node 101-1 and the other is the slave node 101-2. Heartbeat detection and data synchronization are maintained between them. The master node actually provides service externally, while the slave node serves as a backup. The two server nodes running Redis are also referred to as Redis nodes. The master-slave backup relationship means that the master Redis node and the slave Redis node are kept synchronized. In this embodiment, both nodes are configured with the same Redis environment. During its initialization phase, the slave Redis node connects to the master Redis node, and the full data set on the master Redis node is synchronized to the slave Redis node. After normal operation begins, incremental synchronization is performed: write operations on the master Redis node are synchronized to the slave Redis node, which keeps the databases of the master node and the slave node consistent.
In this embodiment, the monitored node group provides service externally through a virtual IP; that is, the client communicates and exchanges information with the master node of the monitored node group through the virtual IP. Specifically, even when there is only one monitored node group, at least three IP addresses need to be planned for it: two management IPs and one virtual IP used to serve client connections. The master node and the slave node of the monitored node group are each configured with a management IP for connecting to the monitoring node and are monitored through it, while the client communicates with the monitored node group over a socket connection to the virtual IP. A virtual IP (VIP, Virtual IP) is an IP address that is not tied to a particular computer (or server) or to a particular network interface card (NIC, Network Interface Card) in a computer (or server). Data packets are sent to the VIP address, but all data still passes through a real network interface. VIPs are mostly used for connection redundancy: for example, when a computer or NIC fails, the VIP address can be transferred to another available computer or NIC, which can still respond to connections. With the master node and the slave node providing two-machine disaster tolerance and the service exposed through the virtual IP, the risk of a single-machine failure is avoided: when one node fails, the external service can still be maintained, which improves the availability of the whole system.
In this embodiment, the monitored nodes in the monitored node group 101 also receive commands and return execution results, specifically including:
receiving a detection command from the monitoring node and returning the command execution result; and/or
receiving a switching command from the monitoring node and performing the corresponding operation.
The monitoring node 102 is configured to obtain the operating status of the monitored nodes in the monitored node group and, based on that operating status, to control switchover of the monitored nodes.
In this embodiment, the monitoring node 102 is a server node that runs a monitoring tool, and its role is to control the master-slave switchover of the monitored node group. Specifically, the monitoring tool is written in shell. When the master Redis node fails, the tool automatically promotes the slave Redis node to be the new master node and revives the failed former master Redis node, so that the whole database system continues to provide service and high availability is ensured. A monitoring tool written in shell can run directly on most servers and therefore has good compatibility.
It should be noted that, in an actual deployment, the monitoring tool can monitor multiple groups of master-slave Redis nodes at the same time; that is, the number of monitored node groups can be greater than 1. For example, two groups of Redis master and slave nodes are as follows:
Monitored node group A: master redis 10.0.0.2, slave redis 10.0.0.2, vip 10.0.0.5
Monitored node group B: master redis 10.0.0.11, slave redis 10.0.0.12, vip 10.0.0.15. Both groups of Redis nodes are connected to the monitoring node and monitored by the monitoring tool.
Preferably, the monitoring node includes a detection module configured to send a detection command to the monitored node group periodically at a set interval, receive the command execution result, and determine the operating status of the monitored node. For example, in this embodiment the monitoring node issues detection commands to the master node and the slave node through their respective management IPs to detect the status of each. The detection command is a write-data command or a read-data command. For example, every 5 seconds the monitoring node writes the data 123456 to a fixed key on the master node and on the slave node; alternatively, the value 123456 is stored in advance under a specific key and the value of that key is read every 5 seconds.
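The periodic detection described above can be sketched as a small polling loop. The patent states that the actual monitoring tool is written in shell; the Python below is only an illustration of the scheduling idea. The function name poll, its parameters, and the injectable sleep argument (which lets the loop be exercised without waiting) are assumptions made for this sketch, and probe stands for whatever write or read check is sent to a node's management IP.

```python
import time
from typing import Callable, Dict, List, Sequence


def poll(nodes: Sequence[str],
         probe: Callable[[str], bool],
         rounds: int,
         interval: float = 5.0,
         sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, bool]]:
    """Probe every node once per round, waiting `interval` seconds between rounds.

    `nodes` holds management IPs; `probe(ip)` returns True when the
    detection command succeeded on that node.
    """
    history: List[Dict[str, bool]] = []
    for i in range(rounds):
        history.append({node: probe(node) for node in nodes})
        if i < rounds - 1:
            sleep(interval)     # 5 seconds in the example above
    return history
```

A real monitor would loop indefinitely and act on each round's result; the bounded rounds parameter exists only to keep the sketch easy to run and check.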
In this embodiment, the detection module includes a status judgment submodule configured to judge, from the result returned by the write-data command or read-data command above, whether the operating status of each node is normal. Specifically, it performs the following processing:
if the write-data command returns successful execution, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal; or
if the read-data command returns successful execution and the value read is consistent with the fixed data previously written for status judgment, the operating status of the monitored node is judged to be normal; otherwise, it is judged to be abnormal.
For example, when status is judged from a write operation, 123456 is written to the master Redis node and to the slave Redis node under the key key0; if the command SET key0 123456 returns success, the operating status is considered normal. When status is judged from a read operation, 123456 has been written in advance to the master Redis node and the slave Redis node under the key key0; if the command GET key0 returns 123456, the operating status is considered normal.
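The two judgment rules above can be sketched as follows. The key key0 and the value 123456 come from the example in the text; everything else is an assumption for illustration. In particular, the client object is any object with set(key, value) and get(key) methods, and FakeNode is a hypothetical in-memory stand-in so that the sketch runs without a Redis server. Treating a raised exception (for example, an unreachable node) as an abnormal status is also an assumption.

```python
PROBE_KEY = "key0"       # fixed key from the example in the text
PROBE_VALUE = "123456"   # fixed data used for status judgment


def write_probe(client) -> str:
    """Normal if the write-data command reports success, otherwise abnormal."""
    try:
        ok = client.set(PROBE_KEY, PROBE_VALUE)
    except Exception:        # assumption: any failure counts as abnormal
        return "abnormal"
    return "normal" if ok else "abnormal"


def read_probe(client) -> str:
    """Normal only if the read succeeds and returns the pre-written value."""
    try:
        value = client.get(PROBE_KEY)
    except Exception:
        return "abnormal"
    return "normal" if value == PROBE_VALUE else "abnormal"


class FakeNode:
    """Hypothetical in-memory node used only to exercise the probes."""

    def __init__(self, alive: bool = True):
        self.alive = alive
        self.store = {}

    def set(self, key, value):
        if not self.alive:
            raise ConnectionError("node unreachable")
        self.store[key] = value
        return True

    def get(self, key):
        if not self.alive:
            raise ConnectionError("node unreachable")
        return self.store.get(key)
```

Note that the read probe reports abnormal on a node where the fixed value was never written, which is why the text requires the value to be stored in advance.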
In this embodiment, the monitoring node 102 further includes a switching module configured to perform master-slave switchover when the status of the master node is abnormal, including the following processing: if the operating status of the master node is abnormal and the operating status of the slave node is normal, the slave node is promoted to be the new master node, the virtual IP is deleted from the former master node, and the virtual IP is floated to the new master node. The switchover between the master node and the slave node in the Redis high-availability system provided by this embodiment is shown in Fig. 6. For example, the IP of the monitoring node is 192.168.0.1. The master Redis node (R1) and the slave Redis node (R2) each have the same Redis environment installed. The management IP through which the master Redis node communicates with the monitoring node is 192.168.0.2, and the management IP through which the slave Redis node communicates with the monitoring node is 192.168.0.3. The virtual IP (VIP) that provides service externally is 192.168.1.100, and the port is 6000. In the initial state, the monitoring node configures the VIP on the network card of the master Redis node, so the client connects to Redis through 192.168.1.100 and port 6000 to access the service. When the monitoring node detects that the status of master node R1 is abnormal, it promotes slave node R2 to be the new master node, unbinds the VIP from the former master node R1, and reconfigures it on the new master node R2, which continues to provide service externally.
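The switchover in this example can be traced with a small state simulation. The addresses are the ones given in the example; the Group class, its field names, and the failover function are hypothetical and only model which node currently holds the master role and the VIP. No Redis commands or network changes are issued.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical model of one monitored node group."""
    master: str        # management IP of the current master
    slave: str         # management IP of the current slave
    vip: str           # virtual IP that clients connect to
    vip_holder: str    # management IP of the node the VIP is bound to


def failover(group: Group, master_status: str, slave_status: str) -> bool:
    """Switch roles and move the VIP when only the master is abnormal.

    Returns True if a switchover took place.
    """
    if master_status == "abnormal" and slave_status == "normal":
        former_master = group.master
        group.master = group.slave       # promote the slave
        group.slave = former_master      # former master is revived later as slave
        group.vip_holder = group.master  # unbind from former master, bind to new one
        return True
    return False


if __name__ == "__main__":
    g = Group(master="192.168.0.2", slave="192.168.0.3",
              vip="192.168.1.100", vip_holder="192.168.0.2")
    failover(g, "abnormal", "normal")
    print(g)
```

Clients keep using 192.168.1.100 and port 6000 throughout; only the node behind the VIP changes.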
In this embodiment, the switching module includes a node revival submodule configured to revive the abnormal former master node; the revived former master node becomes the new slave node and connects to the new master node. For example, if the former master node is R1 and the former slave node is R2, then after one master-slave switchover R2 becomes the new master node and R1, once revived, becomes the new slave node. Reviving the abnormal former master node specifically includes the following processing:
judging whether the Redis process of the former master node exists;
if so, executing kill to terminate the Redis process of the former master node and clearing the former master node's data; then restarting the Redis process on the former master node, setting it as the new slave node, and requesting data synchronization from the new master node.
For example, after the Redis process of the new slave node has started, it sends the SYNC command (the synchronization command). On receiving the command, the master node executes the BGSAVE command to persist its data to disk and records the write commands executed in the meantime. When BGSAVE completes, the master node sends the snapshot file and the recorded write commands to the slave node. The slave node receives the snapshot file, loads the snapshot, and then executes the recorded write commands from the master node, which completes full synchronization. After full synchronization, data consistency between the new master node and the new slave node is maintained by incremental synchronization.
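The full synchronization sequence described above (snapshot first, then replay of the writes recorded while the snapshot was being taken) can be modeled in memory. This is a conceptual sketch, not Redis's replication implementation: the Master and Replica classes and the full_sync function are hypothetical, and begin_sync merely stands in for the SYNC and BGSAVE step.

```python
class Master:
    """Hypothetical master that can snapshot its data and record writes."""

    def __init__(self):
        self.data = {}
        self._buffer = None          # writes recorded while a sync is in progress

    def set(self, key, value):
        self.data[key] = value
        if self._buffer is not None:
            self._buffer.append((key, value))

    def begin_sync(self):
        """Take a snapshot and start recording subsequent writes."""
        self._buffer = []
        return dict(self.data)       # the snapshot is a point-in-time copy

    def end_sync(self):
        """Stop recording and hand over the recorded writes."""
        recorded, self._buffer = self._buffer, None
        return recorded


class Replica:
    """Hypothetical slave that loads a snapshot and replays writes."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)

    def apply(self, writes):
        for key, value in writes:
            self.data[key] = value


def full_sync(master, replica, writes_during_snapshot=()):
    """Snapshot, accept concurrent writes, then replay them on the replica."""
    snapshot = master.begin_sync()
    for key, value in writes_during_snapshot:   # writes arriving mid-snapshot
        master.set(key, value)
    recorded = master.end_sync()
    replica.load_snapshot(snapshot)
    replica.apply(recorded)
```

Replaying the recorded writes after loading the snapshot is what makes the slave match the master even though the master kept accepting writes during the snapshot.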
In this embodiment, the monitoring node 102 further includes a slave node failure processing module configured to handle abnormalities of the slave node; that is, when the operating status of the master node is normal and the operating status of the slave node is abnormal, the slave node process is terminated and restarted. The revival operation specifically includes the following processing:
executing the kill command to terminate the slave node process;
clearing the persisted data of the slave node;
starting the slave node process and specifying the master node it belongs to;
synchronizing the full data set of the master node to the slave node.
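Taken together, the handling rules in this embodiment amount to a small decision table over the two operating statuses. The sketch below encodes it, along with the four slave restart steps listed above. The Action enum, the decide function and the step labels are hypothetical names; the text does not say what happens when both nodes are abnormal, so the ALERT outcome for that case is purely an assumption of this sketch.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                    # both nodes normal
    RESTART_SLAVE = "restart_slave"  # master normal, slave abnormal
    FAILOVER = "failover"            # master abnormal, slave normal
    ALERT = "alert"                  # assumption: case not covered by the text


# Ordered steps for restarting an abnormal slave, as listed in the text.
SLAVE_RESTART_STEPS = (
    "kill the slave node process",
    "clear the slave node's persisted data",
    "start the slave node process and specify its master node",
    "synchronize the master node's full data set to the slave node",
)


def decide(master_status: str, slave_status: str) -> Action:
    """Map the two operating statuses ("normal"/"abnormal") to an action."""
    master_ok = master_status == "normal"
    slave_ok = slave_status == "normal"
    if master_ok and slave_ok:
        return Action.NONE
    if master_ok:
        return Action.RESTART_SLAVE
    if slave_ok:
        return Action.FAILOVER
    return Action.ALERT
```

Keeping the decision separate from the commands that carry it out makes the rules easy to check before wiring them to real kill, restart and VIP operations.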
Database Systems provided by the embodiments of the present application detect the operating status of each Redis nodes, as main Redis automatically It when node breaks down, automatically switches to from Redis nodes, continues to provide service to client, avoid the event of single machine Redis Hinder risk, ensure that the high availability of system.
Based on the database system embodiment above, the present application also provides an embodiment of a monitoring method for a database system.
The monitoring method for a database system provided by this embodiment is described below with reference to Figs. 4 to 6. Fig. 4 is a processing flowchart of the monitoring method; Fig. 5 is a structural diagram of a Redis high-availability system in which the monitoring method is deployed; Fig. 6 is a schematic diagram of the master-slave switching process in the Redis high-availability system.
Because this embodiment builds on the embodiment above, it is described only briefly; for related details, refer to the corresponding description of that embodiment.
The monitoring method for a database system shown in Fig. 4 includes steps S401 to S403.
Step S401: send a detection command to the monitored node group.
To improve response speed, websites usually cache highly concurrent hot data rather than reading it directly from the back-end database. Redis, an in-memory key-value database that also supports persistence, is a commonly used caching tool, so building a highly available Redis system that provides service continuously is very important. For the deployment architecture of the monitoring method provided by this embodiment, see Fig. 2; it includes a monitored node group and a monitoring node. The monitored node group consists of two monitored nodes that run the Redis database and have a master-slave backup relationship: one is the master node and the other is the slave node. Heartbeat detection and data synchronization are maintained between them; the master node is the one that actually serves external requests, and the slave node acts as its backup. The two server nodes running Redis are also called Redis nodes. The monitoring node is a third server node, which runs the monitoring tool. The tool is written in shell, so it runs directly on most servers and has good compatibility. When the master Redis node fails, the slave Redis node is automatically promoted to be the new master node and the failed former master Redis node is revived, so the database system as a whole keeps providing service and remains highly available. In this mode, data does not need to be sharded when written, so reads require no second lookup to locate data. Note that in an actual deployment the monitoring tool can monitor several groups of master-slave Redis nodes at once; that is, the number of monitored node groups can be greater than 1.
The master-slave backup relationship means that the master Redis node and the slave Redis node are kept synchronized. For example, both are configured with the same Redis environment; during the slave Redis node's initialization phase, the data on the master Redis node is fully synchronized to the slave Redis node, and once normal operation begins, incremental synchronization takes over: write operations on the master Redis node are propagated to the slave Redis node, keeping the two nodes' data in sync. In this embodiment, full synchronization means that the slave node, after connecting to the master node, copies all of the master node's data by means of the SYNC command. Incremental synchronization means that each time the master node executes a write command, it also sends the same write command to the slave node, which receives and executes it. Every database operation on the master node is thus automatically applied to the slave node's database, keeping the data in the two databases consistent.
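The two synchronization phases described above can be illustrated with a toy in-memory model. This is only a sketch: `ToyMaster` and `ToyReplica` are hypothetical classes invented for illustration, they do not talk to Redis, and the model is single-threaded, so it ignores writes that arrive while a snapshot is being taken.

```python
class ToyReplica:
    """In-memory stand-in for a slave node (hypothetical, not a Redis client)."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        # Full synchronization: replace local contents with the master's copy.
        self.data = snapshot

    def apply_write(self, key, value):
        # Incremental synchronization: replay one write sent by the master.
        self.data[key] = value


class ToyMaster:
    """In-memory stand-in for a master node (hypothetical, not a Redis client)."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def full_sync(self, replica):
        # Hand the replica a copy of all current data, then keep it attached
        # so that later writes are forwarded to it.
        replica.load_snapshot(dict(self.data))
        self.replicas.append(replica)

    def set(self, key, value):
        # Apply the write locally, then send the same write to every replica.
        self.data[key] = value
        for replica in self.replicas:
            replica.apply_write(key, value)


master = ToyMaster()
master.set("a", "1")        # written before the replica connects
replica = ToyReplica()
master.full_sync(replica)   # replica receives everything written so far
master.set("b", "2")        # forwarded as an incremental write
```

After these calls both objects hold the same two keys, which is the consistency property the paragraph above describes.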
In this embodiment, the monitored node group serves external requests through a virtual IP; that is, clients communicate and exchange information with the master node of the monitored node group via the virtual IP. Specifically, monitoring one monitored node group requires planning at least three IP addresses: two management IPs and one virtual IP through which clients connect to the service. The master node and the slave node of the monitored node group are each configured with a management IP for the monitoring connection; they receive the detection command and return the command execution result through that management IP. The monitoring node uses these management IPs to communicate with the monitored nodes and issue commands, whereas clients communicate with the monitored node group over socket connections to the virtual IP. A virtual IP (VIP, Virtual IP) is an IP address that is not tied to a particular computer (or server) or to a particular network interface card (NIC, Network Interface Card) in a computer (or server). Packets are sent to the VIP address, but all data still travels through a real network interface. VIPs are mostly used for connection redundancy: for example, when a computer or NIC fails, the VIP address is handed over to another available computer or NIC, which can still respond to connections. With the master node and slave node providing two-machine disaster tolerance and the service exposed through a virtual IP, the single-machine failure risk is avoided: when one node fails, the service remains available externally, which improves the availability of the whole system.
The purpose of this step is to send a detection command to each monitored node in the monitored node group. In this embodiment, the monitoring node issues the detection command to the master node and to the slave node through their respective management IPs in order to detect the state of each. Preferably, the detection command is sent to the monitored node group periodically, at a set interval. The detection command is either a write-data command or a read-data command. For example, every 5 seconds the value 123456 is written to one fixed key on the master node and on the slave node; alternatively, the value 123456 is stored in advance and the value of that key is read every 5 seconds.
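As a rough illustration of one detection round, the sketch below encodes the write-type detection command in the Redis wire format (RESP, an array of bulk strings) and hands it to each management IP. The function names, the `send` callable, and the key name `key0` used as a default are assumptions made for this example; the patent's tool is written in shell and its actual transport is not shown in the text.

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    chunks = [b"*%d\r\n" % len(parts)]
    for part in parts:
        raw = str(part).encode("utf-8")
        chunks.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(chunks)


def probe_round(management_ips, send, key="key0", value="123456"):
    """Send one write-type detection command to every management IP.

    `send(ip, payload)` is a caller-supplied transport (hypothetical here)
    that returns the raw reply bytes. A node whose transport raises OSError
    is recorded as None so the caller can treat it as abnormal.
    """
    payload = encode_command("SET", key, value)
    results = {}
    for ip in management_ips:
        try:
            results[ip] = send(ip, payload)
        except OSError:
            results[ip] = None
    return results


def fake_send(ip, payload):
    # Stand-in transport for the demo: the second node refuses connections.
    if ip == "192.168.0.3":
        raise ConnectionRefusedError(ip)
    return b"+OK\r\n"


results = probe_round(["192.168.0.2", "192.168.0.3"], fake_send)
```

In a real monitor, `probe_round` would be called in a loop with a sleep of the configured interval (5 seconds in the example above) between rounds.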
Step S402: receive the command execution result, and determine the running state of the monitored nodes in the monitored node group according to the command execution result.
The purpose of this step is to determine the running state of each monitored node in the monitored node group.
In this embodiment, the running state of a monitored node is determined from the execution result of the detection command issued in step S401. The state is judged as follows:
If the write-data command returns success, the running state of the monitored node is judged to be normal; otherwise, the running state of the monitored node is abnormal. Alternatively,
If the read-data command returns success and the value read matches the fixed data previously written for state judgment, the running state of the monitored node is judged to be normal; otherwise, the running state of the monitored node is abnormal.
For example, to judge the state by a write operation, 123456 is written to the master Redis node and to the slave Redis node under the key key0; if the command SET key0 123456 returns success, the running state is considered normal. To judge the state by a read operation, 123456 has been written in advance to the master Redis node and the slave Redis node under the key key0; if the command GET key0 returns 123456, the running state is considered normal.
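The two judgment rules can be sketched as small functions over the raw reply bytes. This is an illustration under assumptions: the function names are invented here, a missing reply (for example a timeout) is represented as `None`, and the expected replies follow the Redis wire format, where a successful SET answers `+OK` and GET answers with a length-prefixed bulk string.

```python
NORMAL = "normal"
ABNORMAL = "abnormal"


def judge_by_write(reply):
    """Write-type check: only the success reply to SET counts as normal."""
    return NORMAL if reply == b"+OK\r\n" else ABNORMAL


def judge_by_read(reply, expected="123456"):
    """Read-type check: the reply must carry exactly the value written earlier."""
    raw = expected.encode("utf-8")
    wanted = b"$%d\r\n%s\r\n" % (len(raw), raw)
    return NORMAL if reply == wanted else ABNORMAL


write_ok = judge_by_write(b"+OK\r\n")           # SET succeeded
write_timeout = judge_by_write(None)            # no reply at all
read_ok = judge_by_read(b"$6\r\n123456\r\n")    # GET returned 123456
read_missing = judge_by_read(b"$-1\r\n")        # key does not exist
```

Any reply other than the expected one, including an error reply or no reply, maps to the abnormal state, matching the "otherwise" branch of both rules.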
In addition, after determining the running state of the monitored nodes in the monitored node group according to the command execution result, the monitoring node also performs the following processing:
If the monitored node is the master node, check whether the VIP is present on the master node; if it is not, set the VIP on the master node.
In this embodiment, the monitoring node performs the above operations through the monitoring tool it runs, which is written in shell.
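The VIP check above amounts to an idempotent "bind if missing" step. A minimal sketch, assuming the check and the binding are supplied as callables (`vip_present` and `bind_vip` are hypothetical names; on Linux they might wrap commands such as `ip addr show` and `ip addr add`, which the patent does not specify):

```python
def ensure_vip(role, vip_present, bind_vip):
    """Bind the VIP on the master node only when it is not already there."""
    if role != "master":
        return "skipped"          # the check applies to the master node only
    if vip_present():
        return "already bound"    # nothing to do
    bind_vip()
    return "bound"


# Demo with an in-memory stand-in for the node's address list.
addresses = set()
first = ensure_vip("master", lambda: "192.168.1.100" in addresses,
                   lambda: addresses.add("192.168.1.100"))
second = ensure_vip("master", lambda: "192.168.1.100" in addresses,
                    lambda: addresses.add("192.168.1.100"))
on_slave = ensure_vip("slave", lambda: False, lambda: addresses.add("unused"))
```

Running the check on every monitoring round is safe because the second and later calls find the VIP already present and change nothing.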
Step S403: based on the running state, control the monitored nodes to switch.
The purpose of this step is to control the master-slave switch of the monitored nodes, ensuring that the node serving external requests is one whose running state is normal.
In this embodiment, the master-slave switching process of the monitored nodes includes the following processing:
If the running state of the master node is judged to be abnormal and the running state of the slave node is normal, the slave node is promoted to be the new master node, the virtual IP is deleted from the former master node, and the virtual IP is floated to the new master node. For a schematic of the switching process between the master node and the slave node in the Redis high-availability system provided by this embodiment, see Fig. 6. For example, suppose the IP of the monitoring node is 192.168.0.1, and the master Redis node (called R1) and the slave Redis node (called R2) each have the same Redis environment installed. The management IP through which the master Redis node communicates with the monitoring node is 192.168.0.2, and that of the slave Redis node is 192.168.0.3. The virtual IP (VIP) that serves external requests is 192.168.1.100, and the port is 6000. In the initial state, the monitoring node configures the VIP on the network card of the master Redis node, so clients access the Redis service through 192.168.1.100 and port 6000. After the monitoring node detects that the state of master node R1 is abnormal, it promotes slave node R2 to be the new master node, unbinds the VIP from the former master node R1, and configures it on the new master node R2, which continues to serve external requests.
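The switching rule can be sketched as a pure decision function that returns the ordered actions for the R1/R2 example. The function name and the action strings are illustrative assumptions; in Redis the promotion step is commonly done with the `SLAVEOF NO ONE` command, but the patent text does not say which command its tool uses. Cases the text does not cover (for example both nodes abnormal) return no actions here.

```python
def plan_switch(master_state, slave_state,
                master="R1", slave="R2", vip="192.168.1.100"):
    """Return the ordered (node, action) pairs for a master-slave switch."""
    if master_state == "abnormal" and slave_state == "normal":
        return [
            (slave, "promote to master"),      # e.g. SLAVEOF NO ONE (assumption)
            (master, "remove VIP " + vip),
            (slave, "add VIP " + vip),
        ]
    # Master healthy, or no healthy slave to promote: nothing to switch.
    return []


failover = plan_switch("abnormal", "normal")
steady = plan_switch("normal", "normal")
both_down = plan_switch("abnormal", "abnormal")
```

Keeping the decision separate from the execution makes the rule easy to test without touching real nodes.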
Preferably, after the master-slave switch completes, a revival operation is performed on the former master node that became abnormal; once revived, it becomes the new slave node. Continuing the example above, the former master node is R1 and the former slave node is R2; after one master-slave switch, R2 becomes the new master node and R1, once revived, becomes the new slave node. The revival operation includes the following processing:
Check whether the Redis process of the former master node exists;
If so, execute kill to terminate the former master node's Redis process and clear the former master node's data; restart the Redis process on the former master node, set it as the new slave node, and request full synchronization of the new master node's data to the new slave node.
In this embodiment, once the Redis process on the new slave node has started, the slave sends a SYNC command (the synchronization command) to the master node. On receiving it, the master node executes BGSAVE to persist its data to disk as a snapshot and records the write commands executed in the meantime. When BGSAVE finishes, the master node sends the snapshot file and the recorded write commands to the slave node. The slave node receives the snapshot file, loads it, and then executes the recorded write commands from the master node, which completes full synchronization. From then on, the new master node and the new slave node keep their data consistent through incremental synchronization.
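The revival of the former master node can be outlined as an ordered list of steps. This sketch reflects one reading of the text, namely that only the kill step depends on whether the process still exists and the remaining steps always run; that reading, the function name, and the step wording are assumptions.

```python
def revive_former_master(process_exists):
    """Return the ordered revival steps for the failed former master node."""
    steps = []
    if process_exists:
        # Only needed when the old Redis process is still around.
        steps.append("kill former master Redis process")
    steps.extend([
        "clear former master data",
        "restart Redis process on former master",
        "configure it as slave of the new master",
        "request full synchronization from the new master",
    ])
    return steps


with_process = revive_former_master(True)
without_process = revive_former_master(False)
```

The last step corresponds to the SYNC and BGSAVE exchange described in the paragraph above.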
In addition, in this embodiment, the monitoring node uses the monitoring tool it runs to record running log information for each monitored node, such as memory, CPU, disk, and process status, as analysis data for operation and maintenance.
In this embodiment, the monitoring node, through the monitoring tool it runs, can also control virtual IP drift and perform recovery processing when the slave node is abnormal. For example, suppose an abnormal former master node has triggered a master-slave switch and has then been revived, and analysis with the monitoring tool and the monitored nodes' logs shows that the former master node should still continue as the master node that provides the service. At this point the revived former master node is attached to the new master node as the new slave node. After a series of status checks and maintenance operations performed on the former master node (the new slave node) through the monitoring node's monitoring tool confirm that the node's software environment and hardware state are back to normal, the former master node (the new slave node) is forcibly reset to be the new master node, the virtual IP is floated to it from the former slave node (the new master node), and the former slave node is reset to be the new slave node. Recovery processing for an abnormal slave node means that after the monitoring node detects that the slave node is abnormal, it performs the revival operation on the abnormal slave node as well: if the running state of the master node is normal and the running state of the slave node is abnormal, the slave node's process is terminated and then restarted. The revival operation includes the following processing:
Execute the kill command to terminate the slave node's process;
Clear the slave node's persisted data;
Start the slave node's process, specifying the master node it belongs to;
Fully synchronize the master node's data to the slave node.
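The four steps above might translate into commands along the following lines. This is a hedged sketch, not the patent's tool: the function name, the process ID, the data directory, and the choice of the master's address are all hypothetical parameters, and the file names `dump.rdb` and `appendonly.aof` are the Redis default persistence file names of that era, assumed here. The fourth step needs no command of its own, because a Redis process started with `--slaveof` requests synchronization from its master when it connects.

```python
def slave_revival_commands(pid, data_dir, master_ip, port=6000):
    """Build argv lists for reviving an abnormal slave node (illustrative)."""
    return [
        # 1. Terminate the slave node's Redis process.
        ["kill", str(pid)],
        # 2. Clear persisted data (default file names, an assumption).
        ["rm", "-f", f"{data_dir}/dump.rdb", f"{data_dir}/appendonly.aof"],
        # 3. Start the process again and name the master it belongs to.
        ["redis-server", "--port", str(port), "--dir", data_dir,
         "--slaveof", master_ip, str(port)],
    ]


commands = slave_revival_commands(4321, "/var/lib/redis", "192.168.0.2")
```

Each argv list could be handed to a process runner over the node's management connection; how the commands reach the node is outside what the text describes.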
Corresponding to the embodiment of the monitoring method for a database system provided by the present application, the present application also provides a monitoring device for a database system.
Fig. 7 shows a schematic diagram of a monitoring device for a database system provided by the present application. Because the device embodiment is substantially similar to the method embodiment, it is described only briefly; for related details, refer to the corresponding description of the method embodiment. The device embodiment described below is merely illustrative.
The present application provides a monitoring device for a database system, including:
A detection unit 701, configured to send a detection command to the monitored node group;
A state determination unit 702, configured to receive the command execution result and determine the running state of the monitored nodes in the monitored node group according to the command execution result;
A switching unit 703, configured to control the monitored nodes to switch based on the running state.
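One way to picture how the three units cooperate is a small class with one method per unit. This is only a structural sketch: the class name, the `probe` and `switch_action` callables, and the node labels are hypothetical, and the probe is reduced to a boolean for brevity.

```python
class MonitoringDevice:
    """Toy arrangement of units 701 to 703 as methods (illustrative only)."""

    def __init__(self, probe, switch_action):
        self.probe = probe                    # hypothetical: node name -> bool
        self.switch_action = switch_action    # hypothetical: performs the switch

    def detect(self, nodes):
        # Detection unit 701: send the detection command to every node.
        return {name: self.probe(name) for name in nodes}

    def determine(self, results):
        # State determination unit 702: map each result to a running state.
        return {name: "normal" if ok else "abnormal"
                for name, ok in results.items()}

    def switch(self, states):
        # Switching unit 703: switch only when master is down and slave is up.
        if states.get("master") == "abnormal" and states.get("slave") == "normal":
            self.switch_action()
            return True
        return False

    def run_once(self, nodes=("master", "slave")):
        return self.switch(self.determine(self.detect(nodes)))


events = []
device = MonitoringDevice(probe=lambda name: name == "slave",
                          switch_action=lambda: events.append("switched"))
switched = device.run_once()
```

In the demo the probe reports the master as failed and the slave as healthy, so one run triggers exactly one switch.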
Optionally, the monitored node group consists of two monitored nodes that run the Redis database and have a master-slave backup relationship: one is the master node and the other is the slave node, and heartbeat detection and data synchronization are maintained between the master node and the slave node.
Optionally, the detection unit 701 includes a periodic detection submodule, configured to send the detection command to the monitored node group periodically, at a set interval.
Optionally, the master node and the slave node are each configured with a management IP for the monitoring connection, and receive the detection command and return the command execution result through that management IP.
Optionally, the detection command is a write-data command or a read-data command.
Optionally, the state determination unit 702 includes a judgment subunit, configured to judge the running state of a monitored node as follows:
If the write-data command returns success, the running state of the monitored node is judged to be normal; otherwise, the running state of the monitored node is abnormal. Alternatively,
If the read-data command returns success and the value read matches the fixed data previously written for state judgment, the running state of the monitored node is judged to be normal; otherwise, the running state of the monitored node is abnormal.
Optionally, the switching unit 703 includes a slave-node failure handling subunit, configured to perform the following processing:
If the running state of the master node is normal and the running state of the slave node is abnormal, terminate the slave node's process and then restart it.
Optionally, the slave-node failure handling subunit specifically performs the following processing:
Execute the kill command to terminate the slave node's process;
Clear the slave node's persisted data;
Start the slave node's process, specifying the master node it belongs to;
Fully synchronize the master node's data to the slave node.
Optionally, the monitored node group serves external requests through a virtual IP.
Optionally, the switching unit 703 includes a slave-promotion subunit, configured to perform the following processing:
If the running state of the master node is judged to be abnormal and the running state of the slave node is normal, promote the slave node to be the new master node, delete the virtual IP from the former master node, and float the virtual IP to the new master node.
Optionally, the switching unit 703 includes a revival subunit, configured to perform the following processing:
Check whether the Redis process of the former master node exists;
If so, execute kill to terminate the former master node's Redis process and clear the former master node's data; restart the Redis process on the former master node, set it as the new slave node, and request full synchronization of the new master node's data to the new slave node.
Optionally, the switching unit 703 includes a virtual IP management subunit, configured to perform the following processing after the running state of the monitored nodes in the monitored node group has been determined according to the command execution result:
If the monitored node is the master node, check whether the VIP is present on the master node; if it is not, set the VIP on the master node.
Optionally, the number of monitored node groups is greater than 1.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent storage in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
Although the present application is disclosed above by way of preferred embodiments, they are not intended to limit it. Any person skilled in the art may make possible changes and modifications without departing from the spirit and scope of the present application; the scope of protection of the present application shall therefore be as defined by its claims.

Claims (10)

1. A database system, characterized by comprising: a monitored node group and a monitoring node; wherein,
The monitored node group is configured to provide service externally through monitored nodes having a master-slave backup relationship, and to accept state monitoring and switching control by the monitoring node;
The monitoring node is configured to obtain the running state of the monitored nodes in the monitored node group and, based on the running state, control the monitored nodes to switch.
2. The database system according to claim 1, characterized in that the monitored node group consists of two monitored nodes that run the Redis database and have a master-slave backup relationship, one being the master node and the other the slave node, with heartbeat detection and data synchronization maintained between the master node and the slave node.
3. The database system according to claim 2, characterized in that the monitoring node includes a detection module, configured to send a detection command to the monitored node group periodically at a set interval, receive the command execution result, and determine the running state of the monitored nodes.
4. The database system according to claim 3, characterized in that the detection command is a write-data command or a read-data command.
5. The database system according to claim 4, characterized in that the monitoring node includes a VIP management module, configured to set and manage the virtual IP of the monitored node group; wherein the monitored node group provides service externally through the virtual IP.
6. The database system according to claim 5, characterized in that the monitoring node includes a switching module, configured to perform a master-slave switch when the state of the master node is abnormal, including the following processing:
If the running state of the master node is abnormal and the running state of the slave node is normal, promote the slave node to be the new master node, delete the virtual IP from the former master node, and float the virtual IP to the new master node.
7. A monitoring method for a database system, characterized by comprising:
Sending a detection command to a monitored node group;
Receiving the command execution result, and determining the running state of the monitored nodes in the monitored node group according to the command execution result;
Based on the running state, controlling the monitored nodes to switch.
8. The method according to claim 7, characterized in that the monitored node group consists of two monitored nodes that run the Redis database and have a master-slave backup relationship, one being the master node and the other the slave node, with heartbeat detection and data synchronization maintained between the master node and the slave node.
9. The method according to claim 8, characterized in that the monitored node group provides service externally through a virtual IP.
10. The method according to claim 9, characterized in that controlling the monitored nodes to switch based on the running state includes the following processing:
If the running state of the master node is judged to be abnormal and the running state of the slave node is normal, promote the slave node to be the new master node, delete the virtual IP from the former master node, and float the virtual IP to the new master node.
CN201810174967.XA 2018-03-02 2018-03-02 A kind of Database Systems and monitoring method Pending CN108628717A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810174967.XA CN108628717A (en) 2018-03-02 2018-03-02 A kind of Database Systems and monitoring method

Publications (1)

Publication Number Publication Date
CN108628717A true CN108628717A (en) 2018-10-09

Family

ID=63706170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810174967.XA Pending CN108628717A (en) 2018-03-02 2018-03-02 A kind of Database Systems and monitoring method

Country Status (1)

Country Link
CN (1) CN108628717A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101060391A (en) * 2007-05-16 2007-10-24 华为技术有限公司 Master and spare server switching method and system and master server and spare server
CN102467508A (en) * 2010-11-04 2012-05-23 中兴通讯股份有限公司 Method for providing database service and database system
CN104579791A (en) * 2015-01-26 2015-04-29 浪潮电子信息产业股份有限公司 Method for achieving automatic K-DB main and standby disaster recovery cluster switching
CN104965850A (en) * 2015-04-29 2015-10-07 云南电网有限责任公司 Database high-available implementation method based on open source technology
CN106254100A (en) * 2016-07-27 2016-12-21 腾讯科技(深圳)有限公司 A kind of data disaster tolerance methods, devices and systems

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111106947A (en) * 2018-10-29 2020-05-05 北京金山云网络技术有限公司 Node downtime repairing method and device, electronic equipment and readable storage medium
CN109739877B (en) * 2018-11-21 2020-03-31 比亚迪股份有限公司 Database system and data management method
CN109739877A (en) * 2018-11-21 2019-05-10 比亚迪股份有限公司 Database Systems and data managing method
CN109669820A (en) * 2018-12-24 2019-04-23 广州君海网络科技有限公司 Task monitoring and managing method and device based on Kettle
CN109977161A (en) * 2019-03-28 2019-07-05 上海中通吉网络技术有限公司 The monitoring system of presto cluster
CN110361979A (en) * 2019-07-19 2019-10-22 北京交大思诺科技股份有限公司 A kind of safety computer platform in railway signal field
CN110611603A (en) * 2019-09-09 2019-12-24 苏州浪潮智能科技有限公司 Cluster network card monitoring method and device
CN110611603B (en) * 2019-09-09 2021-08-31 苏州浪潮智能科技有限公司 Cluster network card monitoring method and device
CN110674192A (en) * 2019-10-09 2020-01-10 浪潮云信息技术有限公司 Redis high-availability VIP (very important person) drifting method, terminal and storage medium
CN110971662A (en) * 2019-10-22 2020-04-07 烽火通信科技股份有限公司 Two-node high-availability implementation method and device based on Ceph
CN111444062A (en) * 2020-04-01 2020-07-24 山东汇贸电子口岸有限公司 Method and device for managing master node and slave node of cloud database
CN111444062B (en) * 2020-04-01 2023-09-19 山东汇贸电子口岸有限公司 Method and device for managing master node and slave node of cloud database
CN111901415A (en) * 2020-07-27 2020-11-06 星辰天合(北京)数据科技有限公司 Data processing method and system, computer readable storage medium and processor
CN111901415B (en) * 2020-07-27 2023-07-14 北京星辰天合科技股份有限公司 Data processing method and system, computer readable storage medium and processor
CN112019601A (en) * 2020-08-07 2020-12-01 烽火通信科技股份有限公司 Two-node implementation method and system based on distributed storage Ceph
CN112019601B (en) * 2020-08-07 2022-08-02 烽火通信科技股份有限公司 Two-node implementation method and system based on distributed storage Ceph
CN112131310A (en) * 2020-08-28 2020-12-25 北京思特奇信息技术股份有限公司 Database processing method and system, electronic device and storage medium
CN113778744A (en) * 2021-01-05 2021-12-10 北京沃东天骏信息技术有限公司 Task processing method, device, system and storage medium
CN114760224A (en) * 2021-12-24 2022-07-15 中国银联股份有限公司 System, method, apparatus, and storage medium for monitoring status of network channels

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181009
