CN111752758A

CN111752758A - InfluxDB high-availability system with a dual-master architecture

Info

Publication number: CN111752758A
Application number: CN202010616116.3A
Authority: CN
Inventors: 赵山; 王阳; 厉颖
Original assignee: Inspur Cloud Information Technology Co Ltd
Current assignee: Inspur Cloud Information Technology Co Ltd
Priority date: 2020-07-01
Filing date: 2020-07-01
Publication date: 2020-10-09
Anticipated expiration: 2040-07-01
Also published as: CN111752758B

Abstract

The invention discloses an InfluxDB high-availability system with a dual-master architecture, which belongs to the field of computer databases. The technical problem to be solved is how to ensure data consistency, avoid data loss, and perform failover seamlessly. The technical scheme is as follows: the system comprises an access module and a monitoring disaster recovery module, both of which are used in cooperation with two InfluxDB nodes. The access module executes each user write request on both InfluxDB nodes at the same time to keep the database nodes consistent in real time, and sends user read requests to the two InfluxDB nodes alternately to achieve load balancing and improve query performance. The monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that the data of a database node lags behind.

Description

InfluxDB high-availability system with a dual-master architecture

Technical Field

The invention relates to the field of computer databases, in particular to an InfluxDB high-availability system with a dual-master architecture.

Background

InfluxDB is currently one of the most widely used open-source time-series databases, but it has always lacked a mature open-source high-availability scheme. Existing schemes usually rely on periodic master-slave synchronization to keep a hot standby of the data, and switch to the slave node when the master node fails. Because writes to a time-series database are usually frequent, it is difficult to switch seamlessly when a failure occurs, and with periodic synchronization part of the data may be lost during the switch.

Therefore, how to ensure data consistency, avoid data loss, and perform failover seamlessly is a problem that remains to be solved in InfluxDB high-availability schemes.

Patent document CN110659158A discloses an InfluxDB data backup method, device, and apparatus based on a dual-machine hot standby environment, and a computer-readable storage medium, including: using Keepalived to periodically detect whether the master node of the InfluxDB has failed; if the master node has not failed, synchronizing the data generated by the master node during each preset time period to the slave node of the InfluxDB at that preset interval; if the master node has failed, promoting the slave node to be the current master node and recording the failure time of the master node in the current master node; and after the failed master node recovers, demoting it to be the current slave node and synchronizing the data generated between its last data synchronization and the failure time to the current master node. However, that technical scheme is based on a master-slave architecture in which the slave node reaches only eventual consistency with the master node through periodic synchronization. The data of the two nodes can differ between two synchronizations, so data can be lost during a switch. In addition, only one node of the cluster provides service at any given time, so load balancing is not possible.

Disclosure of Invention

The technical task of the invention is to provide an InfluxDB high-availability system with a dual-master architecture, so as to solve the problem of how to ensure data consistency, avoid data loss, and perform failover seamlessly.

The technical task of the invention is realized as follows: the InfluxDB high-availability system with a dual-master architecture comprises an access module and a monitoring disaster recovery module, wherein the access module and the monitoring disaster recovery module are used in cooperation with two InfluxDB nodes;

the access module is used for executing each user write request on both InfluxDB nodes at the same time to ensure real-time consistency of the database nodes, and for sending user read requests to the two InfluxDB nodes alternately to achieve load balancing and improve query performance;

the monitoring disaster recovery module is used for monitoring the availability of the back-end InfluxDB nodes, providing a state query interface to the access module, and automatically backfilling data when it finds that the data of a database node lags behind.

Preferably, the access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, so that a user connects to and accesses the database through an InfluxDB client or an HTTP client.

More preferably, the access protocol interface includes a GET /query interface and a POST /write interface.

More preferably, the specific processing logic of the GET /query interface is as follows:

(1) determine the InfluxDB database node A to be accessed this time, then execute step (2);

(2) query the monitoring disaster recovery module for whether node A is available, then execute step (3);

(3) judge whether node A is in the available state:

① if yes, execute step (4);

② if not, jump to step (7);

(4) send the request to node A and receive the response, then execute step (5);

(5) judge whether node B is in the available state (active state):

① if yes, execute step (6);

② if not, jump to step (10);

(6) set node B as the next access node, then execute step (10);

(7) judge whether node B is available:

① if yes, execute step (8);

② if not, execute step (10);

(8) mark node A as unavailable (inactive state), start a background asynchronous thread to update the state of node A, then execute step (9);

(9) set node B as the next access node, then jump to step (2);

(10) return the response to the client.
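The ten steps above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the class name `QueryRouter` and the callables `send`, `is_available`, and `on_node_down` are hypothetical stand-ins for forwarding the request to a node, for the state query interface of the monitoring disaster recovery module, and for the marking and background thread of step (8).

```python
from typing import Callable, Optional


class QueryRouter:
    """Alternates reads between nodes "A" and "B", following steps (1)-(10)."""

    def __init__(self,
                 send: Callable[[str, str], str],
                 is_available: Callable[[str], bool],
                 on_node_down: Callable[[str], None] = lambda node: None):
        self.send = send                  # hypothetical: forwards a query to a node
        self.is_available = is_available  # hypothetical: asks the monitoring module
        self.on_node_down = on_node_down  # hypothetical: hook for step (8)
        self.next_node = "A"

    @staticmethod
    def _peer(node: str) -> str:
        return "B" if node == "A" else "A"

    def query(self, q: str) -> Optional[str]:
        node = self.next_node                   # step (1)
        for _ in range(2):                      # there are only two nodes to try
            peer = self._peer(node)
            if self.is_available(node):         # steps (2)-(3)
                response = self.send(node, q)   # step (4)
                if self.is_available(peer):     # step (5)
                    self.next_node = peer       # step (6)
                return response                 # step (10)
            if not self.is_available(peer):     # step (7)
                return None                     # step (10): no node could answer
            self.on_node_down(node)             # step (8)
            self.next_node = node = peer        # step (9), then back to step (2)
        return None
```

With both nodes available, successive calls to `query` alternate between A and B. When one node is reported unavailable, all reads go to the other node until the availability check succeeds again. Returning `None` when neither node is available is a choice made for this sketch; the source only states that a response is returned to the client.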

Preferably, after the background asynchronous thread is started in step (8), it requests the monitoring disaster recovery module every 5 minutes to update the state of node A, and exits once the state of node A has been updated to available.
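A minimal sketch of such a background thread is shown below. The function name `start_recheck_thread` and the callables `refresh_status` and `mark_available` are hypothetical; the 300-second default corresponds to the 5-minute interval described above, and the optional `stop` event is an addition of this sketch so that the thread can be cancelled.

```python
import threading
from typing import Callable, Optional


def start_recheck_thread(node: str,
                         refresh_status: Callable[[str], bool],
                         mark_available: Callable[[str], None],
                         interval_seconds: float = 300.0,
                         stop: Optional[threading.Event] = None) -> threading.Thread:
    """Re-check `node` at a fixed interval and exit once it is available again."""
    stop = stop or threading.Event()

    def worker() -> None:
        # Event.wait returns False on timeout, so the loop body runs once per
        # interval until the stop event is set or the node has recovered.
        while not stop.wait(interval_seconds):
            if refresh_status(node):    # hypothetical call to the monitoring module
                mark_available(node)
                return

    thread = threading.Thread(target=worker, daemon=True, name="recheck-" + node)
    thread.start()
    return thread
```

Using `Event.wait` instead of `time.sleep` keeps the interval logic in one place and lets the caller stop the thread without waiting for a full interval.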

Preferably, the processing logic of the POST /write interface is as follows:

(I) receive a /write interface request;

(II) query all available InfluxDB nodes at the back end;

(III) send the request to every available database node at the same time;

(IV) judge whether the request succeeded on any node:

① if any node returns success, return a success response to the client;

② if the request failed on all nodes, return a request-failure response to the client.
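The write path above can be sketched as a concurrent fan-out that reports success if at least one node accepted the data. The function name `fan_out_write` and the `write` callable are hypothetical; in a real proxy `write` would forward the POST /write body to one InfluxDB node.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fan_out_write(nodes: Iterable[str],
                  payload: str,
                  write: Callable[[str, str], bool]) -> bool:
    """Send `payload` to all given nodes in parallel; True if any write succeeded."""
    targets = list(nodes)               # step (II): nodes reported as available
    if not targets:
        return False

    def attempt(node: str) -> bool:
        try:
            return bool(write(node, payload))   # hypothetical per-node write call
        except Exception:
            return False                # a failing node must not abort the others

    # Step (III): issue the writes concurrently, one worker per node.
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        results = list(pool.map(attempt, targets))

    return any(results)                 # step (IV)
```

Under this rule a write can succeed on one node and fail on the other, which leaves the failed node behind. In the scheme described here, that gap is expected to be closed by the data backfill of the monitoring disaster recovery module.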

Preferably, two sets of the monitoring disaster recovery module are provided, each taking one node as its master node and the other node as its replica node, so that bidirectional consistency between the two nodes is ensured; the specific processing logic of the monitoring disaster recovery module is as follows:

S1, set InfluxDB node A as the master node and node B as the replica node;

S2, check the state of node A;

S3, judge whether node A is available (unavailability may be caused by, for example, a network interruption, a database process crash, or a server crash):

① if yes, go to step S4;

② if not, mark node A as unavailable (inactive state), and jump to step S2 after an interval of 20 seconds;

S4, check whether the data of node A is consistent with the data of node B:

① if yes, jump to step S8;

② if not, go to step S5;

S5, check whether the data in node A lags behind node B:

① if yes, go to step S6;

② if not, jump to step S8;

S6, mark node A as unavailable;

S7, synchronize the missing data from node B to node A;

S8, mark node A as available (active state), and jump to step S2 after an interval of 20 seconds.

More preferably, step S5 checks whether the data in node A lags behind node B in units of schema and measurement.
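One pass of steps S2 to S8 can be illustrated with in-memory dictionaries standing in for the two nodes. Everything here is an assumption made for illustration: a node is modelled as a mapping from `(schema, measurement)` to a `{timestamp: value}` series, `reachable` stands in for the availability check, and the names `lagging_keys` and `monitor_cycle` are hypothetical. A real implementation would query and write the InfluxDB nodes instead.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]        # (schema, measurement)
Series = Dict[int, float]    # timestamp -> value
Store = Dict[Key, Series]    # in-memory stand-in for one InfluxDB node


def lagging_keys(primary: Store, replica: Store) -> Dict[Key, Series]:
    """Per (schema, measurement), the points the replica has but the primary lacks."""
    missing: Dict[Key, Series] = {}
    for key, series in replica.items():
        have = primary.get(key, {})
        gap = {ts: v for ts, v in series.items() if ts not in have}
        if gap:
            missing[key] = gap
    return missing


def monitor_cycle(primary: Store, replica: Store,
                  reachable: Callable[[], bool],
                  status: Dict[str, str], name: str = "A") -> None:
    """Run one pass of S2-S8 for the node `name`; call it again after the interval."""
    if not reachable():                           # S2-S3
        status[name] = "inactive"
        return
    if primary != replica:                        # S4
        missing = lagging_keys(primary, replica)  # S5
        if missing:
            status[name] = "inactive"             # S6
            for key, gap in missing.items():      # S7: copy the missing points
                primary.setdefault(key, {}).update(gap)
    status[name] = "active"                       # S8
```

A caller would invoke `monitor_cycle` every 20 seconds for node A, and a second monitor would run the same function with the roles of the two nodes swapped, which gives the bidirectional consistency described above.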

The InfluxDB high-availability system with a dual-master architecture has the following advantages:

(I) compared with other common InfluxDB high-availability schemes on the market, the technical scheme provided by the invention offers better data consistency and good database access performance; in addition, single-node failures are transparent to users, and failed nodes are recovered automatically, which reduces maintenance complexity and shortens downtime;

(II) the invention supports two InfluxDB databases that are read and written at the same time: each piece of data is written to both database nodes simultaneously, and reads are served from the available database nodes in round-robin fashion, which achieves load balancing, provides high availability of the database, ensures the availability of InfluxDB, and prevents a single point of failure from making the database unavailable;

(III) at the same time, the access module acts as a proxy and automatically selects an available database node, so that client access is not interrupted.

Drawings

The invention is further described below with reference to the accompanying drawings.

Fig. 1 is a schematic structural diagram of the InfluxDB high-availability system with a dual-master architecture;

Fig. 2 is a flow diagram of the specific processing logic of the /query interface;

Fig. 3 is a flow diagram of the processing logic of the /write interface;

Fig. 4 is a flow diagram of the specific processing logic of the monitoring disaster recovery module.

Detailed Description

The InfluxDB high-availability system with a dual-master architecture according to the present invention is described in detail below with reference to the drawings and the specific embodiments of the present application.

Example:

As shown in Fig. 1, the InfluxDB high-availability system with a dual-master architecture of the present invention includes two sets of access modules and two sets of monitoring disaster recovery modules. The access modules and the monitoring disaster recovery modules are used in cooperation with two InfluxDB nodes, and each of the two sets of monitoring disaster recovery modules takes one node as its master node and the other node as its replica node, so as to ensure bidirectional consistency between the two nodes. The access module executes each user write request on both InfluxDB nodes at the same time, which ensures real-time consistency of the database nodes, and sends user read requests to the two InfluxDB nodes alternately, which achieves load balancing and improves query performance. The monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that the database data lags behind, specifically as follows:

when either InfluxDB node fails or its data lags behind, the monitoring disaster recovery module notifies the access module to block read and write operations on that node, tries to reconnect to the failed node, and automatically backfills the data; once the backfill is complete, it notifies the access module to resume read and write operations on that node.

The access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, so that a user connects to and accesses the database through an InfluxDB client or an HTTP client. The access protocol interface comprises a GET /query interface and a POST /write interface.

As shown in Fig. 2, the specific processing logic of the GET /query interface is as follows:

(1) determine the InfluxDB database node A to be accessed this time, then execute step (2);

(2) query the monitoring disaster recovery module for whether node A is available, then execute step (3);

(3) judge whether node A is in the available state:

① if yes, execute step (4);

② if not, jump to step (7);

(4) send the request to node A and receive the response, then execute step (5);

(5) judge whether node B is in the available state (active state):

① if yes, execute step (6);

② if not, jump to step (10);

(6) set node B as the next access node, then execute step (10);

(7) judge whether node B is available:

① if yes, execute step (8);

② if not, execute step (10);

(8) mark node A as unavailable (inactive state), start a background asynchronous thread to update the state of node A, then execute step (9); after it is started, the background asynchronous thread requests the monitoring disaster recovery module every 5 minutes to update the state of node A, and exits once the state of node A has been updated to available;

(9) set node B as the next access node, then jump to step (2);

(10) return the response to the client.

As shown in Fig. 3, the processing logic of the POST /write interface is as follows:

(I) receive a /write interface request;

(II) query all available InfluxDB nodes at the back end;

(III) send the request to every available database node at the same time;

(IV) judge whether the request succeeded on any node:

① if any node returns success, return a success response to the client;

② if the request failed on all nodes, return a request-failure response to the client.

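As shown in Fig. 4, the specific processing logic of the monitoring disaster recovery module is as follows:

S1, set InfluxDB node A as the master node and node B as the replica node;

S2, check the state of node A;

S3, judge whether node A is available (unavailability may be caused by, for example, a network interruption, a database process crash, or a server crash):

① if yes, go to step S4;

② if not, mark node A as unavailable (inactive state), and jump to step S2 after an interval of 20 seconds;

S4, check whether the data of node A is consistent with the data of node B:

① if yes, jump to step S8;

② if not, go to step S5;

S5, check whether the data in node A lags behind node B, in units of schema and measurement:

① if yes, go to step S6;

② if not, jump to step S8;

S6, mark node A as unavailable;

S7, synchronize the missing data from node B to node A;

S8, mark node A as available (active state), and jump to step S2 after an interval of 20 seconds.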

The above process runs periodically in the background, so as to keep the node states up to date and maintain data consistency. The invention is based on a dual-master architecture: it ensures real-time consistency of the two nodes through dual writes, balances the read load across the two nodes, and improves query performance.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

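1. An InfluxDB high-availability system with a dual-master architecture, characterized by comprising an access module and a monitoring disaster recovery module, wherein the access module and the monitoring disaster recovery module are used in cooperation with two InfluxDB nodes;

the access module is used for executing each user write request on both InfluxDB nodes at the same time to ensure real-time consistency of the database nodes, and for sending user read requests to the two InfluxDB nodes alternately to achieve load balancing and improve query performance;

the monitoring disaster recovery module is used for monitoring the availability of the back-end InfluxDB nodes, providing a state query interface to the access module, and automatically backfilling data when it finds that the data of a database node lags behind.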

2. The InfluxDB high-availability system with a dual-master architecture according to claim 1, wherein the access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, and a user connects to and accesses the database through an InfluxDB client or an HTTP client.

3. The InfluxDB high-availability system with a dual-master architecture according to claim 2, wherein the access protocol interface comprises a GET /query interface and a POST /write interface.

4. The InfluxDB high-availability system with a dual-master architecture according to claim 3, wherein the specific processing logic of the GET /query interface is as follows:

(1) determine the InfluxDB database node A to be accessed this time, then execute step (2);

(2) query the monitoring disaster recovery module for whether node A is available, then execute step (3);

(3) judge whether node A is in the available state:

① if yes, execute step (4);

② if not, jump to step (7);

(4) send the request to node A and receive the response, then execute step (5);

(5) judge whether node B is in the available state:

① if yes, execute step (6);

② if not, jump to step (10);

(6) set node B as the next access node, then execute step (10);

(7) judge whether node B is available:

① if yes, execute step (8);

② if not, execute step (10);

(8) mark node A as unavailable, start a background asynchronous thread to update the state of node A, then execute step (9);

(9) set node B as the next access node, then jump to step (2);

(10) return the response to the client.

5. The InfluxDB high-availability system with a dual-master architecture according to claim 4, wherein after the background asynchronous thread is started in step (8), it requests the monitoring disaster recovery module every 5 minutes to update the state of node A, and exits once the state of node A has been updated to available.

6. The InfluxDB high-availability system with a dual-master architecture according to claim 3, 4 or 5, wherein the processing logic of the POST /write interface is as follows:

(I) receive a /write interface request;

(II) query all available InfluxDB nodes at the back end;

(III) send the request to every available database node at the same time;

(IV) judge whether the request succeeded on any node:

① if any node returns success, return a success response to the client;

② if the request failed on all nodes, return a request-failure response to the client.

7. The InfluxDB high-availability system with a dual-master architecture according to claim 1, wherein two sets of the monitoring disaster recovery module are provided, each taking one node as its master node and the other node as its replica node, so as to ensure bidirectional consistency between the two nodes; the specific processing logic of the monitoring disaster recovery module is as follows:

S1, set InfluxDB node A as the master node and node B as the replica node;

S2, check the state of node A;

S3, judge whether node A is available:

① if yes, go to step S4;

② if not, mark node A as unavailable, and jump to step S2;

S4, check whether the data of node A is consistent with the data of node B:

① if yes, jump to step S8;

② if not, go to step S5;

S5, check whether the data in node A lags behind node B:

① if yes, go to step S6;

② if not, jump to step S8;

S6, mark node A as unavailable;

S7, synchronize the missing data from node B to node A;

S8, mark node A as available, and jump to step S2.

8. The InfluxDB high-availability system with a dual-master architecture according to claim 7, wherein step S5 checks whether the data in node A lags behind node B in units of schema and measurement.