CN114116277A - InfluxDB high-availability cluster implementation system and method

Info

Publication number: CN114116277A
Application number: CN202111287679.3A
Authority: CN
Inventors: 高翔宇; 曹博; 吴楠
Original assignee: Inspur Cloud Information Technology Co Ltd
Current assignee: Inspur Cloud Information Technology Co Ltd
Priority date: 2021-11-02
Filing date: 2021-11-02
Publication date: 2022-03-01

Abstract

The invention discloses a system and a method for implementing an InfluxDB high-availability cluster, belonging to the technical field of big data storage. The system comprises a Bufferson unit, a health status check unit and a reverse proxy unit. The Bufferson unit is an asynchronous HTTP proxy with internal buffering; the health status check unit monitors the status of each node and automatically removes an InfluxDB node from rotation while its data is being recovered; the reverse proxy unit is backed by Nginx and limits the number of HTTP requests that a client can make per unit time. The system writes metrics to any number of InfluxDB nodes and distributes queries among all the nodes to provide a highly available service, and has good value for popularization and application.

Description

InfluxDB high-availability cluster implementation system and method

Technical Field

The invention relates to the technical field of big data storage, and particularly provides a system and a method for realizing an InfluxDB high-availability cluster.

Background

Since InfluxDB v0.9, users can no longer build an InfluxDB high-availability cluster from the free open-source version; clustering is available only in the commercial InfluxDB Enterprise product. This is a considerable inconvenience for InfluxDB users, especially in professional settings, some of whom feel that InfluxData, the company behind InfluxDB, is trying to leverage an OSS solution to gain profits.

This position is understandable from the point of view of InfluxData, but the cost of the commercial version of InfluxDB is no small burden for many users, and it is a significant cost for businesses or organizations that rely heavily on InfluxDB.

Although InfluxData later released InfluxDB Relay, it was not widely adopted because it left many problems unsolved. A solution is therefore needed that truly achieves a highly available architecture and solves these existing problems.

Disclosure of Invention

The technical task of the present invention, in view of the above problems, is to provide a system and a method for implementing an InfluxDB high-availability cluster that write metrics to any number of InfluxDB nodes and distribute queries among all the nodes to provide a highly available service.

In order to achieve the purpose, the invention provides the following technical scheme:

an InfluxDB high-availability cluster implementation system comprises a Bufferson unit, a health status check unit and a reverse proxy unit;

the Bufferson unit is an asynchronous HTTP proxy with internal buffering;

the health status check unit is used for monitoring the status of each node and automatically removing an InfluxDB node from rotation while its data is being recovered;

the reverse proxy unit is supported by Nginx to limit the number of HTTP requests that a client can make per unit time.

Preferably, the Bufferson unit provides temporary high-availability storage using queues, and provides a simple proxy function for asynchronously buffered HTTP processing.

Preferably, the Bufferson unit comprises a Replay-component and a Recover-component: the Replay-component forwards each HTTP request directly to every upstream node and places failed requests into a buffer, and the Recover-component continuously processes the queue and attempts to deliver the buffered requests.
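As a minimal illustrative sketch of the two components described above (not the implementation of the invention), the following Python code forwards each write to every upstream node, queues failed writes per node, and retries the queue on demand. The class name `BufferingRelay`, the `send` callback and the in-memory `deque` are assumptions introduced for illustration; a real deployment would presumably use a persistent queue and an actual HTTP client.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

# Hypothetical sender type: takes (node, payload), returns True on success.
Sender = Callable[[str, bytes], bool]


class BufferingRelay:
    """Forwards writes to all upstream nodes and buffers failed writes per node."""

    def __init__(self, nodes: List[str], send: Sender) -> None:
        self.nodes = list(nodes)
        self.send = send
        # One FIFO queue per node holds the payloads that could not be delivered.
        self.buffers: Dict[str, Deque[bytes]] = {n: deque() for n in self.nodes}

    def relay(self, payload: bytes) -> None:
        """Replay step: try direct delivery to each node; queue the payload on failure."""
        for node in self.nodes:
            if not self.send(node, payload):
                self.buffers[node].append(payload)

    def recover_once(self) -> int:
        """Recover step: drain each queue until a delivery fails; return the count delivered."""
        delivered = 0
        for node, queue in self.buffers.items():
            while queue and self.send(node, queue[0]):
                queue.popleft()
                delivered += 1
        return delivered

    def pending(self, node: str) -> int:
        """Number of writes still buffered for the given node."""
        return len(self.buffers[node])
```

In this sketch `recover_once` would be called in a loop by a background worker; a payload is removed from the queue only after the node has accepted it, so a node that fails again mid-recovery keeps its remaining backlog.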

Preferably, when a request is sent to the Bufferson unit, the health status check unit forwards it to an InfluxDB instance by means of a load balancing mechanism.

InfluxDB supports a /ping endpoint, which makes it easy to verify whether the service is running. However, when a node is recovering from a temporary failure and the buffered data is still being flushed to it, the node must not process any queries. The /ping endpoint alone therefore cannot be relied on to verify that a node is healthy. Instead, the health check runs locally on the node, and the load balancer uses its result to switch the node on or off for queries.

Preferably, a local daemon runs on each InfluxDB instance and performs two checks: it calls the /ping endpoint of InfluxDB, and it asks Bufferson whether the node still has unrecovered data. The health check reports success only if both checks succeed.
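The two-part health check can be sketched as follows. This is a hedged illustration only: the function names `ping_ok` and `node_healthy`, the `pending_writes` callback (standing in for however Bufferson exposes its backlog for a node) and the timeout value are assumptions, not part of the source.

```python
import urllib.error
import urllib.request
from typing import Callable


def ping_ok(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the /ping endpoint of InfluxDB answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or HTTP error: the service is not reachable.
        return False


def node_healthy(ping: Callable[[], bool], pending_writes: Callable[[], int]) -> bool:
    """Healthy only if /ping succeeds AND no buffered data is still waiting for this node.

    `pending_writes` is a hypothetical callback returning the number of
    buffered requests that have not yet been delivered to the node.
    """
    return ping() and pending_writes() == 0
```

A local daemon could expose the result of `node_healthy` on a small HTTP endpoint so that the load balancer takes the node out of query rotation whenever either check fails.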

Preferably, in the reverse proxy unit, reasonable load distribution is achieved by passing all traffic through Nginx and its ngx_http_limit_req_module.

Some clients have extreme access patterns, and reasonable load distribution is needed to avoid cluster problems. Passing all traffic through Nginx and its ngx_http_limit_req_module ensures, to a large extent, that the cluster remains stable under such access patterns.
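To make the idea of per-client request limiting concrete, the following Python sketch implements a simple leaky-bucket limiter, the general method on which Nginx request limiting is based. It is an illustration of the concept under stated assumptions, not the Nginx implementation: the class name `LeakyBucketLimiter` and the `rate` and `burst` parameters are hypothetical, and the caller supplies the current time explicitly.

```python
from typing import Dict, Tuple


class LeakyBucketLimiter:
    """Per-client limiter: `rate` requests per second, tolerating `burst` excess requests."""

    def __init__(self, rate: float, burst: int = 0) -> None:
        self.rate = rate
        self.burst = burst
        # client key -> (requests currently in the bucket, time of last update)
        self.state: Dict[str, Tuple[float, float]] = {}

    def allow(self, client: str, now: float) -> bool:
        """Return True if the request at time `now` (seconds) is within the limit."""
        level, last = self.state.get(client, (0.0, now))
        # The bucket drains at `rate` requests per second since the last update.
        level = max(0.0, level - (now - last) * self.rate)
        if level > self.burst:
            # Bucket is over capacity: reject and remember the drained level.
            self.state[client] = (level, now)
            return False
        self.state[client] = (level + 1.0, now)
        return True
```

With `rate=1.0` and `burst=0`, a client that sends a second request half a second after the first is rejected, while a different client is unaffected because each client key has its own bucket.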

Metrics are written to any number of InfluxDB nodes and queries are distributed among all nodes to provide a highly available service. For the latter, a standard load balancer is sufficient, provided that a tool can be built to run reliable health checks on each individual node. For the former, a mechanism must be established to forward or replicate the written data.
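The query side of this arrangement can be sketched as a round-robin selection over the nodes that currently pass their health check. The sketch below is illustrative only; the class name `QueryBalancer` and the `is_healthy` callback are assumptions, and in the described system this role is played by a standard load balancer rather than by custom code.

```python
from typing import Callable, List


class QueryBalancer:
    """Round-robin selection of a query target among healthy nodes."""

    def __init__(self, nodes: List[str], is_healthy: Callable[[str], bool]) -> None:
        self.nodes = list(nodes)
        self.is_healthy = is_healthy
        self._next = 0  # index of the node to try first on the next call

    def pick(self) -> str:
        """Return the next healthy node, skipping nodes that fail the health check."""
        n = len(self.nodes)
        for offset in range(n):
            node = self.nodes[(self._next + offset) % n]
            if self.is_healthy(node):
                # Continue after the chosen node on the next call.
                self._next = (self._next + offset + 1) % n
                return node
        raise RuntimeError("no healthy InfluxDB node available")
```

A node that is still receiving buffered data simply reports unhealthy and is skipped until its backlog has been delivered.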

Techniques for adding high availability and failover to InfluxDB include:

1. high availability of writes is achieved by writing metrics redundantly to a plurality of independent nodes;

2. temporary faults are resolved with the buffered payloads;

3. permanent faults are resolved with backup restoration together with the buffered payloads;

4. traffic peaks are handled with global and per-database rate limiting.

To this end, a Bufferson unit, a health status check unit, and a reverse proxy unit are added to the storage tier of a monitoring stack backed by InfluxDB.

The invention also relates to an InfluxDB high-availability cluster implementation method, which is implemented by the above InfluxDB high-availability cluster implementation system: metrics are written to any number of InfluxDB nodes and queries are distributed among all the nodes; the Bufferson unit serves as an asynchronous HTTP proxy with internal buffering; the health status check unit monitors the status of each node and automatically removes an InfluxDB node from rotation while its data is being recovered; and the reverse proxy unit is backed by Nginx so as to limit the number of HTTP requests that a client can send per unit time.

Preferably, a scheduled task uses Rsync to back up the data periodically. When a temporary fault occurs, Bufferson-recovery keeps pulling data from the buffer and delivers it once the node is available again. To bring up a node from a backup, the instance is started, the instance is added to Bufferson, the backup is restored, InfluxDB is started, and Bufferson begins delivering the requests held in the buffer so that the restored backup is brought up to date.

Compared with the prior art, the InfluxDB high-availability cluster implementation method of the present invention has the following outstanding beneficial effects: it realizes an InfluxDB high-availability cluster, improves stability and safety, and has good value for popularization and application.

Drawings

Fig. 1 is a topology diagram of an InfluxDB high-availability cluster implementation system according to the present invention.

Detailed Description

The system and method for implementing an InfluxDB high-availability cluster according to the present invention will be described in detail with reference to the accompanying drawings and embodiments.

Examples

As shown in Fig. 1, the system for implementing an InfluxDB high-availability cluster of the present invention includes a Bufferson unit, a health status check unit, and a reverse proxy unit.

The Bufferson unit is an asynchronous HTTP proxy with internal buffering.

The Bufferson unit includes a Replay-component, which forwards each HTTP request directly to every upstream node and places failed requests into a buffer, and a Recover-component, which continuously processes the queue and attempts to deliver the buffered requests. The Bufferson unit uses queues to provide temporary high-availability storage, providing a simple proxy function for asynchronously buffered HTTP processing.

The health status check unit is used for monitoring the status of each node and automatically removing an InfluxDB node from rotation while its data is being recovered.

When a request is sent to the Bufferson unit, the health status check unit forwards it to an InfluxDB instance through a load balancing mechanism.

InfluxDB supports a /ping endpoint, which makes it easy to verify whether the service is running. However, when a node is recovering from a temporary failure and the buffered data is still being flushed to it, the node must not process any queries. The /ping endpoint alone therefore cannot be relied on to verify that a node is healthy. Instead, the health check runs locally on the node, and the load balancer uses its result to switch the node on or off for queries. A local daemon runs on each InfluxDB instance and performs two checks: it calls the /ping endpoint of InfluxDB, and it asks Bufferson whether the node still has unrecovered data. The health check reports success only if both checks succeed.

In the reverse proxy unit, reasonable load distribution is achieved by passing all traffic through Nginx and its ngx_http_limit_req_module.

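The techniques for adding high availability and failover to InfluxDB include:

1. high availability of writes is achieved by writing metrics redundantly to a plurality of independent nodes;

2. temporary faults are resolved with the buffered payloads;

3. permanent faults are resolved with backup restoration together with the buffered payloads;

4. traffic peaks are handled with global and per-database rate limiting.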

The InfluxDB high-availability cluster implementation method is implemented by the above InfluxDB high-availability cluster implementation system. Metrics are written to any number of InfluxDB nodes and queries are distributed among all the nodes; the Bufferson unit serves as an asynchronous HTTP proxy with internal buffering; the health status check unit monitors the status of each node and automatically removes an InfluxDB node from rotation while its data is being recovered; and the reverse proxy unit is backed by Nginx so as to limit the number of HTTP requests that a client can send per unit time.

A scheduled task uses Rsync to back up the data periodically. When a temporary fault occurs, Bufferson-recovery keeps pulling data from the buffer and delivers it once the node is available again. To bring up a node from a backup, the instance is started, the instance is added to Bufferson, the backup is restored, InfluxDB is started, and Bufferson begins delivering the requests held in the buffer so that the restored backup is brought up to date.

The above-described embodiments are merely preferred embodiments of the present invention, and general changes and substitutions by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention.

Claims

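1. An InfluxDB high-availability cluster implementation system, characterized in that: the system comprises a Bufferson unit, a health status check unit and a reverse proxy unit; the Bufferson unit is an asynchronous HTTP proxy with internal buffering; the health status check unit is used for monitoring the status of each node and automatically removing an InfluxDB node from rotation while its data is being recovered; the reverse proxy unit is backed by Nginx to limit the number of HTTP requests that a client can make per unit time.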

2. The InfluxDB high availability cluster implementation system of claim 1, wherein: the Bufferson unit provides temporary high-availability storage using queues, and provides a simple proxy function for asynchronously buffered HTTP processing.

3. The InfluxDB high availability cluster implementation system of claim 2, wherein: the Bufferson unit comprises a Replay-component and a Recover-component, wherein the Replay-component forwards each HTTP request directly to every upstream node and places failed requests into a buffer, and the Recover-component continuously processes the queue and attempts to deliver the buffered requests.

4. The InfluxDB high availability cluster implementation system of claim 3, wherein: when a request is sent to the Bufferson unit, the health status check unit forwards it to an InfluxDB instance through a load balancing mechanism.

5. The InfluxDB high availability cluster implementation system of claim 4, wherein: a local daemon runs on each InfluxDB instance and performs two checks, namely calling the /ping endpoint of InfluxDB and asking Bufferson whether the node still has unrecovered data, and the health check reports success only if both checks succeed.

6. The InfluxDB high availability cluster implementation system of claim 5, wherein: in the reverse proxy unit, reasonable load distribution is achieved by passing all traffic through Nginx and its ngx_http_limit_req_module.

7. An InfluxDB high-availability cluster implementation method, characterized in that: the method is implemented by the InfluxDB high-availability cluster implementation system according to any one of claims 1 to 6; metrics are written to any number of InfluxDB nodes and queries are distributed among all the nodes; the Bufferson unit serves as an asynchronous HTTP proxy with internal buffering; the health status check unit monitors the status of each node and automatically removes an InfluxDB node from rotation while its data is being recovered; and the reverse proxy unit is backed by Nginx to limit the number of HTTP requests that a client can send per unit time.

8. The InfluxDB high availability cluster implementation method of claim 7, wherein: a scheduled task uses Rsync to back up the data periodically; when a temporary fault occurs, Bufferson-recovery keeps pulling data from the buffer and delivers it once the node is available again; and to bring up a node from a backup, the instance is started, the instance is added to Bufferson, the backup is restored, InfluxDB is started, and Bufferson begins delivering the requests held in the buffer so that the restored backup is brought up to date.