CN111752758A - Dual-master-architecture InfluxDB high-availability system - Google Patents

Dual-master-architecture InfluxDB high-availability system

Info

Publication number
CN111752758A
CN111752758A CN202010616116.3A CN202010616116A CN111752758A CN 111752758 A CN111752758 A CN 111752758A CN 202010616116 A CN202010616116 A CN 202010616116A CN 111752758 A CN111752758 A CN 111752758A
Authority
CN
China
Prior art keywords
node
influxdb
nodes
executing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010616116.3A
Other languages
Chinese (zh)
Other versions
CN111752758B (en)
Inventor
赵山
王阳
厉颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202010616116.3A
Publication of CN111752758A
Application granted
Publication of CN111752758B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Hardware Redundancy (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an InfluxDB high-availability system with a dual-master architecture, belonging to the field of computer databases. The technical problem to be solved is how to ensure data consistency, avoid data loss, and perform failover seamlessly. The technical scheme is as follows: the system comprises an access module and a monitoring disaster recovery module, both used in cooperation with two InfluxDB nodes. The access module executes each user write request on both InfluxDB nodes at the same time to keep the database nodes consistent in real time, and sends user read requests to the two InfluxDB nodes alternately to balance load and improve query performance. The monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that a node's data lags behind.

Description

Dual-master-architecture InfluxDB high-availability system
Technical Field
The invention relates to the field of computer databases, and in particular to an InfluxDB high-availability system with a dual-master architecture.
Background
InfluxDB is one of the most commonly used open-source time-series databases on the market, but it has long lacked a mature open-source high-availability scheme. Existing schemes usually implement hot standby through periodic master-slave synchronization and switch to the slave node when the master node fails. Because writes to a time-series database are usually frequent, a failure-triggered switchover is hard to make seamless, and with periodic synchronization part of the data may be lost during the switch.
Therefore, how to ensure data consistency, avoid data loss, and perform failover seamlessly is a problem that remains to be solved in InfluxDB high-availability schemes.
Patent document CN110659158A discloses an InfluxDB data backup method, device, and apparatus based on a dual-computer hot standby environment, and a computer-readable storage medium, including: periodically detecting with Keepalived whether the host node of InfluxDB has failed; if the host node has not failed, synchronizing the data generated by the host node during each preset time period to the slave node of InfluxDB at the end of that period; if the host node has failed, promoting the slave node to be the current host node and recording the failure time of the host node in the current host node; and, after the host node recovers, demoting it to be the current slave node and synchronizing the data generated between its last data synchronization and the failure time to the current host node. However, that scheme is based on a master-slave architecture in which the slave node reaches only eventual consistency with the master node through periodic synchronization. The data on the two nodes can differ between two synchronizations, so data can be lost during a switchover. In addition, only one node of the cluster provides service at any time, so load balancing is not possible.
Disclosure of Invention
The technical task of the invention is to provide a dual-master-architecture InfluxDB high-availability system that solves the problem of how to ensure data consistency, avoid data loss, and perform failover seamlessly.
The technical task of the invention is achieved as follows: a dual-master-architecture InfluxDB high-availability system comprises an access module and a monitoring disaster recovery module, both used in cooperation with two InfluxDB nodes;
the access module executes each user write request on both InfluxDB nodes at the same time to keep the database nodes consistent in real time, and sends user read requests to the two InfluxDB nodes alternately to balance load and improve query performance;
the monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that a node's data lags behind.
Preferably, the access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, so that a user connects to and accesses the database through an InfluxDB client or an HTTP client.
More preferably, the access protocol interface includes a GET /query interface and a POST /write interface.
More preferably, the specific processing logic of the GET /query interface is as follows:
(1) look up the InfluxDB database node A that should be accessed this time, then go to step (2);
(2) ask the monitoring disaster recovery module whether node A is available, then go to step (3);
(3) determine whether node A is available:
① if yes, go to step (4);
② if not, jump to step (7);
(4) send the request to node A and receive the response, then go to step (5);
(5) determine whether node B is in the available state (active state):
① if yes, go to step (6);
② if not, jump to step (10);
(6) set node B as the next access node, then go to step (10);
(7) determine whether node B is available:
① if yes, go to step (8);
② if not, go to step (10);
(8) mark node A as unavailable (inactive state), start a background asynchronous thread to update the state of node A, then go to step (9);
(9) set node B as the next access node, then jump to step (2);
(10) return the response to the client.
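The read path in steps (1) to (10) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `QueryRouter`, `is_available`, and `send` are hypothetical names, `is_available` stands in for the state query interface of the monitoring disaster recovery module, and `send` stands in for forwarding the request to an InfluxDB node. The background refresh started in step (8) is omitted here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class QueryRouter:
    """Hypothetical read router for two back-end nodes named "A" and "B"."""

    is_available: Callable[[str], bool]  # assumed stand-in for the monitor's state query
    send: Callable[[str, str], str]      # assumed stand-in for forwarding to InfluxDB
    next_node: str = "A"
    inactive: set = field(default_factory=set)

    @staticmethod
    def other(node: str) -> str:
        return "B" if node == "A" else "A"

    def handle_query(self, query: str) -> Optional[str]:
        current = self.next_node                      # step (1)
        for _ in range(2):                            # at most one failover per request
            peer = self.other(current)
            if self.is_available(current):            # steps (2)-(3)
                response = self.send(current, query)  # step (4)
                if self.is_available(peer):           # step (5)
                    self.next_node = peer             # step (6): alternate reads
                return response                       # step (10)
            if not self.is_available(peer):           # step (7)
                return None                           # step (10): no node could answer
            self.inactive.add(current)                # step (8), refresh thread omitted
            self.next_node = current = peer           # step (9), then back to step (2)
        return None
```

With both nodes up, successive calls alternate between "A" and "B"; when the scheduled node is down, the same request is retried once on the other node.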
Preferably, after the background asynchronous thread is started in step (8), it asks the monitoring disaster recovery module for the state of node A every 5 minutes, and it exits once the state of node A has been updated to available.
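A background refresher of this kind might look like the sketch below. The function name `start_status_refresher` and its callbacks are assumptions made for illustration; only the 5-minute (300-second) default interval and the exit-on-recovery behavior come from the text.

```python
import threading
import time
from typing import Callable


def start_status_refresher(
    check_node: Callable[[], bool],      # hypothetical: asks the monitor whether node A is available
    on_recovered: Callable[[], None],    # hypothetical: marks node A as available again
    interval_seconds: float = 300.0,     # 5 minutes, as stated in the text
) -> threading.Thread:
    """Poll the node state in a daemon thread and exit once the node is available."""

    def loop() -> None:
        while True:
            time.sleep(interval_seconds)
            if check_node():
                on_recovered()
                return  # the thread exits after the state becomes available

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

A daemon thread is used so that a node that never recovers does not keep the proxy process alive at shutdown; that choice is part of the sketch, not of the source.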
Preferably, the processing logic of the POST /write interface is as follows:
(I) receive the /write interface request;
(II) look up all available back-end InfluxDB nodes;
(III) send the request to every such database node at the same time;
(IV) determine whether the request succeeded on any node:
① if any node returns success, return a response to the client;
② if the request fails on all nodes, return a request-failure response to the client.
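Steps (I) to (IV) amount to a concurrent fan-out with an "any success" rule. The sketch below illustrates that rule; `handle_write` and `send_write` are hypothetical names, and `send_write` is assumed to return `True` on success and to return `False` or raise on failure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def handle_write(
    payload: str,
    available_nodes: Iterable[str],          # step (II): nodes reported available
    send_write: Callable[[str, str], bool],  # assumed stand-in for POSTing to one node
) -> bool:
    """Return True if at least one node accepted the write."""
    nodes = list(available_nodes)
    if not nodes:
        return False

    def attempt(node: str) -> bool:
        try:
            return send_write(node, payload)
        except Exception:
            return False  # a failing node must not fail the whole request

    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:  # step (III)
        results = list(pool.map(attempt, nodes))
    return any(results)                                       # step (IV)
```

Under this rule a node that missed a write falls behind its peer, which is the situation the monitoring disaster recovery module is described as detecting and repairing.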
Preferably, two sets of the monitoring disaster recovery module are provided, each taking one node as the master node and the other node as the replication node, so as to ensure bidirectional consistency between the two nodes; the specific processing logic of the monitoring disaster recovery module is as follows:
S1. set InfluxDB node A as the master node and node B as the replication node;
S2. check the state of node A;
S3. determine whether node A is available (causes of unavailability include network interruption, database process crash, and server crash):
① if yes, go to step S4;
② if not, mark node A as unavailable (inactive state) and jump to step S2 after an interval of 20 seconds;
S4. check whether the data of node A is consistent with the data of node B:
if not, go to step S5;
S5. check whether the data in node A lags behind node B:
① if yes, go to step S6;
② if not, jump to step S8;
S6. mark node A as unavailable;
S7. synchronize the missing data from node B to node A;
S8. mark node A as available (active state) and jump to step S2 after an interval of 20 seconds.
More preferably, step S5 checks whether the data in node A lags behind node B per schema and per measurement.
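One pass of steps S2 to S8 could be sketched as follows. Several points are assumptions rather than statements from the source: each node is summarized as a dict mapping a hypothetical `(schema, measurement)` key to its latest timestamp, "lagging" is taken to mean an older or missing timestamp, and consistent data in S4 is assumed to lead directly to S8. The names `lagging_measurements`, `monitor_cycle`, `sync`, and `set_state` are illustrative only.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # hypothetical (schema, measurement) pair


def lagging_measurements(primary: Dict[Key, int], replica: Dict[Key, int]) -> List[Key]:
    """Keys whose latest timestamp on the primary is older than on the replica, or missing."""
    return sorted(k for k, ts in replica.items() if primary.get(k, -1) < ts)


def monitor_cycle(
    primary_up: bool,
    primary: Dict[Key, int],
    replica: Dict[Key, int],
    sync: Callable[[List[Key]], None],  # assumed: copies missing data from node B to node A
    set_state: Callable[[str], None],   # assumed: publishes node A's state to the access module
) -> None:
    """One pass over node A; the caller is expected to repeat it every 20 seconds."""
    if not primary_up:                                   # S3
        set_state("inactive")
        return
    if primary != replica:                               # S4
        behind = lagging_measurements(primary, replica)  # S5
        if behind:
            set_state("inactive")                        # S6
            sync(behind)                                 # S7
    set_state("active")                                  # S8
```

Running a second instance with the roles of the two nodes swapped gives the bidirectional check described for the two sets of monitoring modules.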
The dual-master-architecture InfluxDB high-availability system has the following advantages:
compared with other common InfluxDB high-availability schemes on the market, the technical scheme of the invention provides better data consistency and good database access performance; single-node failures are not noticed by users, and failed nodes recover automatically, which reduces maintenance complexity and shortens downtime;
the invention supports two InfluxDB databases that are read and written at the same time: each piece of data is written to both database nodes simultaneously, and reads poll whichever database nodes are available, which achieves load balancing, makes the database highly available, and avoids database outages caused by a single point of failure;
in addition, the access module acts as a proxy that automatically selects an available database node, so that client access is not interrupted.
Drawings
The invention is further described below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of the dual-master-architecture InfluxDB high-availability system;
Fig. 2 is a flow diagram of the specific processing logic of the /query interface;
Fig. 3 is a flow diagram of the processing logic of the /write interface;
Fig. 4 is a flow diagram of the specific processing logic of the monitoring disaster recovery module.
Detailed Description
The dual-master-architecture InfluxDB high-availability system of the invention is described in detail below with reference to the drawings and specific embodiments.
Example:
As shown in Fig. 1, the dual-master-architecture InfluxDB high-availability system of the invention comprises two sets of access modules and two sets of monitoring disaster recovery modules, used in cooperation with two InfluxDB nodes. Each of the two monitoring disaster recovery modules takes one node as the master node and the other node as the replication node, so as to ensure bidirectional consistency between the two nodes. The access module executes each user write request on both InfluxDB nodes at the same time to keep the database nodes consistent in real time, and sends user read requests to the two InfluxDB nodes alternately to balance load and improve query performance. The monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that a node's data lags behind, specifically as follows:
When either InfluxDB node fails or its data lags behind, the monitoring disaster recovery module notifies the access module to block read and write operations on that node, tries to reconnect to the failed node, and automatically backfills the data; when the backfill is complete, it notifies the access module to restore read and write operations on the node.
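The block and restore notifications can be pictured as a small shared registry that the monitoring side writes and the access side reads. The class below is a hypothetical sketch of such a registry; the source does not say how the state query interface is implemented, and the names `NodeStatusRegistry`, `block`, and `restore` are assumptions.

```python
import threading
from typing import Iterable, List


class NodeStatusRegistry:
    """Hypothetical thread-safe node state shared by the monitor and the access module."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._lock = threading.Lock()
        self._active = {node: True for node in nodes}

    def block(self, node: str) -> None:
        """Called by the monitor when a node fails or lags behind."""
        with self._lock:
            self._active[node] = False

    def restore(self, node: str) -> None:
        """Called by the monitor once the backfill is complete."""
        with self._lock:
            self._active[node] = True

    def is_active(self, node: str) -> bool:
        """State query used by the access module; unknown nodes count as inactive."""
        with self._lock:
            return self._active.get(node, False)

    def active_nodes(self) -> List[str]:
        with self._lock:
            return [node for node, ok in self._active.items() if ok]
```

In a deployment where the two modules run as separate processes, the same operations would have to be exposed over a network interface instead; that variant is not shown.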
The access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, so that a user connects to and accesses the database through an InfluxDB client or an HTTP client. The access protocol interface comprises a GET /query interface and a POST /write interface.
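Because the proxy exposes the same protocol as InfluxDB, a client talks to it the way it would talk to a single node. The sketch below only builds the two HTTP requests with the Python standard library and does not send them. It assumes the InfluxDB 1.x HTTP conventions (`db` and `q` query parameters on /query, a line-protocol body on /write); the proxy address, database name, and helper names are placeholders invented for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

PROXY_URL = "http://localhost:8086"  # hypothetical address of the access module


def build_query_request(db: str, influxql: str) -> Request:
    """GET /query with the database and the InfluxQL statement as URL parameters."""
    params = urlencode({"db": db, "q": influxql})
    return Request(f"{PROXY_URL}/query?{params}", method="GET")


def build_write_request(db: str, line: str) -> Request:
    """POST /write with one line-protocol point in the request body."""
    params = urlencode({"db": db})
    return Request(f"{PROXY_URL}/write?{params}", data=line.encode("utf-8"), method="POST")
```

Either request could then be passed to `urllib.request.urlopen` against a running proxy.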
As shown in Fig. 2, the specific processing logic of the GET /query interface is as follows:
(1) look up the InfluxDB database node A that should be accessed this time, then go to step (2);
(2) ask the monitoring disaster recovery module whether node A is available, then go to step (3);
(3) determine whether node A is available:
① if yes, go to step (4);
② if not, jump to step (7);
(4) send the request to node A and receive the response, then go to step (5);
(5) determine whether node B is in the available state (active state):
① if yes, go to step (6);
② if not, jump to step (10);
(6) set node B as the next access node, then go to step (10);
(7) determine whether node B is available:
① if yes, go to step (8);
② if not, go to step (10);
(8) mark node A as unavailable (inactive state), start a background asynchronous thread to update the state of node A, then go to step (9); once started, the background asynchronous thread asks the monitoring disaster recovery module for the state of node A every 5 minutes and exits when the state of node A has been updated to available;
(9) set node B as the next access node, then jump to step (2);
(10) return the response to the client.
As shown in Fig. 3, the processing logic of the POST /write interface is as follows:
(I) receive the /write interface request;
(II) look up all available back-end InfluxDB nodes;
(III) send the request to every such database node at the same time;
(IV) determine whether the request succeeded on any node:
① if any node returns success, return a response to the client;
② if the request fails on all nodes, return a request-failure response to the client.
As shown in Fig. 4, the specific processing logic of the monitoring disaster recovery module is as follows:
S1. set InfluxDB node A as the master node and node B as the replication node;
S2. check the state of node A;
S3. determine whether node A is available (causes of unavailability include network interruption, database process crash, and server crash):
① if yes, go to step S4;
② if not, mark node A as unavailable (inactive state) and jump to step S2 after an interval of 20 seconds;
S4. check whether the data of node A is consistent with the data of node B:
if not, go to step S5;
S5. check, per schema and per measurement, whether the data in node A lags behind node B:
① if yes, go to step S6;
② if not, jump to step S8;
S6. mark node A as unavailable;
S7. synchronize the missing data from node B to node A;
S8. mark node A as available (active state) and jump to step S2 after an interval of 20 seconds.
The above process runs periodically in the background so as to keep the node state up to date and maintain data consistency. The invention is based on a dual-master architecture: double writes keep the two nodes consistent in real time, and reads are load-balanced across the two nodes, which improves query performance.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in those embodiments may still be modified, and that some or all of their technical features may be replaced by equivalents, without the essence of the corresponding technical solutions departing from the scope of the technical solutions of the embodiments of the invention.

Claims (8)

1. An InfluxDB high-availability system with a dual-master architecture, characterized by comprising an access module and a monitoring disaster recovery module, both used in cooperation with two InfluxDB nodes;
the access module executes each user write request on both InfluxDB nodes at the same time to keep the database nodes consistent in real time, and sends user read requests to the two InfluxDB nodes alternately to balance load and improve query performance;
the monitoring disaster recovery module monitors the availability of the back-end InfluxDB nodes, provides a state query interface to the access module, and automatically backfills data when it finds that a node's data lags behind.
2. The dual-master-architecture InfluxDB high-availability system according to claim 1, wherein the access module serves as a proxy layer and exposes the same access protocol interface as InfluxDB, and a user connects to and accesses the database through an InfluxDB client or an HTTP client.
3. The dual-master-architecture InfluxDB high-availability system according to claim 2, wherein the access protocol interface comprises a GET /query interface and a POST /write interface.
4. The dual-master-architecture InfluxDB high-availability system according to claim 3, wherein the specific processing logic of the GET /query interface is as follows:
(1) look up the InfluxDB database node A that should be accessed this time, then go to step (2);
(2) ask the monitoring disaster recovery module whether node A is available, then go to step (3);
(3) determine whether node A is available:
① if yes, go to step (4);
② if not, jump to step (7);
(4) send the request to node A and receive the response, then go to step (5);
(5) determine whether node B is in the available state:
① if yes, go to step (6);
② if not, jump to step (10);
(6) set node B as the next access node, then go to step (10);
(7) determine whether node B is available:
① if yes, go to step (8);
② if not, go to step (10);
(8) mark node A as unavailable, start a background asynchronous thread to update the state of node A, then go to step (9);
(9) set node B as the next access node, then jump to step (2);
(10) return the response to the client.
5. The dual-master-architecture InfluxDB high-availability system according to claim 4, wherein after the background asynchronous thread is started in step (8), it asks the monitoring disaster recovery module for the state of node A every 5 minutes, and it exits once the state of node A has been updated to available.
6. The dual-master-architecture InfluxDB high-availability system according to claim 3, 4 or 5, wherein the processing logic of the POST /write interface is as follows:
(I) receive the /write interface request;
(II) look up all available back-end InfluxDB nodes;
(III) send the request to every such database node at the same time;
(IV) determine whether the request succeeded on any node:
① if any node returns success, return a response to the client;
② if the request fails on all nodes, return a request-failure response to the client.
7. The dual-master-architecture InfluxDB high-availability system according to claim 1, wherein two sets of the monitoring disaster recovery module are provided, each taking one node as the master node and the other node as the replication node, so as to ensure bidirectional consistency between the two nodes; the specific processing logic of the monitoring disaster recovery module is as follows:
S1. set InfluxDB node A as the master node and node B as the replication node;
S2. check the state of node A;
S3. determine whether node A is available:
① if yes, go to step S4;
② if not, mark node A as unavailable and jump to step S2;
S4. check whether the data of node A is consistent with the data of node B:
if not, go to step S5;
S5. check whether the data in node A lags behind node B:
① if yes, go to step S6;
② if not, jump to step S8;
S6. mark node A as unavailable;
S7. synchronize the missing data from node B to node A;
S8. mark node A as available and jump to step S2.
8. The dual-master-architecture InfluxDB high-availability system according to claim 7, wherein step S5 checks whether the data in node A lags behind node B per schema and per measurement.
CN202010616116.3A 2020-07-01 2020-07-01 Dual-master-architecture InfluxDB high-availability system Active CN111752758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010616116.3A CN111752758B (en) 2020-07-01 2020-07-01 Dual-master-architecture InfluxDB high-availability system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010616116.3A CN111752758B (en) 2020-07-01 2020-07-01 Dual-master-architecture InfluxDB high-availability system

Publications (2)

Publication Number Publication Date
CN111752758A true CN111752758A (en) 2020-10-09
CN111752758B CN111752758B (en) 2022-05-31

Family

ID=72678366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010616116.3A Active CN111752758B (en) 2020-07-01 2020-07-01 Dual-master-architecture InfluxDB high-availability system

Country Status (1)

Country Link
CN (1) CN111752758B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282604A (en) * 2021-07-14 2021-08-20 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254031A (en) * 2011-08-03 2011-11-23 无锡浙潮科技有限公司 Batch processing request-based Microsoft SQL server database cluster
CN102882927A (en) * 2012-08-29 2013-01-16 华南理工大学 Cloud storage data synchronizing framework and implementing method thereof
CN106407264A (en) * 2016-08-25 2017-02-15 成都索贝数码科技股份有限公司 High-availability and high-consistency database cluster system and command processing method thereof
WO2017041616A1 (en) * 2015-09-08 2017-03-16 中兴通讯股份有限公司 Data reading and writing method and device, double active storage system and realization method thereof
CN110019346A (en) * 2017-12-29 2019-07-16 北京京东尚科信息技术有限公司 A kind of data processing method and device based on double primary databases
CN110659158A (en) * 2019-09-04 2020-01-07 苏州浪潮智能科技有限公司 Influx DB data backup method based on dual-computer hot standby environment
CN110719311A (en) * 2018-07-13 2020-01-21 深圳兆日科技股份有限公司 Distributed coordination service method, system and computer readable storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254031A (en) * 2011-08-03 2011-11-23 无锡浙潮科技有限公司 Batch processing request-based Microsoft SQL server database cluster
CN102882927A (en) * 2012-08-29 2013-01-16 华南理工大学 Cloud storage data synchronizing framework and implementing method thereof
WO2017041616A1 (en) * 2015-09-08 2017-03-16 中兴通讯股份有限公司 Data reading and writing method and device, double active storage system and realization method thereof
CN106407264A (en) * 2016-08-25 2017-02-15 成都索贝数码科技股份有限公司 High-availability and high-consistency database cluster system and command processing method thereof
CN110019346A (en) * 2017-12-29 2019-07-16 北京京东尚科信息技术有限公司 A kind of data processing method and device based on double primary databases
CN110719311A (en) * 2018-07-13 2020-01-21 深圳兆日科技股份有限公司 Distributed coordination service method, system and computer readable storage medium
CN110659158A (en) * 2019-09-04 2020-01-07 苏州浪潮智能科技有限公司 Influx DB data backup method based on dual-computer hot standby environment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282604A (en) * 2021-07-14 2021-08-20 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue
CN113282604B (en) * 2021-07-14 2021-10-22 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue

Also Published As

Publication number Publication date
CN111752758B (en) 2022-05-31

Similar Documents

Publication Publication Date Title
EP2648114B1 (en) Method, system, token conreoller and memory database for implementing distribute-type main memory database system
WO2017177941A1 (en) Active/standby database switching method and apparatus
CN111581284B (en) Database high availability method, device, system and storage medium
US20140244578A1 (en) Highly available main memory database system, operating method and uses thereof
US20050289553A1 (en) Storage system and storage system control method
CN102394914A (en) Cluster brain-split processing method and device
CN107404394B (en) IPTV system disaster tolerance method and IPTV disaster tolerance system
CN106156318B (en) System and method for realizing high availability of multi-node database
CN102088490B (en) Data storage method, device and system
CN105069160A (en) Autonomous controllable database based high-availability method and architecture
EP4213038A1 (en) Data processing method and apparatus based on distributed storage, device, and medium
CN103294701A (en) Distributed file system and data processing method
CN103856760A (en) Longitudinal virtualization device between video surveillance devices
CN103118093A (en) Large-scale distributed network examination method based on multi-level cache
CN106850255A (en) A kind of implementation method of multi-computer back-up
JP2019191843A (en) Connection control program, connection control method, and connection control device
CN111752758B (en) Dual-master-architecture InfluxDB high-availability system
CN113254275A (en) MySQL high-availability architecture method based on distributed block device
CN103428288A (en) Method for synchronizing copies on basis of partition state tables and coordinator nodes
CN112783694B (en) Long-distance disaster recovery method for high-availability Redis
CN112887367A (en) Method, system and computer readable medium for realizing high availability of distributed cluster
CN105007172A (en) Method for realizing HDFS high-availability scheme
JPH08314875A (en) Cooperative distributed processing method, distributed shared memory monitoring device, distributed shared memory network tracking device and distributed shared memory network setting supporting device
CN107404511B (en) Method and device for replacing servers in cluster
CN113905054B (en) RDMA (remote direct memory access) -based Kudu cluster data synchronization method, device and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant