CN115883547A - High-availability NiFi deployment method and system based on DRBD - Google Patents

Info

Publication number
CN115883547A
CN115883547A CN202211423470.XA CN202211423470A CN115883547A CN 115883547 A CN115883547 A CN 115883547A CN 202211423470 A CN202211423470 A CN 202211423470A CN 115883547 A CN115883547 A CN 115883547A
Authority
CN
China
Prior art keywords
drbd
nifi
pacemaker
node
instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211423470.XA
Other languages
Chinese (zh)
Inventor
孙亮亮
张栋
李国涛
胡清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202211423470.XA
Publication of CN115883547A
Current legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Electrotherapy Devices (AREA)

Abstract

The invention relates to the technical field of high-availability deployment, and in particular to a DRBD-based high-availability deployment method for NiFi, comprising the following steps: deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance; installing and configuring DRBD and NiFi; and configuring the cluster services to be managed by Pacemaker. The beneficial effects are as follows: the DRBD-based high-availability NiFi deployment method and system improve the safety and reliability of NiFi configuration data, protect the data against loss, and keep NiFi tasks running continuously and reliably; operation and maintenance are simplified, the stability of the platform product is enhanced, and high availability of the service is ensured; the architecture offers efficient management, automatic failover and stable service operation, together with automatic primary/standby synchronization, safe and reliable data, and low storage cost; and it prevents split brain, avoiding data corruption and an inconsistent system state.

Description

High-availability NiFi deployment method and system based on DRBD
Technical Field
The invention relates to the technical field of high-availability deployment, and in particular to a DRBD-based high-availability deployment method and system for NiFi.
Background
With the development of big data technology, the number of distributed data storage systems keeps growing. Big data applications generally need to integrate several different data storage systems in order to build data warehouses for different applications. ETL describes the process of extracting, transforming and loading data from a source data warehouse into a target data warehouse. An ETL tool is generally responsible for scheduling the programs the system runs and for allocating resources. Apache NiFi is an easy-to-use, powerful and reliable system for processing and distributing data.
In the prior art, when a NiFi node goes down, loses its connection or fails during a big data project, the NiFi task terminates and service processing is affected. In NiFi cluster mode, if a disconnected node goes down before its flow files have been fully processed, data can be lost.
Disclosure of Invention
The present invention aims to provide a DRBD-based high-availability deployment method and system for NiFi, so as to solve the problems described in the background above.
To achieve this purpose, the invention provides the following technical solution: a DRBD-based high-availability deployment method for NiFi, comprising the following steps:
deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance;
installing and configuring DRBD and NiFi;
and configuring the cluster services to be managed by Pacemaker.
Preferably, in a two-node cluster a single vote is enough to determine the active node; if the two nodes lose their connection to each other, there is a risk that more than one cluster node regards itself as the active node; the arbitration device acts as an arbiter and elects a single running node by voting; when the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both, so that a majority-vote mechanism is achieved.
Preferably, Pacemaker is installed and configured to be responsible for full life-cycle management of the software services in the cluster; Pacemaker is installed on the primary and standby instance nodes, and Pacemaker's arbitration device is installed on the arbitration device node; the Pacemaker-managed cluster is then configured, and the three compute instance nodes are added to Pacemaker's node resource management with their node state online.
Preferably, DRBD is initialized and configured, and DRBD replication from the primary instance node to the standby instance node is set up; DRBD is installed and configured in diskless arbitration mode on the arbitration node, since the arbitration setup requires at least three DRBD nodes; after DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD.
Preferably, the cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification, and Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP.
A DRBD-based high-availability deployment system for NiFi, the system consisting of a deployment module, a building module and a management module;
the deployment module is used for deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance;
the building module is used for installing and configuring DRBD and NiFi;
and the management module is used for configuring the cluster services to be managed by Pacemaker.
Preferably, in the deployment module, in a two-node cluster a single vote is enough to determine the active node; if the two nodes lose their connection to each other, there is a risk that more than one cluster node regards itself as the active node; the arbitration device acts as an arbiter and elects a single running node by voting; when the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both, so that a majority-vote mechanism is achieved.
Preferably, in the deployment module, Pacemaker is installed and configured to be responsible for full life-cycle management of the software services in the cluster; Pacemaker is installed on the primary and standby instance nodes, and Pacemaker's arbitration device is installed on the arbitration device node; the Pacemaker-managed cluster is then configured, and the three compute instance nodes are added to Pacemaker's node resource management with their node state online.
Preferably, in the building module, DRBD is initialized and configured, and DRBD replication from the primary instance node to the standby instance node is set up; DRBD is installed and configured in diskless arbitration mode on the arbitration node, since the arbitration setup requires at least three DRBD nodes; after DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD.
Preferably, in the management module, the cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification, and Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP.
Compared with the prior art, the invention has the following beneficial effects:
the DRBD-based high-availability NiFi deployment method and system improve the safety and reliability of NiFi configuration data, protect the data against loss, and keep NiFi tasks running continuously and reliably; operation and maintenance are simplified, the stability of the platform product is enhanced, and high availability of the service is ensured; the architecture offers efficient management, automatic failover and stable service operation, together with automatic primary/standby synchronization, safe and reliable data, and low storage cost; and it prevents split brain, avoiding data corruption and an inconsistent system state.
Drawings
FIG. 1 is a schematic diagram of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clear and fully described, embodiments of the present invention are further described in detail below with reference to the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of some embodiments of the invention and are not limiting of the invention, and that all other embodiments obtained by those of ordinary skill in the art without the exercise of inventive faculty are within the scope of the invention.
Example one
Referring to FIG. 1, the present invention provides the following technical solution: a DRBD-based high-availability deployment method for NiFi, comprising the following steps:
deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance;
installing and configuring DRBD and NiFi;
and configuring the cluster services to be managed by Pacemaker.
The specific operations are as follows. 1) Cluster environment preparation:
a. A Pacemaker-managed cluster consisting of three compute instances is deployed. NiFi and DRBD are installed on each of the primary and standby instances. An arbitration device is installed on the third instance.
b. Each node in the cluster votes for the node it considers the best active node, that is, the node that runs NiFi. In a two-node cluster a single vote is enough to decide the active node, so the cluster is exposed to split brain or to outages. Split brain occurs when both nodes take control at once, which is possible because only one vote is required in the two-node case: if the two nodes lose their connection to each other, each may regard itself as the active node.
This can be avoided by configuring an arbitration device. The arbitration device acts as an arbiter and elects a single running node by voting. When the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both of them, so a majority vote is reached and split brain is avoided.
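The voting argument above can be illustrated with a minimal Python sketch. It is only a model of strict-majority voting, not Pacemaker's actual implementation; the node names and both function names are hypothetical, and the arbiter is assumed to cast its vote for exactly one side of the partition.

```python
def has_quorum(votes_visible: int, total_votes: int) -> bool:
    """Return True when strictly more than half of all votes are visible."""
    return 2 * votes_visible > total_votes


def partitions_with_quorum(partitions, voters):
    """Return the partitions (sets of node names) that hold a strict majority."""
    total = len(voters)
    return [p for p in partitions if has_quorum(len(p & voters), total)]


# Hypothetical node names, used only for illustration.
two_node = {"primary", "standby"}
three_node = {"primary", "standby", "arbiter"}

# Two voters, link between them lost: neither side holds a majority, so a
# cluster that acts on a single vote cannot decide safely which side is active.
print(partitions_with_quorum([{"primary"}, {"standby"}], two_node))

# Three voters, same link failure, arbiter sides with the standby:
# exactly one partition holds a majority and may run the service.
print(partitions_with_quorum([{"primary"}, {"standby", "arbiter"}], three_node))
```

With two voters no partition can reach a strict majority after a link failure, whereas with three voters at most one partition can, which is the property the arbitration device provides.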
2) Installing and configuring Pacemaker:
Pacemaker provides fault detection and resource recovery at both the node level and the resource level, which maximizes the availability of the cluster services. Functionally, Pacemaker is responsible for full life-cycle management of the software services in the cluster. This management is driven by the resource rules that the cluster administrator defines, and it can extend to entire software systems and the interactions between them. In practice Pacemaker can manage clusters of any size, and because it has a powerful resource dependency model, the cluster administrator can describe the relationships among cluster resources precisely, including their ordering and their location.
Pacemaker is installed on the primary and standby instance nodes, and Pacemaker's arbitration device is installed on the arbitration device node. The Pacemaker-managed cluster is then configured. The three compute instance nodes are added to Pacemaker's node resource management, with their node state online.
3) Installing and configuring DRBD and NiFi:
DRBD is initialized and configured, and DRBD replication from the primary instance node to the standby instance node is set up. On the arbitration node, DRBD is installed and configured in diskless arbitration mode. The arbitration setup requires at least three DRBD nodes, but DRBD replication requires only two, so the third node can act as a diskless arbitration device that stores no data.
After DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD.
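The node layout described in this step can be checked with a short Python sketch. It models only the two constraints stated above (at least three DRBD nodes for arbitration, two nodes with storage for replication); it does not generate or parse real DRBD configuration, and the `DrbdNode` class, the `check_layout` function and the node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrbdNode:
    name: str
    has_disk: bool  # False models the diskless arbitration node


def check_layout(nodes):
    """Return a list of problems with the layout; an empty list means it is valid."""
    problems = []
    if len(nodes) < 3:
        problems.append("arbitration needs at least three DRBD nodes")
    if sum(1 for n in nodes if n.has_disk) < 2:
        problems.append("replication needs two nodes with backing storage")
    return problems


# Hypothetical layout: two replicating nodes plus one diskless arbiter.
layout = [
    DrbdNode("primary", has_disk=True),
    DrbdNode("standby", has_disk=True),
    DrbdNode("arbiter", has_disk=False),
]
print(check_layout(layout))      # valid layout, no problems reported
print(check_layout(layout[:2]))  # arbiter missing, so arbitration is not possible
```

The diskless third node contributes a vote without holding a copy of the data, which is why the scheme keeps storage cost at two replicas.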
4) Configuring the cluster services to be managed by Pacemaker:
The cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification. Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP, so that the services are always scheduled to run on the same node. The VIP is mainly used for network address translation, network fault tolerance and mobility. To improve the availability of the system's external services, a primary/standby high-availability configuration is used. A VIP is configured across the primary and standby instance nodes; when the primary node goes down, the VIP floats to the standby node, which continues to provide the service. The NiFi service is thus exposed through a single address, single points of failure are avoided, and service availability is improved.
The result is a high-availability deployment scheme for the NiFi service that uses Pacemaker as the cluster management tool, combines it with DRBD and a VIP, and introduces a third arbitration device node to prevent split brain. DRBD keeps the NiFi service data accurate and safe; Pacemaker fails services over automatically, which maximizes resource availability and simplifies operation and maintenance; and the VIP gives the NiFi service a single access entry point, which further improves availability.
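The scheduling behaviour described above, in which DRBD, NiFi and the VIP always run together and move together on failure, can be sketched in Python as follows. This is an illustrative model rather than Pacemaker configuration: the start ordering between the three resources is an assumption that the text does not state, and the names `STARTS_AFTER`, `start_order`, `place_group` and `schedule` are hypothetical.

```python
from graphlib import TopologicalSorter

# Assumed start dependencies: each resource maps to the resources that must be
# running before it. The source only says the services share one node.
STARTS_AFTER = {
    "drbd": set(),
    "nifi": {"drbd"},
    "vip": {"nifi"},
}


def start_order(deps):
    """Return the resources in an order that respects their dependencies."""
    return list(TopologicalSorter(deps).static_order())


def place_group(preferred, online):
    """Pick one node for the whole group: the first preferred node that is online."""
    for node in preferred:
        if node in online:
            return node
    return None


def schedule(preferred, online):
    """Map every resource to the same node, modelling colocation."""
    node = place_group(preferred, online)
    return {resource: node for resource in STARTS_AFTER}


print(start_order(STARTS_AFTER))
print(schedule(["primary", "standby"], {"primary", "standby"}))
# Primary goes down: the whole group, including the VIP, moves to the standby.
print(schedule(["primary", "standby"], {"standby"}))
```

Placing the group as a unit is what keeps the VIP pointing at the node that actually holds the writable DRBD volume and the running NiFi process.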
Example two
A DRBD-based high-availability deployment system for NiFi, the system consisting of a deployment module, a building module and a management module;
the deployment module is used for deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance; in a two-node cluster a single vote is enough to determine the active node; if the two nodes lose their connection to each other, there is a risk that more than one cluster node regards itself as the active node; the arbitration device acts as an arbiter and elects a single running node by voting; when the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both, so that a majority-vote mechanism is achieved; Pacemaker is installed and configured to be responsible for full life-cycle management of the software services in the cluster, Pacemaker is installed on the primary and standby instance nodes, Pacemaker's arbitration device is installed on the arbitration device node, the Pacemaker-managed cluster is configured, and the three compute instance nodes are added to Pacemaker's node resource management with their node state online;
the building module is used for installing and configuring DRBD and NiFi; after DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD;
the management module is used for configuring the cluster services to be managed by Pacemaker; the cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification, and Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A DRBD-based high-availability deployment method for NiFi, characterized by comprising the following steps:
deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance;
installing and configuring DRBD and NiFi;
and configuring the cluster services to be managed by Pacemaker.
2. The method of claim 1, characterized in that: in a two-node cluster a single vote is enough to determine the active node; if the two nodes lose their connection to each other, there is a risk that more than one cluster node regards itself as the active node; the arbitration device acts as an arbiter and elects a single running node by voting; when the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both, so that a majority-vote mechanism is achieved.
3. The method of claim 2, characterized in that: Pacemaker is installed and configured to be responsible for full life-cycle management of the software services in the cluster; Pacemaker is installed on the primary and standby instance nodes, and Pacemaker's arbitration device is installed on the arbitration device node; the Pacemaker-managed cluster is configured, and the three compute instance nodes are added to Pacemaker's node resource management with their node state online.
4. The method of claim 1, characterized in that: DRBD is initialized and configured, and DRBD replication from the primary instance node to the standby instance node is set up; DRBD is installed and configured in diskless arbitration mode on the arbitration node, the arbitration setup requiring at least three DRBD nodes; after DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD.
5. The method of claim 1, characterized in that: the cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification, and Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP.
6. A DRBD-based high-availability deployment system for NiFi, for use with the method of any one of claims 1-5, characterized in that: the system consists of a deployment module, a building module and a management module;
the deployment module is used for deploying a Pacemaker-managed cluster consisting of three compute instances, installing NiFi and DRBD on each of the primary and standby instances, and installing an arbitration device on the third instance;
the building module is used for installing and configuring DRBD and NiFi;
and the management module is used for configuring the cluster services to be managed by Pacemaker.
7. The DRBD-based high-availability NiFi deployment system of claim 6, characterized in that: in the deployment module, in a two-node cluster a single vote is enough to determine the active node; if the two nodes lose their connection to each other, there is a risk that more than one cluster node regards itself as the active node; the arbitration device acts as an arbiter and elects a single running node by voting; when the primary and standby instances cannot communicate with each other, the arbitration node can still communicate with both, so that a majority-vote mechanism is achieved.
8. The DRBD-based high-availability NiFi deployment system of claim 6, characterized in that: in the deployment module, Pacemaker is installed and configured to be responsible for full life-cycle management of the software services in the cluster; Pacemaker is installed on the primary and standby instance nodes, and Pacemaker's arbitration device is installed on the arbitration device node; the Pacemaker-managed cluster is configured, and the three compute instance nodes are added to Pacemaker's node resource management with their node state online.
9. The DRBD-based high-availability NiFi deployment system of claim 6, characterized in that: in the building module, DRBD is initialized and configured, and DRBD replication from the primary instance node to the standby instance node is set up; DRBD is installed and configured in diskless arbitration mode on the arbitration node, the arbitration setup requiring at least three DRBD nodes; after DRBD is set up, the disk that DRBD keeps synchronized is mounted on the file system directory in which NiFi stores its data, so that NiFi's data is synchronized to the standby instance node through DRBD.
10. The DRBD-based high-availability NiFi deployment system of claim 6, characterized in that: in the management module, the cluster services DRBD, NiFi and VIP are added to Pacemaker's resource management configuration according to Pacemaker's resource access specification, and Pacemaker uniformly manages and schedules the selection of the instance node for DRBD, NiFi and VIP.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211423470.XA CN115883547A (en) 2022-11-15 2022-11-15 High-availability NiFi deployment method and system based on DRBD

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211423470.XA CN115883547A (en) 2022-11-15 2022-11-15 High-availability NiFi deployment method and system based on DRBD

Publications (1)

Publication Number Publication Date
CN115883547A true CN115883547A (en) 2023-03-31

Family

ID=85759866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211423470.XA Pending CN115883547A (en) 2022-11-15 2022-11-15 High-availability NiFi deployment method and system based on DRBD

Country Status (1)

Country Link
CN (1) CN115883547A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942128A (en) * 2014-04-29 2014-07-23 浪潮电子信息产业股份有限公司 Double-computer reinforcing method for high-performance job scheduling management node
CN104679907A (en) * 2015-03-24 2015-06-03 新余兴邦信息产业有限公司 Realization method and system for high-availability and high-performance database cluster
CN110134518A (en) * 2019-05-21 2019-08-16 浪潮软件集团有限公司 A kind of method and system improving big data cluster multinode high application availability
CN110674080A (en) * 2019-09-23 2020-01-10 浪潮软件股份有限公司 Method and system for collecting large-data-volume non-structural files based on NiFi
CN112003716A (en) * 2019-12-12 2020-11-27 军事科学院系统工程研究院网络信息研究所 Data center dual-activity implementation method
CN114610545A (en) * 2022-03-22 2022-06-10 西安超越申泰信息科技有限公司 Method, system, device and medium for reducing single point of failure of private cloud computing
CN114697328A (en) * 2022-03-25 2022-07-01 浪潮云信息技术股份公司 Method and system for realizing NiFi high-availability cluster mode

Similar Documents

Publication Publication Date Title
CN111522628B (en) Kubernetes cluster building deployment method, framework and storage medium based on OpenStack
CN102325192B (en) Cloud computing implementation method and system
US9367410B2 (en) Failover mechanism in a distributed computing system
CN102402395B (en) Quorum disk-based non-interrupted operation method for high availability system
US8949828B2 (en) Single point, scalable data synchronization for management of a virtual input/output server cluster
US9880827B2 (en) Managing software version upgrades in a multiple computer system environment
CN108306955B (en) Large-scale interconnection clustering method for vehicle-mounted terminals
CN110134518B (en) Method and system for improving high availability of multi-node application of big data cluster
US20120179798A1 (en) Autonomous primary node election within a virtual input/output server cluster
US20050060608A1 (en) Maximizing processor utilization and minimizing network bandwidth requirements in throughput compute clusters
US9201747B2 (en) Real time database system
US20150149814A1 (en) Failure recovery resolution in transplanting high performance data intensive algorithms from cluster to cloud
CN111949444A (en) Data backup and recovery system and method based on distributed service cluster
CN105630589A (en) Distributed process scheduling system and process scheduling and execution method
CN105069152B (en) data processing method and device
CN107995043B (en) Application disaster recovery system based on hybrid cloud platform
US20200112499A1 (en) Multiple quorum witness
CN106850269A (en) A kind of management system of cloud platform
CN104077199A (en) Shared disk based high availability cluster isolation method and system
CN111541599B (en) Cluster software system and method based on data bus
CN104917827A (en) Method for realizing oracle load balancing cluster
CN114138568A (en) Scheduling method and system for client fault transfer in Redis sentinel mode
Li et al. Kubernetes-container-cluster-based architecture for an energy management system
CN112073499A (en) Dynamic service method of multi-machine type cloud physical server
CN115883547A (en) High-availability NiFi deployment method and system based on DRBD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination