CN114338370A - Highly available method, system, apparatus, electronic device and storage medium for Ambari - Google Patents

Publication number
CN114338370A
Authority
CN
China
Prior art keywords
central node
node
central
distributed coordinator
ambari
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210021964.9A
Other languages
Chinese (zh)
Inventor
张世龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202210021964.9A
Publication of CN114338370A
Legal status: Pending

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The application relates to a high-availability method, system, apparatus, electronic device, and storage medium for Ambari, applied to the technical field of service management. The method comprises: when an abnormality of the current main central node is detected, deleting the first device information written to the distributed coordinator by the current main central node, wherein the current main central node is one of at least two central nodes; sending a writable signal to at least one central node, so that the at least one central node writes its own second device information to the distributed coordinator according to the writable signal; determining, among the at least one central node, a target central node whose second device information is successfully written; and sending a switching signal to the target central node, so that the target central node switches its main/standby state from the standby state to the main state according to the switching signal.

Description

Highly available method, system, apparatus, electronic device and storage medium for Ambari
Technical Field
The present application relates to the field of service management technologies, and in particular, to a method, a system, an apparatus, an electronic device, and a storage medium for Ambari with high availability.
Background
Ambari is a Web-based tool that supports the provisioning, management, and monitoring of Apache Hadoop clusters.
Generally, Ambari is divided into two roles, a master and a slave. The master is the central node; it manages the slave nodes and controls the slaves to execute commands. The master usually runs in single-point mode, which has low fault tolerance and is a single point of failure: once the master fails, the project cannot run normally, and high availability is not supported.
In the related art, Ambari high availability schemes are mainly DNS-based cold-standby schemes. Specifically, when the master fails, operation and maintenance personnel modify the DNS configuration so that the slaves resolve the IP of the new master and connect to it. However, this approach requires manual intervention to recover from the failure, and the service remains unavailable until that intervention takes place.
Disclosure of Invention
The application provides a high-availability method, system, apparatus, electronic device, and storage medium for Ambari, which aim to solve the problems in the prior art that a fault can be recovered from only by manual intervention and that the service is unavailable until that intervention.
In a first aspect, an embodiment of the present application provides a high availability method of Ambari, applied to a distributed coordinator, where the distributed coordinator is linked with at least two central nodes, and the method includes:
when the abnormality of the current main central node is monitored, deleting first equipment information written in the distributed coordinator by the current main central node; the current main central node is one of the at least two central nodes;
sending a writable signal to at least one central node, so that the at least one central node writes second equipment information of the central node into the distributed coordinator according to the writable signal;
determining, among the at least one central node, a target central node whose second device information is successfully written;
and sending a switching signal to the target central node, so that the target central node switches its main/standby state from the standby state to the main state according to the switching signal.
Optionally, the monitoring that the current master center node is abnormal includes:
monitoring that the duration of the link abnormality with the current main central node exceeds a first preset duration.
Optionally, the first preset duration is longer than a second preset duration, and the second preset duration is longer than the heartbeat period of the link between the distributed coordinator and the current main central node.
Optionally, the distributed coordinator includes a master node and at least two first slave nodes; the determining of the target central node in which the second device information is successfully written in the central node includes:
after the second device information of the central node is written into the master node, the second device information is sequentially synchronized to the at least two first slave nodes through the master node;
and determining, as the target central node whose second device information is successfully written, the central node whose second device information is the first to be synchronized to a preset number of the first slave nodes.
Optionally, the preset value is half of the total number of the first slave nodes.
Optionally, the distributed coordinator is zookeeper or etcd.
In a second aspect, an embodiment of the present application provides a high availability method for Ambari, applied to a central node, including:
acquiring a writable signal sent by a distributed coordinator, wherein the writable signal is sent by the distributed coordinator after first equipment information written by a current main central node in a target node of the distributed coordinator is deleted;
writing its own second device information into the distributed coordinator according to the writable signal;
acquiring a switching signal sent by the distributed coordinator;
and switching its main/standby state from the standby state to the main state according to the switching signal.
Optionally, the method further includes:
acquiring an access request sent by a second slave node;
and if the main/standby state is the main state, responding to the access request and establishing a link with the second slave node.
In a third aspect, an embodiment of the present application provides a high availability system for Ambari, including a distributed coordinator and at least two central nodes, wherein the distributed coordinator is linked with the at least two central nodes;
the distributed coordinator is used for deleting first equipment information written by the current main central node in a target node of the distributed coordinator after monitoring that the current main central node is abnormal; and sending a writeable signal to at least one of said central nodes; the current main central node is one of the at least two central nodes;
the central node is used for sending second equipment information of the central node to the distributed coordinator according to the writable signal;
the distributed coordinator is further configured to determine a target central node, in the central nodes, into which the second device information is successfully written;
and the central node is also used for switching its main/standby state from the standby state to the main state according to the switching signal.
Optionally, the system further includes: at least one second slave node;
the second slave node is used for sending an access request to the central node;
the central node is further configured to acquire an access request sent by the second slave node; and when the master/standby state is the master state, responding to the access request and establishing a link with the second slave node.
In a fourth aspect, an embodiment of the present application provides a high availability apparatus for Ambari, including:
the deleting module is used for deleting first equipment information written in the distributed coordinator by the current main central node after monitoring that the current main central node is abnormal; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
a first sending module, configured to send a writable signal to at least one central node, so that the at least one central node writes second device information of itself into the distributed coordinator according to the writable signal;
a determining module, configured to determine a target central node, in the at least one central node, where the second device information is successfully written;
and the second sending module is used for sending a switching signal to the target central node so that the target central node switches the main and standby states from the standby state to the main state according to the switching signal.
In a fifth aspect, an embodiment of the present application provides a high availability apparatus for Ambari, including:
the first obtaining module is used for obtaining a writable signal sent by a distributed coordinator, wherein the writable signal is sent by the distributed coordinator after first equipment information written by the current main central node in a target node of the distributed coordinator is deleted; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
the writing module is used for writing the central node's own second device information into the distributed coordinator according to the writable signal;
a second obtaining module, configured to obtain a switching signal sent by the distributed coordinator;
and the switching module is used for switching the main/standby state from the standby state to the main state according to the switching signal.
In a sixth aspect, an embodiment of the present application provides an electronic device, including: the system comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory for storing a computer program;
the processor is configured to execute the program stored in the memory to implement the high-availability method of Ambari according to the first aspect or the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, which when executed by a processor implements the high availability method of Ambari according to the first or second aspect.
Compared with the prior art, the technical solution provided by the embodiments of the present application has the following advantages. In the method provided by the embodiments, after an abnormality of the current main central node is detected, the first device information written to the distributed coordinator by the current main central node is deleted, the current main central node being one of at least two central nodes; a writable signal is sent to at least one central node, so that the at least one central node writes its own second device information to the distributed coordinator according to the writable signal; a target central node whose second device information is successfully written is determined among the at least one central node; and a switching signal is sent to the target central node, so that the target central node switches its main/standby state from the standby state to the main state according to the switching signal. The central nodes are thus managed through the distributed coordinator: once the first device information of the abnormal main central node has been deleted, the other central nodes write their own second device information on receiving the writable signal, and the central node whose write succeeds is switched to the main state. A new main central node is therefore determined without manual intervention, which makes the scheme more convenient to operate.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a block diagram of a high availability system for Ambari as provided by an embodiment of the present application;
FIG. 2 is a flow chart of a highly available method of Ambari provided in an embodiment of the present application;
FIG. 3 is a flow chart of a highly available method of Ambari provided in another embodiment of the present application;
FIG. 4 is a block diagram of a high availability system for Ambari as provided in another embodiment of the present application;
FIG. 5 is a block diagram of a high availability apparatus for Ambari according to an embodiment of the present application;
FIG. 6 is a block diagram of a high availability apparatus for Ambari according to another embodiment of the present application;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Before the embodiments of the present invention are described in further detail, the terms and expressions used in them are explained; the following explanations apply to these terms and expressions.
High availability: HA (High Availability) is one of the factors that must be considered in the architecture design of a distributed system; it generally means reducing, by design, the time during which the system cannot provide service.
Ambari: a Web-based tool that supports the provisioning, management, and monitoring of Apache Hadoop clusters. Ambari supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog, among others.
zookeeper: an open-source coordination service for distributed applications.
LB: a load balancing service.
etcd: a distributed, highly available, strongly consistent key-value store implemented in the Go language and mainly used for shared configuration and service discovery. It is widely used in distributed system architectures, and its most classic use case is service discovery.
According to an embodiment of the present application, there is provided an Ambari high availability system, and fig. 1 is a schematic structural diagram of an alternative Ambari high availability system according to an embodiment of the present application, and as shown in fig. 1, the system may include: a distributed coordinator and at least two central nodes (masters). Wherein:
the distributed coordinator is used for deleting first equipment information written by the current main central node in a target node of the distributed coordinator after monitoring that the current main central node is abnormal; and sending a writeable signal to at least one central node;
the central node is used for sending own second equipment information to the distributed coordinator according to the writable signal;
the distributed coordinator is also used for determining a target central node in the central nodes, wherein the target central node is successfully written with the second equipment information;
and the central node is also used for switching the main state and the standby state from the standby state to the main state according to the switching signal.
Optionally, the Ambari high availability system further comprises: at least one second slave node (slave);
the second slave node is used for sending an access request to the central node;
the central node is also used for acquiring an access request sent by the second slave node; and when the main/standby state is the main state, responding to the access request and establishing a link with the second slave node.
An embodiment of the present application provides a high availability method for Ambari, which may be applied to any form of electronic device, such as a terminal or a server. As shown in fig. 2, the method is applied to a distributed coordinator that is linked with at least two central nodes, where the main/standby state of one of the central nodes is the main state and the main/standby states of the other central nodes are the standby state. The distributed coordinator may take various forms; for example, it may be, but is not limited to, zookeeper or etcd.
Specifically, the high availability method for Ambari includes:
step 201, after monitoring that the current master center node is abnormal, deleting the first device information written in the distributed coordinator by the current master center node.
In some embodiments, the current main central node is one of the at least two central nodes, namely the central node (master) that most recently responded to an access request from a second slave node (slave). Its main/standby state is the main state. During normal operation of the service, the current main central node maintains its link with the distributed coordinator through a heartbeat mechanism. An abnormality of the current main central node may be an abnormality of the link between the current main central node and the distributed coordinator.
In an optional embodiment, the monitoring that the current master center node is abnormal may specifically include:
monitoring that the duration of the link abnormality with the current main central node exceeds a first preset time length.
By monitoring the link condition of the distributed coordinator and the current main central node, when the link abnormal time length of the distributed coordinator and the current main central node exceeds a first preset time length, the abnormality of the current main central node is determined.
It should be noted that once the link between the current main central node and the distributed coordinator has been abnormal for a second preset time length or longer, the current main central node switches its main/standby state from the main state to the standby state. The second preset time length is greater than the heartbeat period between the distributed coordinator and the current main central node.
Further, in order to avoid the subsequent occurrence of two main central nodes, the first preset time length is set to be longer than the second preset time length. Therefore, after the current main central node is switched to the standby state (namely all the central nodes are in the standby state), the first device information written by the current main central node is deleted, so that the central nodes in the standby state can write own second device information into the distributed coordinator.
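The timing relationship described above (heartbeat period < second preset time length T2 < first preset time length T1) can be sketched as a small classifier over the duration of the link abnormality. This is only an illustrative sketch: the names `Phase`, `validate_timing` and `phase`, and the intermediate `SUSPECT` phase, are assumptions made for the example and do not appear in the application.

```python
from enum import Enum


class Phase(Enum):
    HEALTHY = "healthy"                # link considered normal
    SUSPECT = "suspect"                # heartbeats missed, no action yet (assumed phase)
    MASTER_DEMOTED = "master_demoted"  # T2 reached: main node steps down to standby
    RECORD_DELETED = "record_deleted"  # T1 exceeded: coordinator deletes the first device information


def validate_timing(heartbeat: float, t2: float, t1: float) -> None:
    """Raise ValueError unless 0 < heartbeat < T2 < T1."""
    if not (0 < heartbeat < t2 < t1):
        raise ValueError("expected 0 < heartbeat < T2 < T1")


def phase(outage: float, heartbeat: float, t2: float, t1: float) -> Phase:
    """Classify a link abnormality that has lasted `outage` time units."""
    validate_timing(heartbeat, t2, t1)
    if outage > t1:       # "exceeds a first preset time length"
        return Phase.RECORD_DELETED
    if outage >= t2:      # "greater than or equal to a second preset time length"
        return Phase.MASTER_DEMOTED
    if outage > heartbeat:
        return Phase.SUSPECT
    return Phase.HEALTHY
```

Because T2 < T1, every outage that reaches `RECORD_DELETED` has already passed through `MASTER_DEMOTED`, which is the ordering the text relies on to avoid two main central nodes.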
The first device information and the second device information may be, but not limited to, IP (Internet Protocol) of a central node.
Step 202, sending a writable signal to at least one central node, so that the at least one central node writes its own second device information to the distributed coordinator according to the writable signal.
In some embodiments, the distributed coordinator sends a writable signal to at least one central node after deleting the first device information, so as to notify the central node in the system of the standby state that the device information can be written. And after receiving the writable signal, the central node writes own second device information into the distributed coordinator.
The device information may be preset to be written into a specific node in the distributed coordinator, for example /active/master. Every central node writes its device information to this node.
It can be understood that, after the current master central node switches the master/standby state to the standby state, if the current master central node establishes a link with the distributed coordinator again, the current master central node may also write its own device information into the distributed coordinator.
Step 203, determining a target central node in the at least one central node, into which the second device information is successfully written.
In some embodiments, since all the central nodes linked to the distributed coordinator in the system write their own device information, a "first come first served" mechanism is adopted in the present application, that is, the central node that preferentially writes its own device information is the main central node.
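The "first come first served" rule can be illustrated with a create-if-absent write: the first central node to create the node wins, and later writers are rejected. The sketch below uses an in-memory stand-in for the distributed coordinator; `InMemoryCoordinator`, `NodeExists` and `try_become_main` are hypothetical names for this example, not an API of zookeeper or etcd.

```python
import threading
from typing import Dict, Optional


class NodeExists(Exception):
    """Raised when the node was already written by another central node."""


class InMemoryCoordinator:
    """Hypothetical in-memory stand-in for the distributed coordinator."""

    def __init__(self) -> None:
        self._nodes: Dict[str, str] = {}
        self._lock = threading.Lock()

    def create(self, path: str, data: str) -> None:
        # Create-if-absent under a lock, so exactly one writer can succeed.
        with self._lock:
            if path in self._nodes:
                raise NodeExists(path)
            self._nodes[path] = data

    def get(self, path: str) -> Optional[str]:
        return self._nodes.get(path)

    def delete(self, path: str) -> None:
        self._nodes.pop(path, None)


def try_become_main(coordinator: InMemoryCoordinator, device_info: str,
                    path: str = "/active/master") -> bool:
    """Return True if this central node's device information was written first."""
    try:
        coordinator.create(path, device_info)
    except NodeExists:
        return False
    return True
```

With this shape, a second central node calling `try_become_main` while the node still exists simply gets `False` and stays in the standby state.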
When the distributed coordinator includes a master node and at least two first slave nodes, in an optional embodiment, determining the target central node whose second device information is successfully written includes:
after the second device information of a central node is written into the master node, synchronizing the second device information in sequence from the master node to the at least two first slave nodes; and determining, as the target central node, the central node whose second device information is the first to be synchronized to a preset number of first slave nodes.
In some embodiments, a central node writes its second device information into the master node of the distributed coordinator, and the master node synchronizes that information to the first slave nodes. Because there are multiple first slave nodes, the master node synchronizes the information to them one after another. The central node whose second device information is the first to have been synchronized to the preset number of first slave nodes is determined as the target central node.
The preset value may be set according to actual conditions, for example to a value smaller than the total number of first slave nodes. Preferably, it may be set to, but is not limited to, half of the total number.
Preferably, to prevent the second device information of two central nodes from reaching the preset value at the same time during synchronization, the total number of first slave nodes is set to an odd number (2N+1, where N is a positive integer), and the preset value is set to N+1.
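The N+1 out of 2N+1 rule can be sketched as follows. The event format (a stream of `(central_node, slave_id)` synchronization acknowledgements in arrival order) and the function names are assumptions made for illustration; the application does not specify how acknowledgements are represented.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def quorum(total_slaves: int) -> int:
    """Return N + 1 for a total of 2N + 1 first slave nodes (N >= 1)."""
    if total_slaves < 3 or total_slaves % 2 == 0:
        raise ValueError("total_slaves must be odd and at least 3")
    return total_slaves // 2 + 1


def first_to_reach_quorum(sync_events: Iterable[Tuple[str, str]],
                          total_slaves: int) -> Optional[str]:
    """Return the central node whose information first reaches the preset value.

    sync_events yields (central_node, slave_id) pairs in the order in which
    the first slave nodes finish synchronizing that central node's information.
    """
    needed = quorum(total_slaves)
    acks: Dict[str, Set[str]] = {}
    for central_node, slave_id in sync_events:
        synced = acks.setdefault(central_node, set())
        synced.add(slave_id)  # a set ignores duplicate acknowledgements
        if len(synced) >= needed:
            return central_node
    return None  # nobody has reached the preset value yet
```

Since N+1 is a strict majority of 2N+1, two central nodes cannot both reach the preset value as long as each first slave node holds the information of only one of them.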
And step 204, sending a switching signal to the target central node, so that the target central node switches the main/standby state from the standby state to the main state according to the switching signal.
In some embodiments, after determining the target central node, the distributed coordinator sends a switching signal to the target central node, so that the target central node switches the active/standby state from the standby state to the main state according to the switching signal, thereby recovering the service.
With this Ambari high availability method, no manual intervention is needed after the current main central node becomes abnormal. Once the link between the current main central node and the distributed coordinator has been abnormal for the second preset time length, the current main central node switches its main/standby state from the main state to the standby state; once the abnormality has lasted longer than the first preset time length, the distributed coordinator deletes the first device information written by the current main central node and sends a writable signal to the central nodes in the system. After receiving the writable signal, each central node writes its own second device information to the distributed coordinator. The distributed coordinator determines the target central node from the written second device information and sends it a switching signal, and the target central node switches its main/standby state to the main state according to the switching signal, thereby restoring the service.
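Steps 201 to 204 can be put together in a minimal single-process simulation. Everything here is a simplified assumption for illustration: the classes `CentralNode` and `Coordinator`, the `reachable` flag, and the direct method calls standing in for the writable and switching signals are not taken from the application.

```python
from typing import List, Optional


class CentralNode:
    def __init__(self, name: str, ip: str) -> None:
        self.name = name
        self.ip = ip
        self.reachable = True    # whether the link to the coordinator works
        self.state = "standby"   # main/standby state

    def on_writable(self, coordinator: "Coordinator") -> None:
        # On the writable signal, try to write this node's device information.
        if self.reachable:
            coordinator.write(self)

    def on_switch(self) -> None:
        # On the switching signal, go from the standby state to the main state.
        self.state = "main"


class Coordinator:
    def __init__(self, nodes: List[CentralNode]) -> None:
        self.nodes = nodes
        self.record: Optional[str] = None          # stored device information
        self._writer: Optional[CentralNode] = None

    def write(self, node: CentralNode) -> bool:
        if self.record is not None:                # first come, first served
            return False
        self.record, self._writer = node.ip, node
        return True

    def fail_over(self) -> Optional[CentralNode]:
        self.record, self._writer = None, None     # step 201: delete old information
        for node in self.nodes:                    # step 202: send the writable signal
            node.on_writable(self)
        target = self._writer                      # step 203: successful writer
        if target is not None:
            target.on_switch()                     # step 204: send the switching signal
        return target
```

A usage sketch: register `master1` as main, mark it unreachable and demoted (as it would be after T2), then call `fail_over()`; the first reachable central node in the list becomes the new main central node.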
In an embodiment of the present application, another highly available method of Ambari is provided, and specific implementation of the method may be described in the above description of the method embodiment, and repeated details are not repeated. The method can be applied to any form of electronic equipment, such as a terminal and a server. As shown in fig. 3, the high availability method of Ambari, applied to the central node, includes:
step 301, a writable signal sent by the distributed coordinator is obtained, where the writable signal is sent by the distributed coordinator after the first device information written by the current master central node in the target node of the distributed coordinator is deleted.
And step 302, writing own second device information into the distributed coordinator according to the writable signal.
And step 303, acquiring a switching signal sent by the distributed coordinator.
And step 304, switching the main/standby state from the standby state to the main state according to the switching signal.
In an alternative embodiment, the highly available method of Ambari further comprises:
acquiring an access request sent by a second slave node; and if the main/standby state is the main state, responding to the access request and establishing a link with the second slave node.
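The behaviour of a central node in steps 301 to 304, together with the handling of access requests, can be sketched as a small state holder. The class name `CentralNodeState`, the `write` callback and the dictionary-shaped response are hypothetical choices for this example only.

```python
from typing import Callable, Dict, Set


class CentralNodeState:
    """Hypothetical sketch of a central node's main/standby handling."""

    def __init__(self, device_info: str) -> None:
        self.device_info = device_info
        self.state = "standby"              # central nodes start in the standby state
        self.linked_slaves: Set[str] = set()

    def on_writable(self, write: Callable[[str], bool]) -> bool:
        # Steps 301-302: on the writable signal, write own device information.
        # `write` stands in for the call to the distributed coordinator.
        return write(self.device_info)

    def on_switch(self) -> None:
        # Steps 303-304: on the switching signal, switch standby -> main.
        self.state = "main"

    def handle_access(self, slave_id: str) -> Dict[str, object]:
        # Only a node in the main state accepts requests and links with the slave.
        if self.state != "main":
            return {"accepted": False, "reason": "standby"}
        self.linked_slaves.add(slave_id)
        return {"accepted": True}
```

Returning an explicit "standby" answer, rather than staying silent, lets a second slave node move on to the next configured master without waiting for a timeout.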
In some embodiments, the second slave node is a slave node corresponding to the central nodes. Each second slave node (slave) is configured with the addresses of multiple masters, for example of all masters.
The slave accesses the configured masters according to a preset access policy. If a master responds to the request normally, the slave regards that master as the main central node and continues to access it afterwards. If a master does not respond normally, the slave goes on to try the other masters. A master can fail to respond normally in several ways: for example, it does not respond at all, or it responds but reports that it is not in the main state.
The access policy may be sequential access in the configured order of the masters, or random access.
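The slave-side access policy can be sketched as a loop over the configured master addresses. The function `find_main` and the `probe` callback are assumptions for illustration: `probe(address)` is taken to return True when the master answers and reports the main state, to return False when it reports the standby state, and to raise `OSError` (for example `ConnectionError`) when it does not respond.

```python
import random
from typing import Callable, Optional, Sequence


def find_main(masters: Sequence[str],
              probe: Callable[[str], bool],
              policy: str = "sequential",
              rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the address of the master that responds as main, or None."""
    order = list(masters)
    if policy == "random":
        (rng or random).shuffle(order)   # random access policy
    elif policy != "sequential":
        raise ValueError(f"unknown access policy: {policy}")
    for address in order:
        try:
            if probe(address):           # responded normally and is in the main state
                return address
        except OSError:
            continue                     # no response: try the next master
    return None                          # no main central node available right now
```

A slave would typically remember the returned address and keep using it until a request fails, then call `find_main` again.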
In an embodiment of the present application, referring to fig. 4, every master starts in the standby state. When a master, say master1, successfully writes its own device information to a node (/active/master) of zookeeper, it is determined to be the main central node and switches its main/standby state to the main state. Once master1 is in the main state, it can respond to slave requests normally. The slave accesses all configured masters according to a random access policy. If a master responds to the request normally, the slave regards that master as the main central node and continues to access it afterwards. Otherwise, the master tells the slave that it is a standby central node and does not accept the request. After master1 becomes abnormal and the link abnormality lasts longer than a second preset time length T2, master1 switches its main/standby state to the standby state. At this point all masters are in the standby state, so the problem of several masters being the main central node at once does not arise. During this period no master is available and the service is temporarily unavailable, but it recovers automatically within a very short time.
The device information written to the zookeeper node is temporary information: after the link abnormality lasts longer than a first preset time length T1 (T2 < T1), zookeeper deletes it. The other masters can then try to write their own information to the zookeeper node, and the master whose write succeeds becomes the main central node.
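The lifetime of the temporary information can be modelled with a simplified in-memory store in which a node disappears once its writer has been silent for longer than T1. This is a toy model under stated assumptions, not the zookeeper implementation: the class `EphemeralStore`, its methods and the explicit `now` timestamps are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class EphemeralStore:
    """Toy model: a node is deleted when its owner is silent for more than T1."""

    def __init__(self, t1: float) -> None:
        self.t1 = t1
        self._last_seen: Dict[str, float] = {}        # session -> last heartbeat time
        self._nodes: Dict[str, Tuple[str, str]] = {}  # path -> (session, data)

    def heartbeat(self, session: str, now: float) -> None:
        self._last_seen[session] = now

    def expire(self, now: float) -> None:
        # Drop every node whose owning session has been silent for longer than T1.
        dead = {s for s, seen in self._last_seen.items() if now - seen > self.t1}
        for path in [p for p, (s, _) in self._nodes.items() if s in dead]:
            del self._nodes[path]
        for session in dead:
            del self._last_seen[session]

    def create(self, path: str, data: str, session: str, now: float) -> bool:
        self.expire(now)
        if path in self._nodes:
            return False                               # another master holds the node
        self.heartbeat(session, now)
        self._nodes[path] = (session, data)
        return True

    def get(self, path: str, now: float) -> Optional[str]:
        self.expire(now)
        entry = self._nodes.get(path)
        return entry[1] if entry else None
```

In this model a standby master that retries `create` periodically takes over automatically at the first attempt made after the old information has expired.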
Based on the same concept, the embodiment of the present application provides a high-availability apparatus for Ambari, and the specific implementation of the apparatus may refer to the description of the method embodiment, and repeated details are not repeated, as shown in fig. 5, the apparatus mainly includes:
a deleting module 501, configured to delete, after it is monitored that a current master center node is abnormal, first device information written in the distributed coordinator by the current master center node; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
a first sending module 502, configured to send a writable signal to at least one central node, so that the at least one central node writes second device information of itself into the distributed coordinator according to the writable signal;
a determining module 503, configured to determine a target central node, in which the second device information is successfully written, in the at least one central node;
a second sending module 504, configured to send a switching signal to the target central node, so that the target central node switches the active/standby state from the standby state to the main state according to the switching signal.
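The four modules above correspond to one failover pass on the coordinator side, which might be sketched as follows. This is an in-memory illustration under stated assumptions: `DistributedCoordinator`, `CentralNode`, `fail_over`, `on_writable` and `on_switch` are hypothetical names, signals are modelled as direct method calls, and the failed node's own step-down after T2 is not modelled.

```python
class CentralNode:
    """Hypothetical central node reacting to coordinator signals."""

    def __init__(self, name):
        self.name = name
        self.state = "standby"
        self.coordinator = None

    def on_writable(self):
        # Try to write this node's own (second) device information.
        self.coordinator.write(self.name, {"host": self.name})

    def on_switch(self):
        self.state = "main"


class DistributedCoordinator:
    """Hypothetical coordinator holding at most one device-information record."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}
        for node in nodes:
            node.coordinator = self
        self.device_info = None          # (node name, info) or None

    def write(self, name, info):
        # The first write after deletion succeeds; later ones are refused.
        if self.device_info is None:
            self.device_info = (name, info)
            return True
        return False

    def fail_over(self, failed_name):
        # Deleting module: remove the first device information.
        if self.device_info is not None and self.device_info[0] == failed_name:
            self.device_info = None
        # First sending module: writable signal to the other central nodes.
        for name, node in self.nodes.items():
            if name != failed_name:
                node.on_writable()
        # Determining module: the node whose write succeeded is the target.
        if self.device_info is None:
            return None
        target = self.nodes[self.device_info[0]]
        # Second sending module: switching signal to the target only.
        target.on_switch()
        return target.name
```

Only the node whose write succeeded receives the switching signal, so at most one central node leaves the standby state per pass.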
Based on the same concept, an embodiment of the present application provides a high-availability apparatus for Ambari. For the specific implementation of the apparatus, reference may be made to the description of the method embodiment; details already covered there are not repeated. As shown in fig. 6, the apparatus mainly includes:
a first obtaining module 601, configured to obtain a writable signal sent by a distributed coordinator, where the writable signal is sent by the distributed coordinator after deleting first device information written by a current main central node in a target node of the distributed coordinator; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
a writing module 602, configured to write second device information of itself to the distributed coordinator according to the writable signal;
a second obtaining module 603, configured to obtain a switching signal sent by the distributed coordinator;
a switching module 604, configured to switch the active/standby state from the standby state to the main state according to the switching signal.
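Seen from a single central node, modules 601 to 604 amount to two signal handlers plus a role flag. The sketch below is only an illustration: `CentralNodeAgent`, `Role`, `write_fn` and `handle_access` are hypothetical names, and `write_fn` stands for whatever call actually writes to the distributed coordinator and reports success.

```python
from enum import Enum


class Role(Enum):
    STANDBY = "standby"
    MAIN = "main"


class CentralNodeAgent:
    """Hypothetical central-node side of the failover exchange."""

    def __init__(self, device_info, write_fn):
        self.device_info = device_info   # this node's own device information
        self.write_fn = write_fn         # assumed: returns True if the write succeeded
        self.role = Role.STANDBY         # a central node starts as standby
        self.wrote = False

    def on_writable_signal(self):
        # Obtaining + writing modules: react to the writable signal by
        # attempting to write the second device information.
        self.wrote = bool(self.write_fn(self.device_info))
        return self.wrote

    def on_switch_signal(self):
        # Obtaining + switching modules: standby -> main.
        self.role = Role.MAIN

    def handle_access(self, request):
        # Only a node in the main state responds to slave access requests.
        if self.role is Role.MAIN:
            return ("accepted", request)
        return ("rejected", "standby")
```

Passing the write operation in as a callable keeps the sketch independent of any particular coordinator client.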
Based on the same concept, an embodiment of the present application further provides an electronic device, as shown in fig. 7, the electronic device mainly includes: a processor 701, a memory 702, and a communication bus 703, wherein the processor 701 and the memory 702 communicate with each other via the communication bus 703. The memory 702 stores a program executable by the processor 701, and the processor 701 executes the program stored in the memory 702 to implement the following steps:
when the abnormality of the current main central node is monitored, deleting first equipment information written in the distributed coordinator by the current main central node, wherein the current main central node is one of at least two central nodes;
sending a writable signal to at least one central node, so that the at least one central node writes own second equipment information into the distributed coordinator according to the writable signal;
determining a target central node in which the second equipment information is successfully written in at least one central node;
and sending a switching signal to the target central node so that the target central node switches its active/standby state from the standby state to the main state according to the switching signal; or, alternatively, the following steps:
acquiring a writable signal sent by a distributed coordinator, wherein the writable signal is sent by the distributed coordinator after deleting first equipment information written by a current main central node in a target node of the distributed coordinator;
writing own second equipment information into the distributed coordinator according to the writable signal;
acquiring a switching signal sent by a distributed coordinator;
and switching the active/standby state from the standby state to the main state according to the switching signal.
The communication bus 703 mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 703 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.
The memory 702 may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Alternatively, the memory may be at least one memory device located remotely from the processor 701.
The Processor 701 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like, or may be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic devices, discrete gates or transistor logic devices, and discrete hardware components.
In yet another embodiment of the present application, there is also provided a computer-readable storage medium having stored therein a computer program which, when run on a computer, causes the computer to perform the high availability method of Ambari described in the above embodiment.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes, etc.), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. A highly available method of Ambari, applied to a distributed coordinator linked to at least two central nodes, the method comprising:
when the abnormality of the current main central node is monitored, deleting first equipment information written in the distributed coordinator by the current main central node; the current main central node is one of the at least two central nodes;
sending a writable signal to at least one central node, so that the at least one central node writes second equipment information of the central node into the distributed coordinator according to the writable signal;
determining a target central node in the at least one central node, into which the second device information is successfully written;
and sending a switching signal to the target central node so that the target central node switches the main and standby states from the standby state to the main state according to the switching signal.
2. The Ambari high availability method of claim 1, wherein monitoring that the current main central node is abnormal comprises:
monitoring that the duration of the link abnormality with the current main central node exceeds a first preset duration.
3. The Ambari high availability method according to claim 2, wherein the first preset duration is greater than a second preset duration, the second preset duration being greater than a heartbeat cycle of the distributed coordinator linked with the current master central node.
4. The high availability method of Ambari according to claim 1, characterized in that the distributed coordinator comprises a master node and at least two first slave nodes; the determining a target central node, in the at least one central node, to which the second device information is successfully written, includes:
after the second device information of the central node is written into the master node, the second device information is sequentially synchronized to the at least two first slave nodes through the master node;
and determining the central node for which the number of synchronized first slave nodes is the first to reach a preset value as the target central node into which the second device information is successfully written.
5. The high availability method of Ambari according to claim 4, characterized in that the preset value is half of the total number of the first slave nodes.
6. A highly available method of Ambari, applied to a central node, comprising:
acquiring a writable signal sent by a distributed coordinator, wherein the writable signal is sent by the distributed coordinator after first equipment information written by a current main central node in a target node of the distributed coordinator is deleted;
writing its own second device information into the distributed coordinator according to the writable signal;
acquiring a switching signal sent by the distributed coordinator;
and switching the active/standby state from the standby state to the main state according to the switching signal.
7. The highly available method of Ambari according to claim 6, further comprising:
acquiring an access request sent by a second slave node;
and if the active/standby state is the main state, responding to the access request and establishing a link with the second slave node.
8. A highly available system of Ambari, comprising: a distributed coordinator and at least two central nodes, wherein the distributed coordinator is linked with the at least two central nodes;
the distributed coordinator is used for deleting first equipment information written by the current main central node in a target node of the distributed coordinator after monitoring that the current main central node is abnormal; and sending a writeable signal to at least one of said central nodes; the current main central node is one of the at least two central nodes;
the central node is used for sending second equipment information of the central node to the distributed coordinator according to the writable signal;
the distributed coordinator is further configured to determine a target central node, in the central nodes, into which the second device information is successfully written;
and the central node is further configured to switch its active/standby state from the standby state to the main state according to the switching signal.
9. The Ambari high availability system of claim 8, further comprising: at least one second slave node;
the second slave node is used for sending an access request to the central node;
the central node is further configured to acquire an access request sent by the second slave node; and when the active/standby state is the main state, responding to the access request and establishing a link with the second slave node.
10. A highly available apparatus for Ambari, comprising:
the deleting module is used for deleting first equipment information written in the distributed coordinator by the current main central node after monitoring that the current main central node is abnormal; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
a first sending module, configured to send a writable signal to at least one central node, so that the at least one central node writes second device information of itself into the distributed coordinator according to the writable signal;
a determining module, configured to determine a target central node, in the at least one central node, where the second device information is successfully written;
and the second sending module is used for sending a switching signal to the target central node so that the target central node switches the main and standby states from the standby state to the main state according to the switching signal.
11. A highly available apparatus for Ambari, comprising:
the first obtaining module is used for obtaining a writable signal sent by a distributed coordinator, wherein the writable signal is sent by the distributed coordinator after first equipment information written by the current main central node in a target node of the distributed coordinator is deleted; the distributed coordinator is linked with at least two central nodes, and the current main central node is one of the at least two central nodes;
the writing module is used for writing its own second device information into the distributed coordinator according to the writable signal;
a second obtaining module, configured to obtain a switching signal sent by the distributed coordinator;
and the switching module is used for switching the active/standby state from the standby state to the main state according to the switching signal.
12. An electronic device, comprising: a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory for storing a computer program;
the processor, executing a program stored in the memory, implementing the Ambari high availability method of any of claims 1-5 or 6-7.
13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the Ambari high availability method of any of claims 1-5 or 6-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210021964.9A CN114338370A (en) 2022-01-10 2022-01-10 Highly available method, system, apparatus, electronic device and storage medium for Ambari

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210021964.9A CN114338370A (en) 2022-01-10 2022-01-10 Highly available method, system, apparatus, electronic device and storage medium for Ambari

Publications (1)

Publication Number Publication Date
CN114338370A true CN114338370A (en) 2022-04-12

Family

ID=81026378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210021964.9A Pending CN114338370A (en) 2022-01-10 2022-01-10 Highly available method, system, apparatus, electronic device and storage medium for Ambari

Country Status (1)

Country Link
CN (1) CN114338370A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination