WO2018036148A1 - Server cluster system - Google Patents

Server cluster system

Info

Publication number
WO2018036148A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
service
subsystem
master node
election
Prior art date
Application number
PCT/CN2017/077631
Other languages
English (en)
Chinese (zh)
Inventor
周光明
李岩
张丛喆
Original Assignee
东方网力科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东方网力科技股份有限公司
Publication of WO2018036148A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00: Routing or path finding of packets in data switching networks
    • H04L45/46: Cluster building

Definitions

  • The present invention relates to the field of computer technology, and in particular to a server cluster system.
  • A service cluster is a group of computer systems that together provide a set of network resources to users as a single whole. The individual computer systems are the computing nodes of the service cluster.
  • A service cluster usually designates master and slave nodes according to certain policies, after which the master and slave nodes each carry out their respective tasks.
  • The master node monitors the working state of the slave nodes.
  • The master node of a service cluster can become a single point of failure during operation.
  • To guard against this, several computers that can act as the master node are usually provisioned in the server cluster, and each of them is given its own service address for external network connections.
  • After the master node is replaced, the user of the server cluster system must change the service address configured in each slave node to the service address preset in the replacement master node, so that the slave nodes can connect to the new master node and the service cluster can continue to perform its function.
  • Much content therefore has to be modified, which makes the service cluster harder to maintain and reduces its availability.
  • An object of embodiments of the present invention is to provide a server cluster system with improved availability.
  • An embodiment of the present invention provides a server cluster system including computing nodes that exchange data with one another, the computing nodes comprising one master node and multiple slave nodes.
  • The master node is configured to set a preset service address and run preset service software; when a fault occurs, to elect a new master node together with the multiple slave nodes; and, when a new master node has been elected, to delete the preset service address it had set.
  • The multiple slave nodes are configured to monitor the working state of the master node and, when the master node fails, to elect a new master node together with the master node.
  • Each of the computing nodes includes an election subsystem, a configuration subsystem, and a guard subsystem that exchange data with one another.
  • The election subsystem is configured, when the master node fails, to obtain an election address from the configuration subsystem, perform an election operation according to the obtained election address, and elect a new master node from the multiple computing nodes.
  • The configuration subsystem is configured to store the preset election address, the preset service address, and the configuration information for running the service software.
  • The guard subsystem is configured to monitor the election subsystem so as to switch the computing node from master node to slave node or from slave node to master node.
  • The embodiment of the present invention provides a second possible implementation of the first aspect, wherein, when the computing node is the master node, the election subsystem includes:
  • a service address processing module configured to obtain the preset service address from the configuration subsystem and set it;
  • a result broadcast module configured to broadcast the identifier of the computing node, as the election result, to the other computing nodes, so that the other computing nodes obtain the election result;
  • a sending module configured to send master node switching information to the guard subsystem.
  • The service address processing module, the result broadcast module, and the sending module may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) when performing processing.
  • The embodiment of the present invention provides a third possible implementation of the first aspect, wherein, when the computing node is the master node, the configuration subsystem includes:
  • a service address processing module configured to return the preset service address to the election subsystem when the election subsystem requests it;
  • a configuration information sending module configured to send the preset configuration information for running the service software to the guard subsystem.
  • The service address processing module and the configuration information sending module may be implemented by a CPU, a DSP, or an FPGA when performing processing.
  • The embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein, when the computing node is the master node, the guard subsystem includes:
  • a state switching module configured to switch the state of the computing node from slave node to master node according to the master node switching information sent by the election subsystem;
  • a service software running module configured to obtain and run the service software according to the configuration information for running the service software sent by the configuration subsystem;
  • a configuration information synchronization module configured to synchronize the configuration information of the service software to the slave nodes after the service software is running;
  • a service software monitoring module configured to monitor the running service software and to restart the master node when the service software fails.
  • The state switching module, the service software running module, the configuration information synchronization module, and the service software monitoring module may be implemented by a CPU, a DSP, or an FPGA when performing processing.
  • The embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein, when the computing node is a slave node, the election subsystem includes:
  • a service address judging module configured to determine, on receiving the election result sent by the master node, whether the preset service address is set in the slave node;
  • a deleting module configured to delete the preset service address set in the slave node if the result obtained by the service address judging module is YES.
  • The service address judging module and the deleting module may be implemented by a CPU, a DSP, or an FPGA when performing processing.
  • The embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein, when the computing node is a slave node, the configuration subsystem includes:
  • a copy obtaining module configured to obtain the configuration information of the service software synchronized by the master node;
  • a copy storage module configured to store the obtained configuration information of the service software.
  • The copy obtaining module and the copy storage module may be implemented by a CPU, a DSP, or an FPGA when performing processing.
  • The embodiment of the present invention provides a seventh possible implementation of the first aspect, wherein, when the computing node is a slave node, the guard subsystem includes:
  • a software running judging module configured to determine whether the service software is running;
  • a shutdown module configured to shut down the running service software if the result obtained by the software running judging module is YES.
  • The software running judging module and the shutdown module may be implemented by a CPU, a DSP, or an FPGA when performing processing.
  • An embodiment of the present invention provides an eighth possible implementation of the first aspect, wherein the server cluster system includes a universal interface, and each of the computing nodes in the server cluster system obtains, through the universal interface, the configuration information of the service software synchronized by the master node.
  • An embodiment of the present invention provides a ninth possible implementation of the first aspect, wherein the configuration subsystems in the respective computing nodes together constitute a distributed storage system.
  • A server cluster system provided by an embodiment of the present invention includes computing nodes that exchange data with one another, the computing nodes comprising one master node and multiple slave nodes. The master node is configured to set a preset service address and run preset service software; when a fault occurs, to elect a new master node together with the multiple slave nodes; and, when a new master node has been elected, to delete the preset service address it had set.
  • The multiple slave nodes are configured to monitor the working state of the master node and, when the master node fails, to elect a new master node together with the master node.
  • The preset service address is set in the computing node acting as the master node, and after the master node is replaced, the preset service address set by the former master node is deleted. In the prior art, by contrast, the service address of each slave node must be changed to the service address preset in the replacement master node before the slave node can connect to the new master node.
  • Here the service address stays unchanged: even after the master node is replaced, the slave nodes can still connect to the replacement master node through the preset service address, and the master and slave nodes can continue to perform the function of the service cluster without any modification to the server cluster system. The service address in the slave nodes never has to be modified.
  • This reduces the maintenance difficulty of the server cluster system and improves its availability. The service can be brought up on any computing node in the server cluster system, which avoids the single-point-of-failure problem, and the computing node that brings up the service sets the preset service address, giving the server cluster a unified external service address.
  • FIG. 1 is a schematic structural diagram of any computing node in a server cluster system according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a server cluster system before replacing a primary node in a server cluster system according to an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a server cluster system after replacing a primary node in a server cluster system according to an embodiment of the present invention.
  • A server cluster system includes computing nodes that exchange data with one another, the computing nodes comprising one master node and multiple slave nodes.
  • The master node is configured to set a preset service address and run preset service software; when a fault occurs, to elect a new master node together with the multiple slave nodes; and, when a new master node has been elected, to delete the preset service address it had set.
  • The multiple slave nodes are configured to monitor the working state of the master node and, when the master node fails, to elect a new master node together with the master node.
  • The preset service address is a preset fixed IP address that is stored in advance in every computing node. Only the computing node acting as the master node can read and set the preset service address; the slave nodes cannot.
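  • The text does not say how the preset service address is set or deleted on a node. As a minimal sketch, assuming a Linux node managed with the iproute2 `ip` tool, the two operations could be expressed as the commands below. The function name, the address 192.0.2.10, the prefix length, and the interface name are illustrative assumptions, and the commands are only built here, not executed.

```python
import ipaddress


def service_address_commands(address, prefix_len, interface):
    """Build the iproute2 commands that set and delete a service address.

    Returns (set_cmd, delete_cmd) as argument lists; nothing is executed.
    """
    ipaddress.ip_address(address)  # raises ValueError for a malformed address
    cidr = f"{address}/{prefix_len}"
    set_cmd = ["ip", "addr", "add", cidr, "dev", interface]
    delete_cmd = ["ip", "addr", "del", cidr, "dev", interface]
    return set_cmd, delete_cmd


# Hypothetical values: 192.0.2.10 is a documentation-only address (RFC 5737).
set_cmd, delete_cmd = service_address_commands("192.0.2.10", 24, "eth0")
print(" ".join(set_cmd))     # what a newly elected master node would run
print(" ".join(delete_cmd))  # what a former master node would run
```

  • Because every node stores the same preset address, the address itself never changes; only the node that currently holds it does.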
  • The service software is configured to implement the functions provided by the server cluster system.
  • The master and slave nodes in the server cluster system can interact with the external network through the preset service address set by the master node, so that the service software performs the functions it is required to complete.
  • When a slave node finds that it cannot currently connect to the master node, it determines that the master node has failed, sends master node fault information to the other computing nodes in the server cluster system, and initiates the process of electing a new master node, so that all computing nodes in the server cluster system take part in the election of the new master node.
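  • The failure check on a slave node can be sketched as follows. This is a minimal illustration, not the patented mechanism: the text only says that the slave node "cannot connect" to the master node, so the timeout rule, the class and function names, and the message fields are all assumptions.

```python
import time


class MasterMonitor:
    """Tracks the last successful contact with the master node (assumed rule:
    the master is considered failed once no contact occurs within timeout_s)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = clock()

    def record_contact(self):
        self.last_seen = self.clock()

    def master_failed(self):
        return self.clock() - self.last_seen > self.timeout_s


def fault_notices(detector_id, node_ids):
    """Build one master-fault message for every other computing node."""
    return [
        {"to": node_id, "type": "MASTER_FAULT", "from": detector_id}
        for node_id in node_ids
        if node_id != detector_id
    ]


# A fake clock keeps the example deterministic.
now = [0.0]
monitor = MasterMonitor(timeout_s=3.0, clock=lambda: now[0])
now[0] = 2.0
failed_early = monitor.master_failed()   # still within the timeout
now[0] = 5.5
failed_late = monitor.master_failed()    # timeout exceeded
notices = fault_notices("node-b", ["node-a", "node-b", "node-c"])
```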
  • An existing election algorithm, such as the Paxos election algorithm, may be used to elect the master node of the server cluster system.
  • Other existing election algorithms can also be used to elect the master node of the server cluster system; they are not described further here.
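  • Paxos itself is too long to show here. The sketch below swaps in a much simpler stand-in rule, "the live node with the smallest identifier wins", only to illustrate the property an election must provide: every surviving node arrives at the same new master. It assumes all nodes share the same view of which nodes have failed, which Paxos does not require; the function and node names are hypothetical.

```python
def elect_master(node_ids, failed_ids):
    """Stand-in election rule (not Paxos): pick the smallest live identifier."""
    alive = sorted(set(node_ids) - set(failed_ids))
    if not alive:
        raise RuntimeError("no live computing node is available")
    return alive[0]


nodes = ["node-a", "node-b", "node-c"]
failed = {"node-a"}  # the old master node

# Each surviving node runs the same rule on the same inputs.
results = {
    voter: elect_master(nodes, failed)
    for voter in nodes
    if voter not in failed
}
print(results)  # {'node-b': 'node-b', 'node-c': 'node-b'}
```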
  • The server cluster system may also adopt an existing multi-network structure.
  • The server cluster system sets the preset service address in the computing node acting as the master node and, after the master node is replaced, deletes the preset service address set by the computing node that was the master node before the replacement.
  • In the prior art, the service address of each slave node must be modified according to the service address preset in the replacement master node so that the slave node can connect to the changed master node. Here the service address remains unchanged: even if the master node is replaced, the slave nodes can still connect to the replacement master node through the preset service address.
  • The master and slave nodes can continue to perform the function of the service cluster without any modification to the server cluster system.
  • Avoiding modification of the service address in the slave nodes reduces the maintenance difficulty of the server cluster system and improves its availability. The service can be brought up on any computing node in the server cluster system, which avoids the single-point-of-failure problem, and the computing node that brings up the service sets the preset service address, so that the server cluster presents a unified external service address.
  • Each of the foregoing computing nodes includes an election subsystem, a configuration subsystem, and a guard subsystem that exchange data with one another.
  • The election subsystem is configured, when the master node fails, to obtain an election address from the configuration subsystem, perform an election operation according to the obtained election address, and elect a new master node from the multiple computing nodes.
  • The configuration subsystem is configured to store the preset election address, the preset service address, and the configuration information for running the service software.
  • The guard subsystem is configured to monitor the election subsystem so as to switch the computing node from master node to slave node or from slave node to master node.
  • The functions of the election subsystem, the configuration subsystem, and the guard subsystem described above are their general functions.
  • The following takes a computing node acting as the master node and a computing node acting as a slave node as examples to further describe the functions of the election subsystem, the configuration subsystem, and the guard subsystem in each computing node.
  • When the computing node acts as the master node, the election subsystem includes:
  • a service address processing module configured to obtain the preset service address from the configuration subsystem and set it;
  • a result broadcast module configured to broadcast the identifier of the computing node, as the election result, to the other computing nodes, so that the other computing nodes obtain the election result;
  • a sending module configured to send master node switching information to the guard subsystem.
  • The identifier of a computing node is preset in each computing node.
  • When any computing node in the server cluster system determines through the election that it is the master node, it obtains its own identifier and broadcasts that identifier to the other computing nodes, which ends the master node election process.
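  • The three modules of the election subsystem on a newly elected master node can be sketched as one method with three steps. This is an illustrative sketch under stated assumptions: the class names, the message format, and the in-memory set that stands in for the node's network addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigSubsystem:
    service_address: str  # the preset service address stored on every node

    def get_service_address(self):
        return self.service_address


@dataclass
class GuardSubsystem:
    received: list = field(default_factory=list)

    def on_master_switch(self, info):
        self.received.append(info)


@dataclass
class ElectionSubsystem:
    node_id: str
    config: ConfigSubsystem
    guard: GuardSubsystem
    active_addresses: set = field(default_factory=set)

    def on_elected(self, node_ids):
        # Service address processing module: obtain and set the preset address.
        self.active_addresses.add(self.config.get_service_address())
        # Result broadcast module: announce this node's identifier to the others.
        messages = [
            (peer, {"type": "ELECTION_RESULT", "master": self.node_id})
            for peer in node_ids
            if peer != self.node_id
        ]
        # Sending module: tell the guard subsystem to switch roles.
        self.guard.on_master_switch({"role": "master"})
        return messages


guard = GuardSubsystem()
election = ElectionSubsystem("n2", ConfigSubsystem("192.0.2.10"), guard)
messages = election.on_elected(["n1", "n2", "n3"])
```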
  • When the computing node acts as the master node, the configuration subsystem includes:
  • a service address processing module configured to return the preset service address to the election subsystem when the election subsystem requests it;
  • a configuration information sending module configured to send the preset configuration information for running the service software to the guard subsystem.
  • The configuration information for running the service software is used to set up the running environment of the service software in the master node before the service software is run, so that the software and hardware of the master node meet the operating conditions of the service software; the service software is then run in that environment.
  • When the computing node acts as the master node, the guard subsystem includes:
  • a state switching module configured to switch the state of the computing node from slave node to master node according to the master node switching information sent by the election subsystem;
  • a service software running module configured to obtain and run the service software according to the configuration information for running the service software sent by the configuration subsystem;
  • a configuration information synchronization module configured to synchronize the configuration information of the service software to the slave nodes after the service software is running;
  • a service software monitoring module configured to monitor the running service software and to restart the master node when the service software fails.
  • The configuration information synchronization module synchronizes the configuration information of the service software to the slave nodes, so that the copies stored in the master node and in the slave nodes are replicas of one another, which ensures that the service software runs reliably.
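  • The four modules of the guard subsystem on the master node can be sketched as below. This is a minimal sketch, assuming the service is started, health-checked, and the node restarted through injected callables; those callables, the class name, and the dictionary that stands in for the slave nodes' stored copies are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MasterGuard:
    start_service: Callable[[dict], None]
    service_healthy: Callable[[], bool]
    restart_node: Callable[[], None]
    role: str = "slave"
    replicas: dict = field(default_factory=dict)  # slave id -> config copy

    def on_master_switch(self, config, slave_ids):
        self.role = "master"                 # state switching module
        self.start_service(config)           # service software running module
        for slave_id in slave_ids:           # configuration information synchronization module
            self.replicas[slave_id] = dict(config)

    def check_service(self):
        """Service software monitoring module: restart the node on failure."""
        if self.role == "master" and not self.service_healthy():
            self.restart_node()
            return True
        return False


# Stub callables record what happened instead of touching a real service.
started, restarts, health = [], [], {"ok": True}
guard = MasterGuard(
    start_service=started.append,
    service_healthy=lambda: health["ok"],
    restart_node=lambda: restarts.append("restart"),
)
guard.on_master_switch({"port": 8080}, ["n1", "n3"])
first_check = guard.check_service()   # service healthy, nothing to do
health["ok"] = False
second_check = guard.check_service()  # service failed, node restart requested
```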
  • When the computing node acts as a slave node, the election subsystem includes:
  • a service address judging module configured to determine, on receiving the election result sent by the master node, whether the preset service address is set in the slave node;
  • a deleting module configured to delete the preset service address set in the slave node if the result obtained by the service address judging module is YES.
  • A slave node should delete the preset service address if it is set on that node.
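  • The judging and deleting modules on a slave node amount to one conditional. The sketch below models the node's addresses as a set; the function name and parameters are hypothetical, and a real node would remove the address from a network interface instead.

```python
def on_election_result(local_id, master_id, active_addresses, preset_address):
    """Return the addresses the node should hold after the result arrives."""
    addresses = set(active_addresses)
    if local_id == master_id:
        return addresses                      # the new master keeps the address
    if preset_address in addresses:           # service address judging module
        addresses.discard(preset_address)     # deleting module
    return addresses


# A former master node ("n1") learns that "n2" was elected.
after = on_election_result("n1", "n2", {"10.0.0.1", "192.0.2.10"}, "192.0.2.10")
print(after)  # {'10.0.0.1'}
```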
  • When the computing node acts as a slave node, the configuration subsystem includes:
  • a copy obtaining module configured to obtain the configuration information of the service software synchronized by the master node;
  • a copy storage module configured to store the obtained configuration information of the service software.
  • The configuration subsystem of a slave node stores the configuration information of the service software synchronized by the master node, so that the configuration information stored in the master node and in the slave nodes is consistent and each copy is a replica of the others.
  • A newly elected master node only needs the configuration information of the service software already stored in its own configuration subsystem to run the service software.
  • Maintenance personnel do not need to debug the new master node on site, and the new master node does not need to fetch the configuration information of the service software from the previous master node, for the service software to run smoothly. This further reduces the maintenance difficulty of the server cluster system and improves its availability.
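  • The replica kept by a slave node can be sketched as below. The version number used to ignore stale updates is an assumption added for the example (the text only says the copies are kept consistent), and the class and method names are hypothetical.

```python
import copy


class ConfigReplica:
    """Slave-side copy of the service software's configuration information."""

    def __init__(self):
        self.version = 0
        self._config = {}

    def receive_sync(self, version, config):
        """Copy obtaining and copy storage modules: keep only newer copies."""
        if version > self.version:
            self.version = version
            self._config = copy.deepcopy(config)

    def load_for_promotion(self):
        """What a newly elected master node reads; no contact with the old master."""
        return copy.deepcopy(self._config)


replica = ConfigReplica()
replica.receive_sync(1, {"port": 8080, "workers": 2})
replica.receive_sync(2, {"port": 8080, "workers": 4})
replica.receive_sync(1, {"port": 9999})  # stale update, ignored
promoted_config = replica.load_for_promotion()
```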
  • When the computing node acts as a slave node, the guard subsystem includes:
  • a software running judging module configured to determine whether the service software is running;
  • a shutdown module configured to shut down the running service software if the result obtained by the software running judging module is YES.
  • A slave node should shut down the service software if it is running on that node.
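  • The slave-side guard logic can be sketched in a few lines. The boolean flag stands in for a real process check, and the class and method names are hypothetical.

```python
class SlaveGuard:
    def __init__(self, service_running=False):
        self.service_running = service_running

    def enforce_slave_role(self):
        """Return True if the service software had to be shut down."""
        if self.service_running:          # software running judging module
            self.service_running = False  # shutdown module
            return True
        return False


former_master = SlaveGuard(service_running=True)
stopped = former_master.enforce_slave_role()        # service was running, now stopped
stopped_again = former_master.enforce_slave_role()  # nothing left to stop
```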
  • The server cluster system further includes a universal interface, through which the slave nodes obtain the configuration information of the service software synchronized by the master node.
  • The configuration subsystems in the computing nodes together form a distributed storage system.
  • In addition to the election address, the internal storage of the configuration subsystem can hold configuration data for the service software, which provides a configuration management function. When the service software needs configuration in order to work, it can obtain that configuration through the configuration subsystem.
  • If the server cluster system needs no configuration, or the configuration data it needs is already highly available, the high availability of the server cluster system itself can be achieved without any development work. If the server cluster system does need configuration data, only the way the configuration data is accessed has to be modified for this solution to provide high availability.
  • The disclosed systems, devices, and methods may be implemented in other ways.
  • The device embodiments described above are merely illustrative.
  • The division into units is only a division by logical function.
  • For example, multiple units or components may be combined or integrated into another system, and some features may be ignored or not executed.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device, or unit, and may be electrical, mechanical, or of another form.
  • The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • The functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium.
  • The technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the various embodiments of the present invention.
  • The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Hardware Redundancy (AREA)
  • Computer And Data Communications (AREA)

Abstract

In certain embodiments, the present invention relates to a server cluster system comprising: computing nodes that exchange data with one another, the computing nodes comprising a master node and a plurality of slave nodes; the master node is configured to set a preset service address and run preset service software, to elect a new master node together with the plurality of slave nodes when a fault occurs, and to delete the preset service address when the new master node is elected; the plurality of slave nodes is configured to monitor the working state of the master node and to elect a new master node together with the master node when the master node fails. The server cluster system according to the present invention can improve the availability of the server cluster system.
PCT/CN2017/077631 2016-08-23 2017-03-22 Server cluster system WO2018036148A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610710853.3A CN106331098B (zh) 2016-08-23 2016-08-23 一种服务器集群系统
CN201610710853.3 2016-08-23

Publications (1)

Publication Number Publication Date
WO2018036148A1 true WO2018036148A1 (fr) 2018-03-01

Family

ID=57742248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/077631 WO2018036148A1 (fr) 2016-08-23 2017-03-22 Server cluster system

Country Status (2)

Country Link
CN (1) CN106331098B (fr)
WO (1) WO2018036148A1 (fr)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717379A (zh) * 2018-05-08 2018-10-30 平安证券股份有限公司 电子装置、分布式任务调度方法及存储介质
CN111917576A (zh) * 2020-07-28 2020-11-10 星辰天合(北京)数据科技有限公司 存储集群的控制方法和装置
CN112214280A (zh) * 2020-09-16 2021-01-12 中国科学院计算技术研究所 一种电力系统仿真的云化方法及系统
CN112214377A (zh) * 2020-10-21 2021-01-12 新华三信息安全技术有限公司 一种设备管理方法及系统
CN112395269A (zh) * 2020-11-16 2021-02-23 中国工商银行股份有限公司 MySQL高可用组的搭建方法及装置
CN112492030A (zh) * 2020-11-27 2021-03-12 北京青云科技股份有限公司 数据存储方法、装置、计算机设备和存储介质
CN112769634A (zh) * 2020-12-09 2021-05-07 航天信息股份有限公司 一种基于Zookeeper的可横向扩展的分布式系统及开发方法
CN112988882A (zh) * 2019-12-12 2021-06-18 阿里巴巴集团控股有限公司 数据的异地灾备系统、方法及装置、计算设备
CN113162735A (zh) * 2021-03-30 2021-07-23 北京城建设计发展集团股份有限公司 基于通用服务器的增强型信号控制系统及方法
CN113220421A (zh) * 2021-05-31 2021-08-06 深圳市恒扬数据股份有限公司 一种服务器集群的管理方法、管理服务器及管理系统
CN114143182A (zh) * 2021-11-18 2022-03-04 新华三大数据技术有限公司 一种配置分布式搜索引擎集群的节点的方法和装置
CN114172792A (zh) * 2021-12-13 2022-03-11 武汉众邦银行股份有限公司 一种保证服务高可用的序号生成方法的实现方法及装置
CN114640417A (zh) * 2022-03-31 2022-06-17 苏州浪潮智能科技有限公司 一种时钟同步方法、装置、设备及存储介质
CN115277379A (zh) * 2022-07-08 2022-11-01 北京城市网邻信息技术有限公司 分布式锁容灾处理方法、装置、电子设备及存储介质
CN115665159A (zh) * 2022-12-14 2023-01-31 中国华能集团清洁能源技术研究院有限公司 一种大数据环境下的元数据管理方法及系统
CN115883575A (zh) * 2022-11-23 2023-03-31 紫光云技术有限公司 一种基于b树的高可用集群优化方法

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106331098B (zh) * 2016-08-23 2020-01-21 东方网力科技股份有限公司 一种服务器集群系统
CN107026762B (zh) * 2017-05-24 2020-07-03 郑州云海信息技术有限公司 一种基于分布式集群的容灾系统及方法
WO2018227365A1 (fr) * 2017-06-13 2018-12-20 深圳市伊特利网络科技有限公司 Procédé et système de sélection d'un dispositif principal dans une liaison de réseau
CN107329830A (zh) * 2017-06-28 2017-11-07 郑州云海信息技术有限公司 一种分布式集群系统优化的方法及装置
CN107547257B (zh) * 2017-07-14 2021-08-24 新华三技术有限公司 一种服务器集群实现方法及装置
CN109995835A (zh) * 2017-12-29 2019-07-09 浙江宇视科技有限公司 主节点选举方法、装置和分布式存储系统
CN110198325B (zh) * 2018-02-26 2022-04-12 北京京东尚科信息技术有限公司 通信方法、装置和系统、网络服务器和存储介质
CN108446163A (zh) * 2018-02-28 2018-08-24 山东乾云启创信息科技股份有限公司 基于openstack的dhcp-server高可用的实现方法及系统
CN109062923B (zh) * 2018-06-04 2022-04-19 创新先进技术有限公司 一种集群状态切换方法及装置
CN108989391B (zh) * 2018-06-19 2021-09-07 北京百悟科技有限公司 一种一致性处理的方法及系统
CN111131361B (zh) * 2018-10-31 2023-03-24 北京国双科技有限公司 集群查询系统中连接节点的处理方法及装置
CN111193601A (zh) * 2018-11-15 2020-05-22 宝沃汽车(中国)有限公司 车载音频网络的配置方法和装置,车辆
CN111355600B (zh) * 2018-12-21 2023-05-02 杭州海康威视数字技术股份有限公司 一种主节点确定方法和装置
CN109818785B (zh) * 2019-01-15 2020-04-03 无锡华云数据技术服务有限公司 一种数据处理方法、服务器集群及存储介质
CN109951331B (zh) * 2019-03-15 2021-08-20 北京百度网讯科技有限公司 用于发送信息的方法、装置和计算集群
CN110018932B (zh) * 2019-03-26 2023-12-01 中国联合网络通信集团有限公司 一种容器磁盘的监控方法及装置
CN110764918A (zh) * 2019-11-04 2020-02-07 浪潮云信息技术有限公司 一种容器集群中主节点管理方法
CN111181779A (zh) * 2019-12-20 2020-05-19 苏州浪潮智能科技有限公司 一种集群故障转移性能的测试方法、设备以及存储介质
CN111866094B (zh) * 2020-07-01 2023-10-31 天津联想超融合科技有限公司 一种定时任务处理方法、节点及计算机可读存储介质
CN112596893B (zh) * 2020-11-23 2021-10-08 中标慧安信息技术股份有限公司 用于多节点边缘计算设备的监控方法和系统
CN113542052A (zh) * 2021-06-07 2021-10-22 新华三信息技术有限公司 一种节点故障确定方法、装置和服务器
CN113840395B (zh) * 2021-08-26 2024-08-16 杭州涂鸦信息技术有限公司 任务设备自组网、协作工作方法及相关设备
CN113794765B (zh) * 2021-09-10 2024-10-01 奇安信科技集团股份有限公司 基于文件传输的网闸负载均衡方法及装置
CN114363350B (zh) * 2021-12-14 2024-04-16 中科曙光南京研究院有限公司 一种服务治理系统及方法
CN115002116A (zh) * 2022-05-30 2022-09-02 紫光建筑云科技(重庆)有限公司 一种云平台上分布式redis集群与可靠性检测的方法
CN115580645A (zh) * 2022-11-10 2023-01-06 北京青云科技股份有限公司 一种服务切换方法、装置、电子设备和存储介质
CN118101441B (zh) * 2024-04-28 2024-07-23 北京腾达泰源科技有限公司 业务调度方法、装置、设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629906A (zh) * 2012-03-30 2012-08-08 浪潮电子信息产业股份有限公司 一种将集群管理节点做双机实现提高集群业务可用性的设计办法
CN104283710A (zh) * 2014-08-18 2015-01-14 四川长虹电器股份有限公司 数据库集群的故障处理方法和管理服务器
CN104679604A (zh) * 2015-02-12 2015-06-03 大唐移动通信设备有限公司 一种主节点和备节点切换的方法和装置
CN104679907A (zh) * 2015-03-24 2015-06-03 新余兴邦信息产业有限公司 高可用高性能数据库集群的实现方法及系统
CN105024855A (zh) * 2015-07-13 2015-11-04 浪潮(北京)电子信息产业有限公司 分布式集群管理系统和方法
CN105159798A (zh) * 2015-08-28 2015-12-16 浪潮集团有限公司 一种虚拟机的双机热备方法、双机热备管理服务器和系统
CN106331098A (zh) * 2016-08-23 2017-01-11 东方网力科技股份有限公司 一种服务器集群系统

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100446495C (zh) * 2006-06-28 2008-12-24 华为技术有限公司 一种动态共享连接的方法和系统
CN101702721B (zh) * 2009-10-26 2011-08-31 北京航空航天大学 一种多集群系统的可重组方法
CN102904752B (zh) * 2012-09-25 2016-06-29 新浪网技术(中国)有限公司 一种节点选举方法、节点设备及系统
CN103118084B (zh) * 2013-01-21 2016-08-17 浪潮(北京)电子信息产业有限公司 一种主节点的选举方法及节点
CN103312809A (zh) * 2013-06-24 2013-09-18 北京汉柏科技有限公司 云平台中服务的分布式管理方法
CN104753994B (zh) * 2013-12-27 2019-04-02 杭州海康威视系统技术有限公司 基于集群服务器系统的数据同步方法及其装置
CN105338028B (zh) * 2014-07-30 2018-12-07 浙江宇视科技有限公司 一种分布式服务器集群中主从节点选举方法及装置

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717379B (zh) * 2018-05-08 2023-07-25 平安证券股份有限公司 电子装置、分布式任务调度方法及存储介质
CN108717379A (zh) * 2018-05-08 2018-10-30 平安证券股份有限公司 电子装置、分布式任务调度方法及存储介质
CN112988882A (zh) * 2019-12-12 2021-06-18 阿里巴巴集团控股有限公司 数据的异地灾备系统、方法及装置、计算设备
CN112988882B (zh) * 2019-12-12 2024-01-23 阿里巴巴集团控股有限公司 数据的异地灾备系统、方法及装置、计算设备
CN111917576A (zh) * 2020-07-28 2020-11-10 星辰天合(北京)数据科技有限公司 存储集群的控制方法和装置
CN111917576B (zh) * 2020-07-28 2023-05-16 北京星辰天合科技股份有限公司 存储集群的控制方法、装置、计算机可读存储介质、处理器
CN112214280A (zh) * 2020-09-16 2021-01-12 中国科学院计算技术研究所 一种电力系统仿真的云化方法及系统
CN112214280B (zh) * 2020-09-16 2023-09-12 中国科学院计算技术研究所 一种电力系统仿真的云化方法及系统
CN112214377B (zh) * 2020-10-21 2022-09-27 新华三信息安全技术有限公司 一种设备管理方法及系统
CN112214377A (zh) * 2020-10-21 2021-01-12 新华三信息安全技术有限公司 一种设备管理方法及系统
CN112395269A (zh) * 2020-11-16 2021-02-23 中国工商银行股份有限公司 MySQL高可用组的搭建方法及装置
CN112395269B (zh) * 2020-11-16 2023-08-29 中国工商银行股份有限公司 MySQL高可用组的搭建方法及装置
CN112492030B (zh) * 2020-11-27 2024-03-15 北京青云科技股份有限公司 数据存储方法、装置、计算机设备和存储介质
CN112492030A (zh) * 2020-11-27 2021-03-12 北京青云科技股份有限公司 数据存储方法、装置、计算机设备和存储介质
CN112769634A (zh) * 2020-12-09 2021-05-07 航天信息股份有限公司 一种基于Zookeeper的可横向扩展的分布式系统及开发方法
CN112769634B (zh) * 2020-12-09 2023-11-07 航天信息股份有限公司 一种基于Zookeeper的可横向扩展的分布式系统及开发方法
CN113162735A (zh) * 2021-03-30 2021-07-23 北京城建设计发展集团股份有限公司 基于通用服务器的增强型信号控制系统及方法
CN113220421A (zh) * 2021-05-31 2021-08-06 深圳市恒扬数据股份有限公司 一种服务器集群的管理方法、管理服务器及管理系统
CN114143182A (zh) * 2021-11-18 2022-03-04 新华三大数据技术有限公司 一种配置分布式搜索引擎集群的节点的方法和装置
CN114143182B (zh) * 2021-11-18 2024-02-23 新华三大数据技术有限公司 一种配置分布式搜索引擎集群的节点的方法和装置
CN114172792A (zh) * 2021-12-13 2022-03-11 武汉众邦银行股份有限公司 一种保证服务高可用的序号生成方法的实现方法及装置
CN114172792B (zh) * 2021-12-13 2023-07-28 武汉众邦银行股份有限公司 一种保证服务高可用的序号生成方法的实现方法及装置
CN114640417A (zh) * 2022-03-31 2022-06-17 苏州浪潮智能科技有限公司 一种时钟同步方法、装置、设备及存储介质
CN115277379A (zh) * 2022-07-08 2022-11-01 北京城市网邻信息技术有限公司 分布式锁容灾处理方法、装置、电子设备及存储介质
CN115277379B (zh) * 2022-07-08 2023-08-01 北京城市网邻信息技术有限公司 分布式锁容灾处理方法、装置、电子设备及存储介质
CN115883575A (zh) * 2022-11-23 2023-03-31 紫光云技术有限公司 一种基于b树的高可用集群优化方法
CN115665159B (zh) * 2022-12-14 2023-04-28 中国华能集团清洁能源技术研究院有限公司 一种大数据环境下的元数据管理方法及系统
CN115665159A (zh) * 2022-12-14 2023-01-31 中国华能集团清洁能源技术研究院有限公司 一种大数据环境下的元数据管理方法及系统

Also Published As

Publication number Publication date
CN106331098B (zh) 2020-01-21
CN106331098A (zh) 2017-01-11

Similar Documents

Publication Publication Date Title
WO2018036148A1 (en) Server cluster system
US10560315B2 (en) Method and device for processing failure in at least one distributed cluster, and system
US10020980B2 (en) Arbitration processing method after cluster brain split, quorum storage apparatus, and system
US20200175036A1 (en) Fault-tolerant key management system
WO2021184587A1 (en) Prometheus-based private cloud monitoring method and apparatus, computer device, and storage medium
CN102035862B (zh) Svc集群中配置节点的故障移交方法和系统
US9489230B1 (en) Handling of virtual machine migration while performing clustering operations
US10728099B2 (en) Method for processing virtual machine cluster and computer system
US10038593B2 (en) Method and system for recovering virtual network
US20150026125A1 (en) System and method for synchronizing data between communication devices in a networked environment without a central server
WO2017107827A1 (en) Method and apparatus for isolating an environment
CN107918570B (zh) 一种双活系统共享仲裁逻辑盘的方法
CN102420820B (zh) 一种集群系统中的隔离方法和装置
CN104158707A (zh) 一种检测并处理集群脑裂的方法和装置
WO2018157605A1 (en) Method and device for transmitting messages in a cluster file system
WO2017012383A1 (en) Service registration method, usage method, and corresponding apparatus
CN107071189B (zh) 一种通讯设备物理接口的连接方法
CN105490847B (zh) 一种私有云存储系统中节点故障实时检测及处理方法
CN111865632A (zh) 分布式数据存储集群的切换方法及切换指令发送方法和装置
CN111309515A (zh) 一种容灾控制方法、装置及系统
WO2024055669A1 (en) Method for an edge station to access a cloud data center, and cloud management platform
CN115766405B (zh) 一种故障处理方法、装置、设备和存储介质
JP2015114952A (ja) ネットワークシステム、監視制御装置およびソフトウェア検証方法
CN116668269A (zh) 一种用于双活数据中心的仲裁方法、装置及系统
US9798633B2 (en) Access point controller failover system

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17842574; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 17842574; Country of ref document: EP; Kind code of ref document: A1