WO2020233001A1 - Distributed storage system comprising dual-control architecture, data reading method and device, and storage medium - Google Patents

Distributed storage system comprising dual-control architecture, data reading method and device, and storage medium

Info

Publication number
WO2020233001A1
Authority
WO
WIPO (PCT)
Prior art keywords
server
distributed node
data
distributed
standby
Prior art date
Application number
PCT/CN2019/117349
Other languages
French (fr)
Chinese (zh)
Inventor
王新
王欣
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020233001A1 publication Critical patent/WO2020233001A1/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 — Error detection; Error correction; Monitoring
    • G06F 11/07 — Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 — Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 — Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/202 — Error detection or correction of the data by redundancy in hardware using active fault-masking, where processing functionality is redundant
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 — Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 — Interfaces specially adapted for storage systems
    • G06F 3/0668 — Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 — Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • This application relates to the field of distributed storage technology, and in particular to a dual-control architecture distributed storage system, electronic device, data reading method, device, and computer-readable storage medium.
  • The CEPH distributed file system is a distributed storage system offering large capacity, high performance, and strong reliability.
  • In an existing CEPH distributed system, multiple storage nodes are provided and multiple copies of the data can be stored.
  • In a system with this distributed structure, when a server (host) in a storage node fails, the data stored on that server becomes inaccessible at the same time.
  • The ceph system must then recover the relevant data of that server before it can be accessed again. Recovering the data on a server takes a considerable amount of time and affects cluster performance.
  • In particular, as hard disk capacities grow, a single disk commonly holds 6 TB or 8 TB, so the data stored on one server may amount to dozens of terabytes; this huge data volume makes the impact on the system even more pronounced.
  • The main purpose of this application is to provide a dual-control architecture distributed storage system, electronic device, data reading method, device, and computer-readable storage medium which, through a dual-control design of the storage nodes, avoid copying large amounts of data when only a server has failed and reduce the probability that the ceph system must perform node data recovery.
  • To achieve the above objective, this application proposes an electronic device that is communicatively connected to the active server and the standby server of each of multiple distributed nodes in a distributed system; the active server and the standby server in the same distributed node are connected through a network.
  • The active server in each distributed node is in the working state.
  • The standby server in each distributed node is in the standby state.
  • Each distributed node corresponds to one shared disk.
  • The active server and the standby server are each communicatively connected to the same shared disk corresponding to the distributed node where they are located.
  • The electronic device includes a memory and a processor.
  • The memory stores a data reading program, and the data reading program implements the following steps when executed by the processor:
  • Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
  • State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
  • Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • In addition, this application also proposes a data reading method.
  • The electronic device is communicatively connected to the active server and the standby server of each of multiple distributed nodes in a distributed system, and the active server and the standby server in the same distributed node are connected through a network.
  • The active server in each distributed node is in the working state.
  • The standby server in each distributed node is in the standby state.
  • Each distributed node corresponds to one shared disk.
  • The active server and the standby server in a distributed node are each communicatively connected to the same shared disk corresponding to that distributed node, and the method includes the steps:
  • Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
  • State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
  • Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • The present application also provides a distributed storage system including an electronic device communicatively connected to the multiple active servers and multiple standby servers of multiple distributed nodes. The active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, and each distributed node corresponds to one shared disk.
  • The active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node where they are located.
  • The electronic device includes a memory and a processor, a data reading program is stored on the memory, and the data reading program implements the following steps when executed by the processor:
  • Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
  • State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
  • Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • This application also provides a computer-readable storage medium that stores a data reading program, and the data reading program can be executed by at least one processor so that the at least one processor executes the steps of the data reading method described in any one of the above.
  • Compared with the prior art, the dual-control architecture distributed storage system, electronic device, and computer-readable storage medium proposed in this application apply a dual-control design at each distributed storage node, that is, each node is equipped with two servers, an active server and a standby server.
  • The two servers adopt the A/P (Active/Passive) mode: the active server of the distributed node is in the working state, and the standby server of the distributed node is in the standby state.
  • The same virtual IP address is dynamically configured through the cluster resource manager, so the two servers appear to the entire ceph system as a single virtual node.
  • The two actual physical hosts (servers) can access the same shared storage disk in the same way when in the working mode. Under normal circumstances, the storage disk of the distributed node is taken over by the active server, which is in the working state.
  • The distributed system accesses the storage disk through the virtual IP address on the active server.
  • When the active server of a distributed node fails, the cluster resource manager switches the virtual IP address from the current active server to the standby server, and the standby server of the distributed node is switched to the working state and can access the data on the storage disk. Since most of the data on the shared storage disk is still valid, for the ceph system this access mode is equivalent to the distributed storage node being disconnected for a short time and then reconnected.
  • Through this dual-control architecture design, the probability that the ceph system must perform large-scale data recovery when a storage node fails is greatly reduced.
  • The system is also designed with a monitoring module that monitors whether data is being written while the working state of the two servers is switched. If data was written during that period, the module checks the data version number and synchronizes this part of the data to the server currently in the working state. In addition, by manually shutting down one physical host, the configuration of a storage node can be upgraded (for example, by adding memory modules) while the ceph system remains online.
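  • The read path described above can be summarized in the short sketch below. This is only an illustrative outline of the flow, not the implementation claimed in this application; the Node structure, the health check, and the shared-disk access function are hypothetical placeholders for the electronic device, the cluster resource manager, and the virtual-IP access path.

```python
# Minimal sketch of the dual-control read flow (illustrative names only).
from dataclasses import dataclass

@dataclass
class Node:
    name: str         # e.g. "A"
    virtual_ip: str   # virtual IP shared by both servers, e.g. "160.1.0.0"
    active: str       # server currently in the working state, e.g. "A1"
    standby: str      # server currently in the standby state, e.g. "A2"

def server_is_healthy(host: str) -> bool:
    """Placeholder health check; in practice the cluster resource manager detects the failure."""
    raise NotImplementedError

def read_from_shared_disk(virtual_ip: str, key: str) -> bytes:
    """Placeholder for reading an object from the node's shared disk via the virtual IP."""
    raise NotImplementedError

def failover(node: Node) -> None:
    """Move the virtual IP to the standby server and promote it to the working state."""
    node.active, node.standby = node.standby, node.active

def read(node: Node, key: str) -> bytes:
    # Initial access step: read through the server currently in the working state.
    if server_is_healthy(node.active):
        return read_from_shared_disk(node.virtual_ip, key)
    # State conversion step: the standby server takes over the virtual IP and the shared disk.
    failover(node)
    # Data reading step: the same virtual IP now resolves to the former standby server.
    return read_from_shared_disk(node.virtual_ip, key)
```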
  • Figure 1 is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of this application.
  • Figure 2 is a schematic diagram of the storage distribution relationship of the distributed storage system of this application.
  • Figure 3 is a schematic diagram of the operating environment of the first embodiment of the data reading program of this application.
  • Figure 4 is a program module diagram of the first embodiment of the data reading program of this application.
  • Figure 5 is a schematic flowchart of the first embodiment of the data reading program of this application.
  • FIG. 1 is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of the present application.
  • In this embodiment, the distributed storage system includes multiple distributed storage nodes.
  • Each distributed storage node has two physical hosts.
  • In this application, the two physical hosts are an active server and a standby server.
  • The active server and the standby server in a distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, and each distributed node corresponds to one shared disk.
  • The active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node, and both can access the shared disk in the same way.
  • In some application scenarios, an electronic device 1 is also provided in the distributed storage system; the electronic device is communicatively connected to the active server and the standby server in each distributed node (for example, via the network 2).
  • In other application scenarios, the electronic device 1 is set up independently of the distributed storage system and is communicatively connected to it (for example, via the network 2).
  • This application proposes a data reading program.
  • FIG. 1 is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of this application.
  • the data reading program 10 is installed and run in the electronic device 1.
  • the electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a server.
  • the electronic device 1 may include, but is not limited to, a memory 11 and a processor 12 that communicate with each other through a program bus.
  • FIG. 3 only shows the electronic device 1 with the components 11 and 12, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.
  • In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk or internal memory of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 is used to store the application software installed in the electronic device 1 and various data, for example the program code of the data reading program 10. The memory 11 can also be used to temporarily store data that has been output or will be output.
  • The processor 12 may in some embodiments be a central processing unit (CPU), a microprocessor, or another data processing chip, and is used to run the program code stored in the memory 11 or process data, for example to execute the data reading program 10.
  • FIG. 4 is a program module diagram of the first embodiment of the data reading program 10.
  • The data reading program 10 can be divided into one or more modules; the one or more modules are stored in the memory 11 and are executed by one or more processors (in this embodiment, the processor 12) to complete this application.
  • the data reading program 10 can be divided into an initial access module 101, a state transition module 102, and a data reading module 103.
  • A module referred to in this application is a series of computer program instruction segments capable of completing a specific function, and is better suited than a whole program for describing the execution process of the data reading program 10 in the electronic device 1, where:
  • Initial access module: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk.
  • In a common distributed system, when a host (i.e., a server) on a node fails, in most cases the storage disk connected to that server is not actually damaged, and the data on the storage disk is still valid. If another server can be started immediately to take over that data and replace the failed server, then from the point of view of the entire distributed system the node is still running normally; no large-scale data recovery is needed, and the whole system naturally returns to normal operation quickly.
  • FIG. 2 is a schematic diagram of the storage distribution relationship of a distributed storage system.
  • As shown in FIG. 2, in the CEPH dual-control architecture distributed system of this embodiment, each distributed node is configured with two servers, one in the working state and one in the standby state according to their initial states.
  • In node A, the two servers act as the active server (A1), which is in the working state, and the standby server (A2), which is ready in the standby state.
  • The active server A1 and the standby server A2 are each communicatively connected to the same shared disk and can access that shared disk in the same way.
  • Similarly, node B is configured with server B1 and server B2, both communicatively connected to node B's shared disk, where B1 is in the working state and B2 is in the standby state.
  • Node C is configured with server C1 and server C2, both communicatively connected to node C's shared disk, where C1 is in the working state and C2 is in the standby state. When the distributed system of this embodiment accesses distributed node A, it first accesses the active server A1, which is currently in the working state in node A.
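  • The node layout of FIG. 2 can be captured in a small data structure, sketched below. The host names match the A1/A2, B1/B2, C1/C2 labels above, while the shared-disk paths are purely illustrative assumptions.

```python
# Illustrative topology for FIG. 2: each node pairs two servers with one shared disk.
cluster = {
    "A": {"active": "A1", "standby": "A2", "shared_disk": "/dev/mapper/node-a"},
    "B": {"active": "B1", "standby": "B2", "shared_disk": "/dev/mapper/node-b"},
    "C": {"active": "C1", "standby": "C2", "shared_disk": "/dev/mapper/node-c"},
}

def current_server(node_name: str) -> str:
    """Return the server the distributed system should access first for this node."""
    return cluster[node_name]["active"]

# While A1 is healthy, access to node A always goes through A1.
assert current_server("A") == "A1"
```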
  • State conversion module: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state.
  • FIG. 3 is a schematic diagram of the operating environment of the first embodiment of the data reading program.
  • As shown in FIG. 3, in this embodiment, although the active server A1 and the standby server A2 are in the working state and the standby state respectively, to the system as a whole they appear as a single node A.
  • When the system accesses the storage disk of the node through the active server A1 and A1 is working normally without failure, A1 directly accesses the storage disk it is connected to. If the working server A1 fails and can no longer be accessed normally, A2, which was in the standby state, starts its application processes and enters the working state.
  • Because the active server A1 and the standby server A2 appear to the system as the same node A, switching the working state between A1 and A2 is, from the system's point of view, equivalent to node A being disconnected for a short time and then reconnected.
  • At that point the standby server A2 replaces the active server A1 and plays the role of node A in the system cluster.
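  • The state conversion inside one node can be sketched as below. The class and method names are hypothetical; starting the "application processes" stands for launching the storage services (for example ceph mon/osd) on the server that takes over.

```python
# Sketch of the state conversion between the two servers of one node (names are illustrative).
from enum import Enum

class State(Enum):
    WORKING = "working"
    STANDBY = "standby"

class NodeServer:
    def __init__(self, name: str, state: State):
        self.name = name
        self.state = state

    def start_application_processes(self) -> None:
        """Start the storage services (e.g. ceph mon/osd) so this server can serve the shared disk."""
        raise NotImplementedError

def convert_state(failed_active: NodeServer, standby: NodeServer) -> None:
    """Promote the standby server after the active server becomes unreachable."""
    failed_active.state = State.STANDBY        # the failed server leaves the working state
    standby.start_application_processes()      # e.g. A2 starts its application processes
    standby.state = State.WORKING              # A2 enters the working state
    # To the rest of the cluster this looks like node A briefly disconnecting and reconnecting.
```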
  • Data reading module: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • In this embodiment, in node A, because the active server A1 has failed, the standby server A2, which has switched from the standby state to the working state, reads the data on the shared disk it is connected to.
  • As described above, the active server A1 and the standby server A2 are each communicatively connected to the same shared disk, and the two servers can access that shared disk in exactly the same way.
  • When the active server A1 fails, the system accesses the shared disk through the standby server A2, which has been switched to the working state.
  • the program further includes a monitoring module (not shown in the figure), which is used to implement the following steps when performing the state transition step:
  • Write detection step: if the active server in the distributed node fails and cannot be accessed, the monitoring module detects whether data is being written to the corresponding shared disk through the active server in the distributed node.
  • the monitoring module in the Ceph distributed system can detect whether data is being written. When no data is being written, the distributed system can automatically end the subsequent steps after the write detection step.
  • Data synchronization step: if data was being written to the corresponding shared disk through the active server in the distributed node, then after the standby server in the distributed node has successfully switched to the working state, the written data is written to the corresponding shared disk through the standby server in the distributed node.
  • In this way, the distributed system can recover the data changes that may have been written during the state transition between the active server A1 and the standby server A2, so the system remains in a normal operating state instead of, as in previous deployment architectures, spending a long time recovering a large amount of data and degrading overall system performance.
  • If the monitoring module detects that data was being written through the active server A1 while the active server A1 and the standby server A2 of the distributed node were switching states, the distributed system resynchronizes the written data to the standby server A2, which has been switched to the working state.
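  • A sketch of this resynchronization is shown below. The version-number comparison follows the description above; the journal layout, function names, and callback are hypothetical and only illustrate the idea.

```python
# Illustrative sketch of the data synchronization step after a failover.
from typing import Callable, Dict, List

def synchronize_in_flight_writes(
    write_journal: Dict[str, int],        # object name -> version number of the interrupted write
    disk_versions: Dict[str, int],        # object name -> version currently found on the shared disk
    rewrite: Callable[[str], None],       # callback that rewrites one object through the new active server
) -> List[str]:
    """Re-apply writes that were in progress on the failed active server."""
    resynced = []
    for obj, expected_version in write_journal.items():
        # If the shared disk still holds an older version, the write did not complete
        # before the switch and must be replayed through the standby server (now working).
        if disk_versions.get(obj, -1) < expected_version:
            rewrite(obj)
            resynced.append(obj)
    return resynced
```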
  • the same virtual address is configured on the active server and the standby server in the distributed node.
  • the active server A1 and the backup server A2 in the distributed node can behave as the same node A to the distributed system as a whole.
  • the virtual IP address is dynamically configured on the active server A1 or the standby server A2 that is in the working state.
  • the virtual addresses of the two servers are both set to 160.1.0.0.
  • the following steps are further implemented when the data reading program is executed by the processor:
  • Virtual access step: access the active server in the distributed node through the virtual address.
  • the system accesses the current node A through the virtual address 160.1.0.0 configured by the primary server A1.
  • Address conversion step: when the active server in the distributed node fails, configure the virtual address that was configured on the active server in the distributed node onto the standby server in the distributed node.
  • the virtual address 160.1.0.0 originally configured on the active server A1 will be dynamically configured on the standby server A2, and the standby server A2 originally in the standby state will be converted to Working status.
  • Address access step: access the standby server in the distributed node through the virtual address.
  • the system accesses the current node A through the virtual address 160.1.0.0 configured on the standby server A2 at this time.
  • the electronic device further has a cluster resource manager for configuring the virtual address.
  • The configuration of the virtual IP address can be implemented with a pacemaker resource agent, for example by running the Ocf::heartbeat::IPaddr2 script in the current system. When server A2 takes over from server A1 and the working state is switched, a simple Ocf::heartbeat::ceph script is implemented to start the ceph-related processes, such as mon or osd. In this way the same virtual address can be dynamically configured on the two servers.
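  • One way such resources might be defined with pacemaker's pcs tool is sketched below, driven from Python for consistency with the other examples. The resource names, netmask, and monitor interval are assumptions, and ocf:heartbeat:ceph here stands for the simple custom script mentioned above rather than a stock resource agent.

```python
# Hedged example: defining the floating virtual IP and the ceph starter resource with pcs.
import subprocess

def pcs(*args: str) -> None:
    subprocess.run(["pcs", *args], check=True)

# Floating virtual IP managed by the standard IPaddr2 resource agent.
pcs("resource", "create", "node_a_vip", "ocf:heartbeat:IPaddr2",
    "ip=160.1.0.0", "cidr_netmask=24", "op", "monitor", "interval=30s")

# Custom agent (per the description) that starts the ceph processes (mon/osd)
# on whichever server currently holds the virtual IP.
pcs("resource", "create", "node_a_ceph", "ocf:heartbeat:ceph")

# Keep the ceph processes on the same host as the virtual IP and start them after it.
pcs("constraint", "colocation", "add", "node_a_ceph", "with", "node_a_vip")
pcs("constraint", "order", "node_a_vip", "then", "node_a_ceph")
```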
  • this application also proposes a data reading method.
  • FIG. 5 is a schematic flowchart of the first embodiment of the data reading method of this application.
  • The data reading method of this embodiment is applicable to an electronic device that is communicatively connected to the active server and the standby server of each of multiple distributed nodes in a distributed system; the active server and the standby server in the same distributed node are connected through a network.
  • The active server in each distributed node is in the working state.
  • The standby server in each distributed node is in the standby state.
  • Each distributed node corresponds to one shared disk.
  • The active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node where they are located, and the method includes the steps:
  • Initial access step S10: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk.
  • In the CEPH dual-control architecture distributed system of this embodiment, each distributed node is configured with two servers, one in the working state and one in the standby state according to their initial states.
  • In node A, the two servers act as the active server (A1), which is in the working state, and the standby server (A2), which is ready in the standby state.
  • The active server A1 and the standby server A2 are each communicatively connected to the same shared disk and can access that shared disk in the same way.
  • Similarly, node B is configured with server B1 and server B2, both communicatively connected to node B's shared disk, where B1 is in the working state and B2 is in the standby state.
  • Node C is configured with server C1 and server C2, both communicatively connected to node C's shared disk, where C1 is in the working state and C2 is in the standby state. When the distributed system of this embodiment accesses distributed node A, it first accesses the active server A1, which is currently in the working state in node A.
  • State conversion step S20: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state.
  • Although the active server A1 and the standby server A2 are in the working state and the standby state respectively, to the system as a whole they appear as a single node A.
  • When the system accesses the storage disks of the node through the active server A1 and A1 is working normally without failure, A1 directly accesses the storage disk it is connected to. If the working server A1 fails and can no longer be accessed normally, A2, which was in the standby state, starts its application processes and enters the working state.
  • Because the active server A1 and the standby server A2 appear to the system as the same node A, switching the working state between A1 and A2 is, from the system's point of view, equivalent to node A being disconnected for a short time and then reconnected.
  • At that point the standby server A2 replaces the active server A1 and plays the role of node A in the system cluster.
  • Data reading step S30: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • In node A, because the active server A1 has failed, the standby server A2, which has switched from the standby state to the working state, reads the data on the shared disk it is connected to.
  • As described above, the active server A1 and the standby server A2 are each communicatively connected to the same shared disk, and the two servers can access that shared disk in exactly the same way.
  • When the active server A1 fails, the system accesses the shared disk through the standby server A2, which has been switched to the working state.
  • the method further includes:
  • Write detection step: if the active server in the distributed node fails and cannot be accessed, the monitoring module detects whether data is being written to the corresponding shared disk through the active server in the distributed node.
  • the monitoring module in the Ceph distributed system can detect whether data is being written. When no data is being written, the distributed system can automatically end the subsequent steps after the write detection step.
  • Data synchronization step: if data was being written to the corresponding shared disk through the active server in the distributed node, then after the standby server in the distributed node has successfully switched to the working state, the written data is written to the corresponding shared disk through the standby server in the distributed node.
  • In this way, the distributed system can recover the data changes that may have been written during the state transition between the active server A1 and the standby server A2, so the system remains in a normal operating state instead of, as in previous deployment architectures, spending a long time recovering a large amount of data and degrading overall system performance.
  • If the monitoring module detects that data was being written through the active server A1 while the active server A1 and the standby server A2 were switching states, the distributed system resynchronizes the written data to the standby server A2, which has been switched to the working state.
  • the method further includes: configuring the same virtual address on the active server and the standby server in the distributed node.
  • the method further includes the following steps:
  • the system accesses the current node A through the virtual address 160.1.0.0 configured by the primary server A1.
  • the virtual address configured on the main server in the distributed node is configured to the standby server in the distributed node.
  • the virtual address 160.1.0.0 originally configured on the active server A1 will be dynamically configured on the standby server A2, and the standby server A2 originally in the standby state will be converted to Working status.
  • the system accesses the current node A through the virtual address 160.1.0.0 configured on the standby server A2 at this time.
  • the distributed storage system further has a cluster resource manager for configuring the virtual address.
  • The configuration of the virtual IP address can be implemented by using a pacemaker resource agent.
  • this application also provides a distributed storage system
  • the distributed storage system includes electronic devices, multiple primary servers and multiple backup servers of multiple distributed nodes in communication connection.
  • the primary servers and backup servers in the same distributed node are connected through a network.
  • the active server in the node is in working state
  • the standby server in each distributed node is in standby state
  • each distributed node corresponds to a shared disk
  • the active server and standby server in the same distributed node are respectively distributed
  • the electronic device includes a memory and a processor
  • the memory stores a data reading program
  • the data reading program is executed by the processor to implement the following steps:
  • Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
  • State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
  • Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
  • An embodiment of the present application also proposes a computer-readable storage medium that stores a data reading program, and the data reading program can be executed by at least one processor so that the at least one processor executes the data reading method of any of the foregoing embodiments.
  • The dual-control architecture distributed storage system, electronic device, and computer-readable storage medium proposed in this embodiment apply a dual-control design at each distributed storage node, that is, each node is equipped with two servers, an active server and a standby server.
  • The two servers adopt the A/P (Active/Passive) mode: the active server of the distributed node is in the working state, and the standby server of the distributed node is in the standby state.
  • The same virtual IP address is dynamically configured through the cluster resource manager, so the two servers appear to the entire ceph system as a single virtual node.
  • The two actual physical hosts (servers) can access the same shared storage disk in the same way when in the working mode.
  • Under normal circumstances, the storage disk of the distributed node is taken over by the active server, which is in the working state.
  • The distributed system accesses the storage disk through the virtual IP address on the active server.
  • When the active server of the distributed node fails, the cluster resource manager switches the virtual IP address from the current active server to the standby server, and the standby server of the distributed node is switched to the working state and can access the data on the storage disk. Since most of the data on the shared storage disk is still valid, for the ceph system this access mode is equivalent to the distributed storage node being disconnected for a short time and then reconnected.
  • This embodiment is also designed with a monitoring module that monitors whether data is being written while the working state of the two servers is switched. If data was written during that period, the data version number is checked and this part of the data is synchronized to the server currently in the working state. In addition, by manually shutting down one physical host, the configuration of a storage node can be upgraded (for example, by adding memory modules) while the ceph system remains online.
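  • The online upgrade mentioned above, taking one physical host down by hand while its peer keeps serving the node, might proceed roughly as sketched below. The pcs standby/unstandby subcommands are the usual pacemaker mechanism for draining a host, but the host name, wait time, and health checks are placeholders, not part of this application.

```python
# Hedged sketch of an online hardware upgrade of one node server (e.g. adding memory to A1).
import subprocess
import time

def pcs(*args: str) -> None:
    subprocess.run(["pcs", *args], check=True)

def upgrade_server(host: str) -> None:
    pcs("node", "standby", host)    # resources (virtual IP, ceph processes) move to the peer server
    time.sleep(30)                  # placeholder: wait until the peer has taken over the node
    # ... power the host off, add memory modules, power it back on ...
    pcs("node", "unstandby", host)  # the host rejoins the cluster as the new standby server

upgrade_server("A1")                # node A keeps serving through A2 during the upgrade
```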

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention relates to distributed storage techniques. Disclosed are a distributed storage system comprising a dual-control architecture, a data reading method and device, and a storage medium. In the present invention, each distributed storage node has a dual-control architecture, that is, each node is provided with two servers, being respectively a master server and a backup server. The two servers are configured to be in an active/passive (A/P) mode. When the master server is in an operating state, the backup server is in a standby state. A cluster resource manager is used to dynamically configure a same virtual IP address, that is, the two servers are presented as a same virtual node for a ceph system. Compared with the prior art, the dual control design of a storage node can avoid the copying of a large amount of data merely due to a server fault, and reduces the probability of a ceph system performing node data recovery.

Description

Dual-control architecture distributed storage system, data reading method, device and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 20, 2019, with application number 201910418969.3 and invention title "Dual-control architecture distributed storage system, data reading method, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of distributed storage technology, and in particular to a dual-control architecture distributed storage system, electronic device, data reading method, device, and computer-readable storage medium.
Background art
The CEPH distributed file system is a distributed storage system with large capacity, high performance, and strong reliability. In an existing CEPH distributed system, multiple storage nodes are provided and multiple copies of the data can be stored. In a system with this distributed structure, when a server (host) in a storage node fails, the data stored on that server becomes inaccessible at the same time. The ceph system must then recover the relevant data of that server before it can be accessed again. Recovering the data on a server takes a considerable amount of time and affects cluster performance. In particular, as hard disk capacities grow, a single disk commonly holds 6 TB or 8 TB, so the data stored on one server may amount to dozens of terabytes; this huge data volume makes the impact on the system even more pronounced.
When a server actually fails, in many cases the storage (disks) on that server is not damaged and the data on the disks is still valid, so simply copying the data is largely wasted work. How to avoid copying large amounts of data while keeping the system working normally has therefore become an urgent problem.
Summary of the invention
The main purpose of this application is to provide a dual-control architecture distributed storage system, electronic device, data reading method, device, and computer-readable storage medium which, through a dual-control design of the storage nodes, avoid copying large amounts of data when only a server has failed and reduce the probability that the ceph system must perform node data recovery.
To achieve the above objective, this application proposes an electronic device that is communicatively connected to the active server and the standby server of each of multiple distributed nodes in a distributed system. The active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, and each distributed node corresponds to one shared disk. The active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node where they are located. The electronic device includes a memory and a processor, the memory stores a data reading program, and the data reading program implements the following steps when executed by the processor:
Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
In addition, to achieve the above objective, this application also proposes a data reading method. The electronic device is communicatively connected to the active server and the standby server of each of multiple distributed nodes in a distributed system, the active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, each distributed node corresponds to one shared disk, and the active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node where they are located. The method includes the steps:
Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
In addition, to achieve the above objective, this application also provides a distributed storage system including an electronic device communicatively connected to the multiple active servers and multiple standby servers of multiple distributed nodes. The active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, each distributed node corresponds to one shared disk, and the active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node where they are located. The electronic device includes a memory and a processor, a data reading program is stored on the memory, and the data reading program implements the following steps when executed by the processor:
Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk;
State conversion step: if the active server in the distributed node fails and cannot be accessed, switch the standby server in the distributed node, currently in the standby state, to the working state;
Data reading step: after the standby server in the distributed node has successfully switched to the working state, access the standby server in the distributed node to read the data on the corresponding shared disk.
In addition, to achieve the above objective, this application also provides a computer-readable storage medium that stores a data reading program, and the data reading program can be executed by at least one processor so that the at least one processor executes the steps of the data reading method described in any one of the above.
Compared with the prior art, the dual-control architecture distributed storage system, electronic device, and computer-readable storage medium proposed in this application apply a dual-control design at each distributed storage node, that is, each node is equipped with two servers, an active server and a standby server. The two servers adopt the A/P (Active/Passive) mode: the active server of the distributed node is in the working state, and the standby server of the distributed node is in the standby state. The same virtual IP address is dynamically configured through the cluster resource manager, so the two servers appear to the entire ceph system as a single virtual node. The two actual physical hosts (servers) can access the same shared storage disk in the same way when in the working mode. Under normal circumstances, the storage disk of the distributed node is taken over by the active server, which is in the working state, and the distributed system accesses the storage disk through the virtual IP address on the active server. When the active server of the distributed node fails, the cluster resource manager switches the virtual IP address from the current active server to the standby server, and the standby server of the distributed node is switched to the working state and can access the data on the storage disk. Since most of the data on the shared storage disk is still valid, for the ceph system this access mode is equivalent to the distributed storage node being disconnected for a short time and then reconnected. Through this dual-control architecture design, the probability that the ceph system must perform large-scale data recovery when a storage node fails is greatly reduced. The system is also designed with a monitoring module that monitors whether data is being written while the working state of the two servers is switched; if data was written during that period, the data version number is checked and this part of the data is synchronized to the server currently in the working state. In addition, by manually shutting down one physical host, the configuration of a storage node can be upgraded (for example, by adding memory modules) while the ceph system remains online.
Description of the drawings
In order to more clearly describe the technical solutions in the embodiments of this application or in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from the structures shown in these drawings without creative work.
Figure 1 is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of this application;
Figure 2 is a schematic diagram of the storage distribution relationship of the distributed storage system of this application;
Figure 3 is a schematic diagram of the operating environment of the first embodiment of the data reading program of this application;
Figure 4 is a program module diagram of the first embodiment of the data reading program of this application;
Figure 5 is a schematic flowchart of the first embodiment of the data reading program of this application.
The realization of the purpose, functional characteristics, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
The principles and features of this application are described below in conjunction with the accompanying drawings; the examples given are only used to explain this application and are not used to limit its scope.
Refer to FIG. 1, which is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of this application.
In this embodiment, the distributed storage system includes multiple distributed storage nodes, and each distributed storage node has two physical hosts. In this application, the two physical hosts are an active server and a standby server. The active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in the working state, the standby server in each distributed node is in the standby state, and each distributed node corresponds to one shared disk. The active server and the standby server in the same distributed node are each communicatively connected to the same shared disk corresponding to the distributed node and can access the shared disk in the same way.
In some application scenarios, an electronic device 1 is also provided in the distributed storage system; the electronic device is communicatively connected to the active server and the standby server in each distributed node (for example, via the network 2).
In other application scenarios, the electronic device 1 is set up independently of the distributed storage system and is communicatively connected to it (for example, via the network 2).
Below, various embodiments of this application are proposed based on the above distributed system and related equipment.
This application proposes a data reading program.
Please refer to FIG. 1, which is a schematic diagram of the system architecture of the first embodiment of the distributed storage system of this application.
In this embodiment, the data reading program 10 is installed and runs in the electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11 and a processor 12 that communicate with each other through a program bus. FIG. 3 only shows the electronic device 1 with the components 11 and 12, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk or internal memory of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 is used to store the application software installed in the electronic device 1 and various data, for example the program code of the data reading program 10. The memory 11 can also be used to temporarily store data that has been output or will be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor, or another data processing chip, and is used to run the program code stored in the memory 11 or process data, for example to execute the data reading program 10.
Please refer to FIG. 4, which is a program module diagram of the first embodiment of the data reading program 10. In this embodiment, the data reading program 10 can be divided into one or more modules; the one or more modules are stored in the memory 11 and are executed by one or more processors (in this embodiment, the processor 12) to complete this application. For example, in FIG. 4, the data reading program 10 can be divided into an initial access module 101, a state transition module 102, and a data reading module 103. A module referred to in this application is a series of computer program instruction segments capable of completing a specific function, and is better suited than a whole program for describing the execution process of the data reading program 10 in the electronic device 1, where:
Initial access module: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, access the active server currently in the working state in that distributed node to read the data on the corresponding shared disk.
In a common distributed system, when a host (i.e., a server) on a node fails, in most cases the storage disk connected to that server is not actually damaged, and the data on the storage disk is still valid. If another server can be started immediately to take over that data and replace the failed server, then from the point of view of the entire distributed system the node is still running normally; no large-scale data recovery is needed, and the whole system naturally returns to normal operation quickly.
请参阅图2,为分布式存储系统的存储分布关系示意图。Refer to Figure 2, which is a schematic diagram of the storage distribution relationship of a distributed storage system.
如图2所示,本实施例的的CEPH双控架构分布式系统中,在系统中的每个分布式节点都配置两台服务器,两台服务器分别处于工作状态和待机状态,按照初始状态的区别,节点A中两台服务器分别作为处于工作状态的主用服务器(A1)和处于待机状态的备用服务器(A2),即此时A1正在处于工作状态,而A2准备就绪处于待机状态。主用服务器A1和备用服务器A2分别与同一共享磁盘通讯连接,可以同样的方式访问该共享磁盘。相似的,节点B配置服务器B1和服务器B2,并与共享磁盘通讯连接,其中,B1处于工作状态,B2处于待机状态。节点C配置服务器C1和服务器C2,并与共享磁盘通讯连接,其中,C1处于工作状态,C2处于待机状态。当本实施例中分布式系统访问分布式节点A时,系统会首先访问该节点A中当前处于工作状态的主用服务器A1。As shown in Figure 2, in the CEPH dual-control architecture distributed system of this embodiment, each distributed node in the system is configured with two servers, and the two servers are in working and standby states respectively, according to the initial state The difference is that the two servers in node A serve as the active server (A1) and the standby server (A2) respectively in the working state, that is, A1 is in the working state at this time, and A2 is ready in the standby state. The primary server A1 and the backup server A2 are respectively connected to the same shared disk and can access the shared disk in the same way. Similarly, node B configures server B1 and server B2, and communicates with the shared disk. Among them, B1 is in working state and B2 is in standby state. Node C configures server C1 and server C2, and communicates with the shared disk. Among them, C1 is in working state and C2 is in standby state. When the distributed system in this embodiment accesses the distributed node A, the system will first access the active server A1 in the node A that is currently working.
State transition module: if the active server in the distributed node fails and cannot be accessed, switches the standby server in that distributed node, which is currently in the standby state, to the working state.
Please refer to FIG. 3, which is a schematic diagram of the operating environment of the first embodiment of the data reading program.
As shown in FIG. 3, in this embodiment, although the active server A1 and the standby server A2 are in the working state and the standby state respectively, they appear to the system as a whole as node A. When the system accesses the storage disk of the current node through the active server A1 and the active server is working normally without failure, A1 directly accesses the storage disk it is connected to. If the working server A1 fails and can no longer be accessed, A2, which was originally on standby, starts its application processes and moves from the standby state to the working state. Because the active server A1 and the standby server A2 appear to the system as the same node A, switching the working state between A1 and A2 is, from the system's point of view, equivalent to node A briefly disconnecting and reconnecting, with the standby server A2 now playing the role of node A in the cluster in place of the active server A1.
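The state transition step can be sketched as the small state machine below; this is only a minimal illustration under assumed names (State, fail_over), not a definitive implementation.

```python
# Sketch of the state transition step: when the active server has failed,
# promote the standby server so the node stays reachable.

from enum import Enum
from types import SimpleNamespace

class State(Enum):
    WORKING = "working"
    STANDBY = "standby"
    FAILED = "failed"

def fail_over(node):
    """node carries .active / .standby servers, each with a .state field.
    Swap the roles when the active server has failed."""
    if node.active.state is not State.FAILED:
        return node.active                 # nothing to do, A1 is still serving
    # Start the standby's application processes and promote it; to the rest
    # of the cluster this looks like node A briefly disconnecting and
    # reconnecting under the same identity.
    node.standby.state = State.WORKING
    node.active, node.standby = node.standby, node.active
    return node.active

# Example: A1 has failed, so A2 takes over the role of node A.
a1 = SimpleNamespace(name="A1", state=State.FAILED)
a2 = SimpleNamespace(name="A2", state=State.STANDBY)
node_a = SimpleNamespace(active=a1, standby=a2)
print(fail_over(node_a).name)   # -> "A2"
```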
Data reading module: after the standby server in the distributed node has been successfully switched to the working state, accesses the standby server in that distributed node to read the data on the corresponding shared disk.
In this embodiment, in node A, because the active server A1 has failed, the standby server A2, which has transitioned from the standby state to the working state, reads the data on the shared disk it is connected to. As described above, the active server A1 and the standby server A2 are each in communication connection with the same shared disk, and the two servers can access that shared disk in exactly the same way. When the active server A1 fails, the system accesses the shared disk through the standby server A2, which has switched to the working state.
Preferably, in this embodiment, the program further includes a monitoring module (not shown in the figure), and the following steps are further implemented when the state transition step is performed:
Write detection step: if the active server in the distributed node fails and cannot be accessed, the monitoring module detects whether data is being written to the corresponding shared disk through the active server of that distributed node.
In this embodiment, the monitoring module of the Ceph distributed system can detect whether data is being written. When no data is being written, the distributed system may automatically end the subsequent steps after the write detection step.
Data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, then after the standby server in that distributed node has been successfully switched to the working state, the data to be written is written to the corresponding shared disk through the standby server of that distributed node.
In this embodiment, the distributed system can recover the data changes that may have been written during the state transition between the active server A1 and the standby server A2, so the system remains in a normal operating state and does not, as with the previous deployment architecture, need a long time to recover a large amount of data, which would degrade the overall performance of the system. When the monitoring module detects that data was being written to the active server A1 while the active server A1 and the standby server A2 were switching working states, the distributed system resynchronizes that written data to the standby server A2 of the distributed node, which has now switched to the working state.
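The write-detection and resynchronization steps, including the version-number check mentioned later in the summary, could look roughly like the following sketch. The journal/record layout and function names are assumptions made here for illustration and are not an actual Ceph interface.

```python
# Sketch of write detection and resynchronization after failover.
# Record layout ({"key", "version", "data"}) is an illustrative assumption.

def pending_writes(active_journal, disk):
    """Return journal entries whose version is newer than what is on disk."""
    return [
        entry for entry in active_journal
        if entry["version"] > disk.get(entry["key"], {"version": -1})["version"]
    ]

def resync_after_failover(active_journal, disk):
    """After the standby server takes over, replay any writes that were
    in flight on the failed active server onto the shared disk."""
    entries = pending_writes(active_journal, disk)
    if not entries:
        return 0                      # nothing was being written; stop here
    for entry in entries:
        disk[entry["key"]] = {"version": entry["version"], "data": entry["data"]}
    return len(entries)

# Example: one write (version 2 of "obj1") was caught mid-switchover.
journal = [{"key": "obj1", "version": 2, "data": b"new"}]
shared_disk = {"obj1": {"version": 1, "data": b"old"}}
print(resync_after_failover(journal, shared_disk))   # -> 1 entry resynchronized
```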
Preferably, in this embodiment, the same virtual address is configured on the active server and the standby server of the distributed node.
In this embodiment, by configuring the same virtual IP address on the two servers of the same node, the active server A1 and the standby server A2 of the distributed node appear to the distributed system as a whole as the same node A. The virtual IP address is dynamically configured on whichever of the active server A1 or the standby server A2 is currently in the working state. For example, if the virtual address of both servers is set to 160.1.0.0, then when the two servers switch working states, the distributed system as a whole sees the node as remaining online throughout, with no state transition or address change.
Preferably, with the virtual address thus configured, the following steps are further implemented when the data reading program is executed by the processor:
Virtual access step: accessing the active server in the distributed node through the virtual address.
In this embodiment, the system accesses the current node A through the virtual address 160.1.0.0 configured on the active server A1.
Address switching step: when the active server in the distributed node fails, configuring the virtual address that was configured on the active server of that distributed node onto the standby server of that distributed node.
In this embodiment, when the active server A1 fails, the virtual address 160.1.0.0 originally configured on the active server A1 is dynamically configured onto the standby server A2, and the standby server A2, originally in the standby state, switches to the working state.
Address access step: accessing the standby server in the distributed node through the virtual address.
In this embodiment, the system accesses the current node A through the virtual address 160.1.0.0, which is now configured on the standby server A2.
In the above access process, because the same virtual address is dynamically configured on the active server and the standby server, the failure of the active server A1 has no impact on the normal operation of the system; the system simply accesses the current distributed node through the standby server A2.
Preferably, the electronic device further includes a cluster resource manager configured to configure the virtual address.
In this embodiment, the configuration of the virtual IP address can be implemented using a pacemaker resource agent. For example, the script Ocf::heartbeat::IPaddr2 runs in the current system; when server A2 takes over from server A1 and the working state switches, a simple script Ocf::heartbeat::ceph starts the relevant Ceph processes, such as mon or osd, so that the same virtual address can be dynamically configured on the two servers.
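As a rough sketch of what such a resource-agent pair does on takeover, the Python below claims the virtual IP and then starts the Ceph daemons, mirroring the start/stop/monitor actions an OCF agent exposes. The network interface name, netmask, daemon IDs and the use of systemd units are assumptions about one particular deployment, not values taken from the application.

```python
# Illustrative sketch only: behavior of the IPaddr2 + ceph agents on takeover.
# Interface, netmask and unit names below are deployment assumptions.

import socket
import subprocess

VIP = "160.1.0.0/24"        # virtual address shared by A1 and A2 (netmask assumed)
IFACE = "eth0"              # assumed network interface

def start():
    """Claim the virtual IP and start the Ceph daemons on this server."""
    subprocess.run(["ip", "addr", "add", VIP, "dev", IFACE], check=True)
    host = socket.gethostname()
    subprocess.run(["systemctl", "start", f"ceph-mon@{host}"], check=True)
    subprocess.run(["systemctl", "start", "ceph-osd.target"], check=True)

def stop():
    """Release the virtual IP so the peer server can claim it."""
    subprocess.run(["ip", "addr", "del", VIP, "dev", IFACE], check=False)

def monitor():
    """Report whether this server currently holds the virtual IP."""
    out = subprocess.run(["ip", "addr", "show", IFACE],
                         capture_output=True, text=True).stdout
    return 0 if VIP.split("/")[0] in out else 7   # 7 ~ OCF_NOT_RUNNING

if __name__ == "__main__":
    print("holds VIP" if monitor() == 0 else "does not hold VIP")
```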
In addition, to achieve the above object, this application further proposes a data reading method.
As shown in FIG. 5, FIG. 5 is a schematic flowchart of the first embodiment of the data reading method of this application.
The data reading method of this embodiment is applied to an electronic device. The electronic device is in communication connection with the active servers and standby servers of multiple distributed nodes in a distributed system; the active server and the standby server in the same distributed node are connected through a network; the active server in each distributed node is in the working state, and the standby server in each distributed node is in the standby state; each distributed node corresponds to a shared disk, and the active server and the standby server of the same distributed node are each in communication connection with the same shared disk corresponding to the node they belong to. The method includes the following steps:
Initial access step S10: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of that distributed node that is currently in the working state, so as to read the data on the corresponding shared disk.
As shown in FIG. 3, in the CEPH dual-control distributed system of this embodiment, each distributed node in the system is configured with two servers, one in the working state and one in the standby state. According to their initial states, the two servers in node A serve as the active server (A1) in the working state and the standby server (A2) in the standby state; that is, A1 is working while A2 is ready and on standby. The active server A1 and the standby server A2 are each in communication connection with the same shared disk and can access that shared disk in the same way. Similarly, node B is configured with server B1 and server B2, both connected to a shared disk, where B1 is in the working state and B2 is in the standby state. Node C is configured with server C1 and server C2, both connected to a shared disk, where C1 is in the working state and C2 is in the standby state. When the distributed system of this embodiment accesses distributed node A, the system first accesses the active server A1 that is currently in the working state in node A.
State transition step S20: if the active server in the distributed node fails and cannot be accessed, switching the standby server in that distributed node, which is currently in the standby state, to the working state.
In this embodiment, although the active server A1 and the standby server A2 are in the working state and the standby state respectively, they appear to the system as a whole as node A. When the system accesses the storage disk of the current node through the active server A1 and the active server is working normally without failure, A1 directly accesses the storage disk it is connected to. If the working server A1 fails and can no longer be accessed, A2, which was originally on standby, starts its application processes and moves from the standby state to the working state. Because the active server A1 and the standby server A2 appear to the system as the same node A, switching the working state between A1 and A2 is, from the system's point of view, equivalent to node A briefly disconnecting and reconnecting, with the standby server A2 now playing the role of node A in the cluster in place of the active server A1.
Data reading step S30: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in that distributed node to read the data on the corresponding shared disk.
In this embodiment, in node A, because the active server A1 has failed, the standby server A2, which has transitioned from the standby state to the working state, reads the data on the shared disk it is connected to. As described above, the active server A1 and the standby server A2 are each in communication connection with the same shared disk, and the two servers can access that shared disk in exactly the same way. When the active server A1 fails, the system accesses the shared disk through the standby server A2, which has switched to the working state.
Preferably, in the data reading method, during the state transition step, the method further includes:
Write detection step: if the active server in the distributed node fails and cannot be accessed, the monitoring unit detects whether data is being written to the corresponding shared disk through the active server of that distributed node.
In this embodiment, the monitoring module of the Ceph distributed system can detect whether data is being written. When no data is being written, the distributed system may automatically end the subsequent steps after the write detection step.
Data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, then after the standby server in that distributed node has been successfully switched to the working state, the data to be written is written to the corresponding shared disk through the standby server of that distributed node.
In this embodiment, the distributed system can recover the data changes that may have been written during the state transition between the active server A1 and the standby server A2, so the system remains in a normal operating state and does not, as with the previous deployment architecture, need a long time to recover a large amount of data, which would degrade the overall performance of the system. When the monitoring module detects that data was being written to the active server A1 while the active server A1 and the standby server A2 were switching working states, the distributed system resynchronizes that written data to the standby server A2, which has now switched to the working state.
Preferably, the data reading method further includes: configuring the same virtual address on the active server and the standby server of the distributed node.
Preferably, the data reading method further includes the following steps:
Accessing the active server in the distributed node through the virtual address.
In this embodiment, the system accesses the current node A through the virtual address 160.1.0.0 configured on the active server A1.
When the active server in the distributed node fails, configuring the virtual address that was configured on the active server of that distributed node onto the standby server of that distributed node.
In this embodiment, when the active server A1 fails, the virtual address 160.1.0.0 originally configured on the active server A1 is dynamically configured onto the standby server A2, and the standby server A2, originally in the standby state, switches to the working state.
Accessing the standby server in the distributed node through the virtual address.
In this embodiment, the system accesses the current node A through the virtual address 160.1.0.0, which is now configured on the standby server A2.
Preferably, the distributed storage system further includes a cluster resource manager configured to configure the virtual address.
In this embodiment, the configuration of the virtual IP address can be implemented using a pacemaker resource agent.
In addition, to achieve the above object, this application further provides a distributed storage system.
The distributed storage system includes an electronic device in communication connection with multiple active servers and multiple standby servers of multiple distributed nodes. The active server and the standby server in the same distributed node are connected through a network; the active server in each distributed node is in the working state, and the standby server in each distributed node is in the standby state; each distributed node corresponds to a shared disk, and the active server and the standby server of the same distributed node are each in communication connection with the same shared disk corresponding to the node they belong to. The electronic device includes a memory and a processor; a data reading program is stored in the memory, and the following steps are implemented when the data reading program is executed by the processor:
Initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of that distributed node that is currently in the working state, so as to read the data on the corresponding shared disk;
State transition step: if the active server in the distributed node fails and cannot be accessed, switching the standby server in that distributed node, which is currently in the standby state, to the working state;
Data reading step: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in that distributed node to read the data on the corresponding shared disk.
In addition, an embodiment of this application further proposes a computer-readable storage medium. The computer-readable storage medium stores a data reading program, and the data reading program can be executed by at least one processor, so that the at least one processor performs the data reading method of any of the above embodiments.
Compared with the prior art, the dual-control distributed storage system, electronic device and computer-readable storage medium proposed in this embodiment adopt a dual-control design at each distributed storage node: each node is equipped with two servers, an active server and a standby server. The two servers operate in A/P (Active/Passive) mode, with the active server of the distributed node in the working state and the standby server in the standby state. The same virtual IP address is dynamically configured through the cluster resource manager, so the two servers appear to the whole Ceph system as a single virtual node. In the working mode, the two physical hosts (servers) can access the same shared storage disk in the same way. Under normal circumstances, the storage disk of the distributed node is taken over by the active server in the working state, and the distributed system accesses the storage disk through the virtual IP address on the active server. When the active server of the distributed node fails, the cluster resource manager switches the virtual IP address from the current active server to the standby server; the standby server of the distributed node switches to the working state and can access the data on the storage disk. Because most of the data on the shared storage disk is still valid, from the point of view of the Ceph system this access pattern is equivalent to the distributed storage node briefly disconnecting and then reconnecting. The dual-control design described above greatly reduces the probability that the Ceph system must perform large-scale data recovery when a storage node fails. This embodiment also provides a monitoring module that can detect whether data happens to be being written while the working states of the two servers are switched; if data is written during that period, the data version number is checked and that data is synchronized to the server that is currently in the working state. In addition, by manually shutting down one physical host, the configuration of a storage node can be upgraded (for example, adding memory modules) while the Ceph system remains online.
It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, device, article or method. In the absence of further limitation, an element defined by the phrase "comprising a..." does not exclude the existence of other identical elements in the process, device, article or method that includes that element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, whether used directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (20)

  1. An electronic device, wherein the electronic device is in communication connection with the active servers and standby servers of multiple distributed nodes in a distributed system respectively, the active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in a working state, the standby server in each distributed node is in a standby state, each distributed node corresponds to a shared disk, and the active server and the standby server in the same distributed node are each in communication connection with the same shared disk corresponding to the distributed node they belong to; the electronic device comprises a memory and a processor, a data reading program is stored in the memory, and the following steps are implemented when the data reading program is executed by the processor:
    an initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of the distributed node that is currently in the working state, so as to read the data on the corresponding shared disk;
    a state transition step: if the active server in the distributed node fails and cannot be accessed, switching the standby server in the distributed node that is currently in the standby state to the working state;
    a data reading step: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in the distributed node to read the data on the corresponding shared disk.
  2. The electronic device according to claim 1, wherein the electronic device is further in communication connection with a monitoring unit in the distributed system, and the following steps are further implemented when the data reading program is executed by the processor:
    a write detection step: if the active server in the distributed node fails and cannot be accessed, detecting, through the monitoring unit, whether data is being written to the corresponding shared disk through the active server of the distributed node;
    a data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, writing the data to be written to the corresponding shared disk through the standby server of the distributed node after the standby server in the distributed node has been successfully switched to the working state.
  3. The electronic device according to claim 2, wherein the active server and the standby server in the distributed node are dynamically configured with the same virtual address, and the following steps are further implemented when the data reading program is executed by the processor:
    accessing the active server in the distributed node through the virtual address;
    when the active server in the distributed node fails, configuring the virtual address configured on the active server of the distributed node onto the standby server of the distributed node;
    accessing the standby server in the distributed node through the virtual address.
  4. The electronic device according to claim 3, wherein the electronic device is further in communication connection with a cluster resource manager in the distributed system, which is configured to configure the same virtual address on the active server and the standby server of the distributed node.
  5. The electronic device according to claim 1, wherein accessing the standby server in the distributed node to read the data on the corresponding shared disk after the standby server in the distributed node has been successfully switched to the working state comprises:
    the active server and the standby server are each in communication connection with the same shared disk, the two servers access the shared disk in the same way, and when the active server fails, the shared disk is accessed through the standby server, which has switched to the working state, to read the data.
  6. A data reading method, applied to an electronic device, wherein the electronic device is in communication connection with the active servers and standby servers of multiple distributed nodes in a distributed system respectively, the active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in a working state, the standby server in each distributed node is in a standby state, each distributed node corresponds to a shared disk, and the active server and the standby server in the same distributed node are each in communication connection with the same shared disk corresponding to the distributed node they belong to; the method comprises the steps of:
    an initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of the distributed node that is currently in the working state, so as to read the data on the corresponding shared disk;
    a state transition step: if the active server in the distributed node fails and cannot be accessed, switching the standby server in the distributed node that is currently in the standby state to the working state;
    a data reading step: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in the distributed node to read the data on the corresponding shared disk.
  7. The data reading method according to claim 6, wherein, in the state transition step, the method further comprises:
    a write detection step: if the active server in the distributed node fails and cannot be accessed, detecting, through a monitoring unit, whether data is being written to the corresponding shared disk through the active server of the distributed node;
    a data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, writing the data to be written to the corresponding shared disk through the standby server of the distributed node after the standby server in the distributed node has been successfully switched to the working state.
  8. The data reading method according to claim 6 or 7, wherein the method further comprises: configuring the same virtual address on the active server and the standby server of the distributed node.
  9. The data reading method according to claim 8, wherein the method further comprises the following steps:
    accessing the active server in the distributed node through the virtual address;
    when the active server in the distributed node fails, configuring the virtual address configured on the active server of the distributed node onto the standby server of the distributed node;
    accessing the standby server in the distributed node through the virtual address.
  10. The data reading method according to claim 6, wherein accessing the standby server in the distributed node to read the data on the corresponding shared disk after the standby server in the distributed node has been successfully switched to the working state comprises:
    the active server and the standby server are each in communication connection with the same shared disk, the two servers access the shared disk in the same way, and when the active server fails, the shared disk is accessed through the standby server, which has switched to the working state, to read the data.
  11. A distributed storage system, wherein the distributed storage system comprises an electronic device in communication connection with multiple active servers and multiple standby servers of multiple distributed nodes, the active server and the standby server in the same distributed node are connected through a network, the active server in each distributed node is in a working state, the standby server in each distributed node is in a standby state, each distributed node corresponds to a shared disk, and the active server and the standby server in the same distributed node are each in communication connection with the same shared disk corresponding to the distributed node they belong to; the electronic device comprises a memory and a processor, a data reading program is stored in the memory, and the following steps are implemented when the data reading program is executed by the processor:
    an initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of the distributed node that is currently in the working state, so as to read the data on the corresponding shared disk;
    a state transition step: if the active server in the distributed node fails and cannot be accessed, switching the standby server in the distributed node that is currently in the standby state to the working state;
    a data reading step: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in the distributed node to read the data on the corresponding shared disk.
  12. The distributed storage system according to claim 11, wherein, in the state transition step, the distributed storage system further comprises:
    a write detection step: if the active server in the distributed node fails and cannot be accessed, detecting, through a monitoring unit, whether data is being written to the corresponding shared disk through the active server of the distributed node;
    a data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, writing the data to be written to the corresponding shared disk through the standby server of the distributed node after the standby server in the distributed node has been successfully switched to the working state.
  13. The distributed storage system according to claim 11 or 12, wherein the distributed storage system further comprises:
    configuring the same virtual address on the active server and the standby server of the distributed node.
  14. The distributed storage system according to claim 13, wherein the distributed storage system further comprises the following steps:
    accessing the active server in the distributed node through the virtual address;
    when the active server in the distributed node fails, configuring the virtual address configured on the active server of the distributed node onto the standby server of the distributed node;
    accessing the standby server in the distributed node through the virtual address.
  15. The distributed storage system according to claim 11, wherein accessing the standby server in the distributed node to read the data on the corresponding shared disk after the standby server in the distributed node has been successfully switched to the working state comprises:
    the active server and the standby server are each in communication connection with the same shared disk, the two servers access the shared disk in the same way, and when the active server fails, the shared disk is accessed through the standby server, which has switched to the working state, to read the data.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a data reading program, and the data reading program can be executed by at least one processor, so that the at least one processor performs the following steps of the data reading method:
    an initial access step: in real time, or periodically, or after receiving an instruction to read data from the active server of a distributed node, accessing the active server of the distributed node that is currently in the working state, so as to read the data on the corresponding shared disk;
    a state transition step: if the active server in the distributed node fails and cannot be accessed, switching the standby server in the distributed node that is currently in the standby state to the working state;
    a data reading step: after the standby server in the distributed node has been successfully switched to the working state, accessing the standby server in the distributed node to read the data on the corresponding shared disk.
  17. The computer-readable storage medium according to claim 16, wherein, in the state transition step, the method further comprises:
    a write detection step: if the active server in the distributed node fails and cannot be accessed, detecting, through a monitoring unit, whether data is being written to the corresponding shared disk through the active server of the distributed node;
    a data synchronization step: if data is being written to the corresponding shared disk through the active server of the distributed node, writing the data to be written to the corresponding shared disk through the standby server of the distributed node after the standby server in the distributed node has been successfully switched to the working state.
  18. The computer-readable storage medium according to claim 16 or 17, wherein the same virtual address is configured on the active server and the standby server of the distributed node.
  19. The computer-readable storage medium according to claim 18, wherein the method further comprises the following steps:
    accessing the active server in the distributed node through the virtual address;
    when the active server in the distributed node fails, configuring the virtual address configured on the active server of the distributed node onto the standby server of the distributed node;
    accessing the standby server in the distributed node through the virtual address.
  20. The computer-readable storage medium according to claim 16, wherein accessing the standby server in the distributed node to read the data on the corresponding shared disk after the standby server in the distributed node has been successfully switched to the working state comprises:
    the active server and the standby server are each in communication connection with the same shared disk, the two servers access the shared disk in the same way, and when the active server fails, the shared disk is accessed through the standby server, which has switched to the working state, to read the data.
PCT/CN2019/117349 2019-05-20 2019-11-12 Distributed storage system comprising dual-control architecture, data reading method and device, and storage medium WO2020233001A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910418969.3A CN110286852A (en) 2019-05-20 2019-05-20 Dual control framework distributed memory system, method for reading data, device and storage medium
CN201910418969.3 2019-05-20

Publications (1)

Publication Number Publication Date
WO2020233001A1 true WO2020233001A1 (en) 2020-11-26

Family

ID=68002769

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117349 WO2020233001A1 (en) 2019-05-20 2019-11-12 Distributed storage system comprising dual-control architecture, data reading method and device, and storage medium

Country Status (2)

Country Link
CN (1) CN110286852A (en)
WO (1) WO2020233001A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110286852A (en) * 2019-05-20 2019-09-27 平安科技(深圳)有限公司 Dual control framework distributed memory system, method for reading data, device and storage medium
CN111901415B (en) * 2020-07-27 2023-07-14 北京星辰天合科技股份有限公司 Data processing method and system, computer readable storage medium and processor
CN115277377A (en) * 2022-05-19 2022-11-01 亿点云计算(珠海)有限公司 Service acquisition method, device, terminal and storage medium based on distributed cloud

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8046446B1 (en) * 2004-10-18 2011-10-25 Symantec Operating Corporation System and method for providing availability using volume server sets in a storage environment employing distributed block virtualization
CN103077242A (en) * 2013-01-11 2013-05-01 北京佳讯飞鸿电气股份有限公司 Method for hot standby of dual database servers
CN105553701A (en) * 2015-12-11 2016-05-04 国网青海省电力公司 Distribution network adjustment and control system and control method thereof
CN106982259A (en) * 2017-04-19 2017-07-25 聚好看科技股份有限公司 The failure solution of server cluster
CN108259239A (en) * 2018-01-11 2018-07-06 郑州云海信息技术有限公司 A kind of database high availability support method and system
CN110286852A (en) * 2019-05-20 2019-09-27 平安科技(深圳)有限公司 Dual control framework distributed memory system, method for reading data, device and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170123943A1 (en) * 2015-10-30 2017-05-04 Netapp, Inc. Distributed data storage and processing techniques
CN106445741B (en) * 2016-09-28 2019-08-02 郑州云海信息技术有限公司 One kind realizing oracle database disaster-tolerant backup method based on ceph
CN107948248A (en) * 2017-11-01 2018-04-20 平安科技(深圳)有限公司 Distributed storage method, control server and computer-readable recording medium
CN109271280A (en) * 2018-08-30 2019-01-25 重庆富民银行股份有限公司 Storage failure is switched fast processing method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8046446B1 (en) * 2004-10-18 2011-10-25 Symantec Operating Corporation System and method for providing availability using volume server sets in a storage environment employing distributed block virtualization
CN103077242A (en) * 2013-01-11 2013-05-01 北京佳讯飞鸿电气股份有限公司 Method for hot standby of dual database servers
CN105553701A (en) * 2015-12-11 2016-05-04 国网青海省电力公司 Distribution network adjustment and control system and control method thereof
CN106982259A (en) * 2017-04-19 2017-07-25 聚好看科技股份有限公司 The failure solution of server cluster
CN108259239A (en) * 2018-01-11 2018-07-06 郑州云海信息技术有限公司 A kind of database high availability support method and system
CN110286852A (en) * 2019-05-20 2019-09-27 平安科技(深圳)有限公司 Dual control framework distributed memory system, method for reading data, device and storage medium

Also Published As

Publication number Publication date
CN110286852A (en) 2019-09-27

Similar Documents

Publication Publication Date Title
US8443231B2 (en) Updating a list of quorum disks
US8713362B2 (en) Obviation of recovery of data store consistency for application I/O errors
US7219260B1 (en) Fault tolerant system shared system resource with state machine logging
US20140244578A1 (en) Highly available main memory database system, operating method and uses thereof
WO2020233001A1 (en) Distributed storage system comprising dual-control architecture, data reading method and device, and storage medium
US8533171B2 (en) Method and system for restarting file lock services at an adoptive node during a network filesystem server migration or failover
US20060089975A1 (en) Online system recovery system, method and program
EP4083786A1 (en) Cloud operating system management method and apparatus, server, management system, and medium
EP2851807A1 (en) Method and system for supporting resource isolation under multi-core architecture
WO2020232859A1 (en) Distributed storage system, data writing method, device, and storage medium
CN113783765B (en) Method, system, equipment and medium for realizing intercommunication between cloud internal network and cloud external network
CN104503965A (en) High-elasticity high availability and load balancing realization method of PostgreSQL (Structured Query Language)
CN107666493B (en) Database configuration method and equipment thereof
EP3648405B1 (en) System and method to create a highly available quorum for clustered solutions
US20230029074A1 (en) Shadow live migration over a smart network interface card
US8683258B2 (en) Fast I/O failure detection and cluster wide failover
US20220334733A1 (en) Data restoration method and related device
CN115167782B (en) Temporary storage copy management method, system, equipment and storage medium
US20050262381A1 (en) System and method for highly available data processing in cluster system
US8621260B1 (en) Site-level sub-cluster dependencies
CN111488247B (en) High availability method and equipment for managing and controlling multiple fault tolerance of nodes
CN112732492A (en) Extraction backup method and system based on cloud database
CN106484495A (en) A kind of magnetic disk of virtual machine data block synchronization method
US11226875B2 (en) System halt event recovery
WO2022218346A1 (en) Fault processing method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19929559

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19929559

Country of ref document: EP

Kind code of ref document: A1